Artificial intelligence (AI) has steadily become a part of our everyday lives. Over the past several years, AI has increasingly been applied to biomedical research and clinical care. However, the methods currently used to develop AI for biomedical research and clinical decision-making may generate new health care disparities.
In a paper recently published in Nature Communications, titled “Deep Transfer Learning for Reducing Health Care Disparities Arising From Biomedical Data Inequality,” Yan Cui, PhD, associate professor in the UTHSC Department of Genetics, Genomics, and Informatics, explains the importance of developing unbiased AI models that work equally well for all ethnic groups in an effort to prevent and reduce health care disparities. Yan Gao, PhD, a postdoctoral fellow in the Department of Genetics, Genomics, and Informatics and a member of Dr. Cui’s lab, is a co-author of the paper.
“We hope our work will establish a new paradigm for machine learning with multi-ethnic data,” Dr. Cui said.
The paper examines data inequality among ethnic groups, an important challenge in the field of genomics. Dr. Cui said statistics show that over 90 percent of current genomics data were collected from people of European ancestry, who represent only about 16 percent of the world’s population. “About 84 percent of the world’s population only have less than 10 percent of the genomics data. If [we] build our artificial intelligence capacity for biomedical research and health care with current genomics data repository, it results in severe inequality. This means we would have less accurate AI models, or machine learning models, for most of the world population. So this is an important challenge we want to address in our work.”
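To put those figures in perspective, the short Python calculation below compares each group's share of the data with its share of the world population. It is only an illustration: the article's rounded figures ("over 90 percent," "about 16 percent") are treated as exact values, and the variable and function names are made up for this example.

```python
# Shares quoted in the article, treated here as approximate point estimates.
european_pop_share = 0.16    # share of world population with European ancestry
european_data_share = 0.90   # share of genomics data from that group

other_pop_share = 1 - european_pop_share     # about 0.84
other_data_share = 1 - european_data_share   # about 0.10


def data_per_capita(data_share, pop_share):
    """Relative amount of data per unit of population share."""
    return data_share / pop_share


european = data_per_capita(european_data_share, european_pop_share)
other = data_per_capita(other_data_share, other_pop_share)
ratio = european / other

print(f"European-ancestry data per capita (relative): {european:.2f}")
print(f"All other groups, data per capita (relative): {other:.2f}")
print(f"Per-capita representation ratio: {ratio:.1f}")
```

Under these assumptions, a person of European ancestry is represented in genomics data roughly 47 times more heavily, per capita, than a person from the rest of the world's population. Because the article says "over 90 percent" and "less than 10 percent," the real gap could be larger.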
The paper presents two main findings. One is that current machine learning schemes fail to address data inequality. Low-performing models are generated unintentionally and may even go unnoticed, which results in negative impacts on health care for data-disadvantaged ethnic groups. The second is that a machine learning technique called transfer learning, which transfers knowledge from one model to another, can improve model performance and reduce health care disparities for data-disadvantaged ethnic groups.
Deep learning models require a vast amount of data to perform accurately and effectively. Increased representation of different ethnic groups in biomedical databases would result in improved health care outcomes, particularly in the field of precision medicine. Precision medicine is becoming an increasingly powerful approach for targeted disease treatment and prevention, and it relies heavily on biomedical data. Dr. Cui stresses the importance of people from all ethnic groups participating in genomic research.
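The general idea of transfer learning can be sketched in a few lines of Python. This is a minimal illustration on synthetic data, not the method used in the paper: the paper works with deep learning models, whereas this sketch uses a simple logistic regression, and all names, group sizes, and settings are assumptions chosen for the example. A model is first trained on a large "source" group and then briefly fine-tuned on a small "target" group, instead of being trained on the small group from scratch.

```python
import numpy as np


def train_logistic(X, y, w_init=None, lr=0.1, epochs=200):
    """Fit logistic regression by gradient descent, optionally starting from w_init."""
    w = np.zeros(X.shape[1]) if w_init is None else w_init.copy()
    for _ in range(epochs):
        p = 1.0 / (1.0 + np.exp(-(X @ w)))   # predicted probabilities
        w -= lr * X.T @ (p - y) / len(y)     # gradient step on the log loss
    return w


def accuracy(w, X, y):
    """Fraction of samples whose predicted class matches the label."""
    return float(np.mean(((X @ w) > 0) == (y == 1)))


rng = np.random.default_rng(0)
n_features = 20

# Hypothetical setup: the two groups share most, but not all, of the signal.
w_true_source = rng.normal(size=n_features)
w_true_target = w_true_source + 0.3 * rng.normal(size=n_features)


def make_data(n, w_true):
    """Draw n synthetic samples with noisy labels generated from w_true."""
    X = rng.normal(size=(n, n_features))
    y = (X @ w_true + 0.5 * rng.normal(size=n) > 0).astype(float)
    return X, y


X_src, y_src = make_data(5000, w_true_source)   # data-rich group
X_tgt, y_tgt = make_data(40, w_true_target)     # data-disadvantaged group
X_test, y_test = make_data(2000, w_true_target)

w_src = train_logistic(X_src, y_src)                                  # pretrain on source
w_scratch = train_logistic(X_tgt, y_tgt)                              # target data only
w_transfer = train_logistic(X_tgt, y_tgt, w_init=w_src, epochs=20)    # fine-tune

print("target only:      ", accuracy(w_scratch, X_test, y_test))
print("transfer learning:", accuracy(w_transfer, X_test, y_test))
```

The key design choice is the `w_init` argument: fine-tuning starts from the weights learned on the data-rich group rather than from zeros, so the small group's model can reuse whatever signal the two groups share. How much this helps in practice depends on how similar the groups are, which is an assumption built into the synthetic data here.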
“The more genetic data we get from individuals and from populations, the more accurate our models can become,” said Robert Williams, PhD, UT-Oak Ridge National Laboratory Governor’s Chair in Computational Genomics and chair of the Department of Genetics, Genomics, and Informatics. “Dr. Cui’s precautionary and innovative paper shows the incredible imbalance in the data sets that go into the machine learning methods, and we have to be aware of that. It’s early enough so that we can actually intervene. AI is not truly pervasive in your diagnosis or treatment, but we know in the future it’s going to happen.”
To read the paper, visit Nature Communications.