The Division of Biostatistics at the Department of Preventive Medicine invites you to attend the following seminar.
Time: Monday, February 7, 2:00 PM-3:00 PM CST
ZOOM Virtual Room Connection: Register in advance for this meeting
Speaker: Dr. Arjun Krishnan, Michigan State University
Title: Democratizing data-driven biology: Tackling incomplete data, unstructured metadata, and hidden curricula
Abstract: There is much enthusiasm about using omics and biomedical data collections to fuel research on complex traits and diseases. However, there are still some well-known fundamental challenges in seamlessly and effectively using these data to drive research. For instance, there are >1.5 million human gene expression profiles that are publicly available, but, depending on the technology/platform used to record each profile, different subsets of genes in the genome are measured in these transcriptomes, leading to thousands of unmeasured genes in many of these profiles. These gaps in data are major hurdles for integrative analysis. Critical problems also exist with data descriptions: the majority of >2 million publicly available omics samples lack structured metadata, including information about tissue of origin, disease status, and environmental conditions. Thus, discovering samples and datasets of interest is not straightforward. In this seminar, I will present recent work from our group on developing machine learning approaches to address these fundamental challenges. In addition, I will discuss the need for improving advanced research training in biological data analysis by formalizing concepts in statistical procedures, study design, data/code management, critically consuming data-driven findings, and reproducible research.
We look forward to seeing you there.