Machine Learning for Personalized Medicine

Marie-Curie Action: "Initial Training Networks"

Modeling molecular heterogeneity between individuals and single cells

by Oliver Stegle

The analysis of large-scale expression datasets is often compromised by hidden structure between samples. In the context of genetic association studies, this structure can be linked to differences between individuals, which can reflect their genetic makeup (such as population structure) or be traced back to environmental and technical factors. In this talk, I will discuss statistical methods to reconstruct this structure from the observed data in order to account for it in genetic analyses. These approaches make it possible to accurately map the genetic determinants of high-dimensional molecular traits, including gene expression levels. By incorporating principles from causal reasoning, we show that the critical pitfall of falsely explaining away true biological signals can be effectively circumvented. In addition to applications in genetics, I will give an outlook on how very similar approaches can be applied to model heterogeneity in single-cell transcriptome datasets.
