Microbiome – Challenges and Opportunities

Morten Rasmussen

With modern genomic techniques it is now possible to measure the ecological composition of, for instance, the gut, the skin, or the airways, based on sequencing of amplified marker genes or assembly of sequences from the entire genome. In contrast to typical chemometric datasets, these data are counts, inherently compositional, and very sparse. This naturally introduces challenges in the initial quality control of the data and in the subsequent statistical modeling. Sequencing of the genomic makeup, coupled with structured databases, makes it possible not only to name individual bacteria but also to infer which enzymes they produce and which biochemical pathways are present. This structural knowledge makes it possible to pursue integration of, e.g., microbiome and metabolomics data from a bioinformatics, database-driven angle as opposed to a purely data-driven chemometric one.
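To make the "counts, compositional, and sparse" point concrete, here is a minimal sketch of one common preprocessing choice for such data, the centered log-ratio (CLR) transform with a pseudocount. The function name `clr_transform`, the pseudocount of 0.5, and the toy count table are illustrative assumptions and are not taken from the talk.

```python
import numpy as np

def clr_transform(counts, pseudocount=0.5):
    """Centered log-ratio transform of a samples x taxa count matrix.

    The pseudocount is an assumed, commonly used workaround for zeros;
    other zero-handling strategies exist.
    """
    counts = np.asarray(counts, dtype=float)
    log_shifted = np.log(counts + pseudocount)  # avoids log(0) for sparse counts
    # Subtract each sample's mean log value (log of its geometric mean)
    return log_shifted - log_shifted.mean(axis=1, keepdims=True)

# Hypothetical toy table: 2 samples x 4 taxa, with many zeros
counts = np.array([[120, 0, 30, 0],
                   [10, 5, 0, 85]])

clr = clr_transform(counts)
sparsity = float((counts == 0).mean())  # fraction of zero entries
```

After the transform each sample's values sum to zero, which removes the arbitrary sequencing depth from the comparison between samples, although the result still depends on the chosen pseudocount.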

This talk will introduce how microbiome data are obtained and point out some of the challenges involved. The structural knowledge obtained via sequencing opens an avenue of possibilities in terms of data modeling. Specifically, we will revisit common multivariate chemometric techniques, such as Canonical Correlation Analysis, for integration of microbiome and metabolomics data, and use the structural knowledge to softly enforce certain conditions on the model.
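As an illustration of what "softly enforcing" conditions in Canonical Correlation Analysis could look like, the sketch below uses a ridge-type penalty per feature, with smaller penalties for features that have a database-supported link. This is one possible formulation under stated assumptions, not necessarily the approach presented in the talk. The names `penalized_cca` and `has_known_link`, the penalty values, and the synthetic data are all hypothetical.

```python
import numpy as np

def inv_sqrt(mat):
    """Inverse matrix square root of a symmetric positive definite matrix."""
    vals, vecs = np.linalg.eigh(mat)
    return vecs @ np.diag(1.0 / np.sqrt(vals)) @ vecs.T

def penalized_cca(X, Y, penalty_x, penalty_y):
    """First canonical pair with a per-feature ridge penalty (must be > 0)."""
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)
    n = X.shape[0]
    Sxx = X.T @ X / (n - 1) + np.diag(penalty_x)
    Syy = Y.T @ Y / (n - 1) + np.diag(penalty_y)
    Sxy = X.T @ Y / (n - 1)
    Kx, Ky = inv_sqrt(Sxx), inv_sqrt(Syy)
    # Leading singular triplet of the whitened cross-covariance
    U, s, Vt = np.linalg.svd(Kx @ Sxy @ Ky)
    return Kx @ U[:, 0], Ky @ Vt[0], s[0]

# Synthetic data: one shared latent signal in X[:, 0] and Y[:, 0]
rng = np.random.default_rng(0)
n = 60
z = rng.normal(size=n)
X = rng.normal(size=(n, 5))          # stand-in for transformed microbiome data
Y = rng.normal(size=(n, 3))          # stand-in for metabolomics data
X[:, 0] = z + 0.3 * rng.normal(size=n)
Y[:, 0] = z + 0.3 * rng.normal(size=n)

# Assumed prior knowledge: only the first two taxa have a known enzymatic link
has_known_link = np.array([True, True, False, False, False])
penalty_x = np.where(has_known_link, 0.1, 10.0)  # soft, not hard, constraint
penalty_y = np.full(3, 0.1)

a, b, rho = penalized_cca(X, Y, penalty_x, penalty_y)
```

Because the constraint enters as a penalty rather than by forcing weights to zero, features without a known link can still contribute if the data support them strongly enough. The returned `rho` is a penalized quantity and is no larger than the unpenalized first canonical correlation.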

Mini-CV:

Morten Rasmussen is Associate Professor at the University of Copenhagen. His research focuses on the development and application of data-driven mathematical and statistical methods for the modeling of complex biological systems, with special emphasis on metabolomics and the microbiome within the area of clinical epidemiology. He received the 1st Bruce Kowalski Award in Chemometrics in 2014. He has published over 50 papers in subject areas such as multivariate data analysis, chemometrics, and systems biology.