Diagnosis and treatment in psychiatry still rely almost exclusively on a phenotype-based approach. Although this renders the validity of psychiatric classification questionable and severely hampers biomarker discovery, a system of biologically meaningful groups, termed biotypes, has so far remained elusive. Building on the large-scale dataset acquired in this FOR, it is now possible to apply state-of-the-art tools from machine learning and multivariate statistics to a rich, multimodal database, bringing robust biotype identification across disorders within reach.
In the first funding period, WP6 established essential quality control protocols and performed statistical analyses of gene-environment interactions in both human samples (effect of PsychChip x childhood maltreatment on hippocampus volume; WP1 and WP5) and Cacna1c knock-out rats (WP2), where the interaction of genotype with social isolation was examined for miRNA expression (WP3), mitochondrial perturbation (WP4), and immunologic phenotypes (WP2, CP1). Importantly, the two new PIs of WP6 (TH, BM) have implemented a machine learning workflow for whole-genome and neuroimaging analysis of data from WP1 and WP5, which has yielded first robust results.
The WP6 PIs jointly aim to identify and validate biologically plausible, homogeneous biotypes within and across levels of observation, from molecular genetics to whole-brain neuroimaging. Drawing on data from all other WPs, we will 1) employ domain-knowledge-based and automatic feature engineering, including Deep Autoencoders and Rectified Factor Networks, to address the curse of dimensionality, 2) develop a principled approach to confounder removal in linear and non-linear multivariate models using, e.g., modality-specific Adversarial Autoencoders, and 3) ensure reliability and internal validity by optimizing cluster solution stability through ensemble learning within and across levels of observation.
Crucially, the longitudinal design of this FOR allows us to assess the validity of our cluster solutions based on their predictive utility. Specifically, predictive biomarker models for disease trajectory and outcome will be trained using our cluster solutions. We expect them to substantially outperform the same models trained on DSM-IV disorder groups. In addition, we expect these models to be robust against common confounders such as scanner site. Within this framework, we are uniquely positioned to uncover biologically plausible patient clusters across disorders, which will strengthen the validity of psychiatric classification and substantially simplify future biomarker discovery. Further, our collaborations with other consortia will allow us to rigorously test reproducibility in several large, independent samples, thereby supporting the validity of our results and their impact on the field. Finally, WP6 will provide the custom-tailored, scalable analytic tools required in all other WPs, particularly in the areas of machine learning and multivariate statistics.
Hahn, Tim | Institute of Translational Psychiatry |
Leenings, Ramona | Institute of Translational Psychiatry |
Winter, Nils | Institute of Translational Psychiatry |