Mapping the structure of genetic risk for common disease in the UK Biobank
Genetic risk factors frequently affect multiple common human diseases, providing insight into shared pathophysiological pathways and opportunities for therapeutic development. However, systematic identification of genetic profiles of disease risk is limited both by the availability of comprehensive clinical data on population-scale cohorts and by the lack of suitable statistical methodology that can handle the scale of, and the differential power inherent in, multi-phenotype data. We have developed a disease-agnostic approach to cluster genetic risk profiles for 3,025 genome-wide independent loci across 19,155 ICD-10 diagnostic codes from 320,644 participants in the UK Biobank, representing a large and heterogeneous population. We identify several hundred distinct disease association profiles and use multiple approaches to link clusters to underlying biological pathways. We show how clusters can decompose the variance and covariance in risk for disease, thereby identifying underlying biological processes and their impact. We demonstrate the use of clusters in defining disease relationships and informing therapeutic strategies.
Gil McVean is Professor of Statistical Genetics at the University of Oxford and Director of Oxford's Big Data Institute within the Li Ka Shing Centre for Health Information and Discovery (www.bdi.ox.ac.uk). His research focuses on understanding the molecular and evolutionary processes that shape genetic variation in populations and the relationship between genetic variation and phenotype. He has played a leading role in the HapMap and 1000 Genomes Projects, is co-founder of Genomics plc (www.genomicsplc.com), and currently works on organisms from HIV to malaria.