March 5, 2019

The power of more than a million genomes

By Mengyao Hu

Tobacco and alcohol use are two critical risk factors for many health conditions and important causes of mortality. In a new article published in Nature Genetics, SRC scientists Jessica Faul, Jennifer Smith, and David Weir, in collaboration with a large team of researchers, present innovative research examining the genetic etiology of tobacco and alcohol use based on data from up to 1.2 million individuals. This important research reflects a recent effort to bridge social science research with genome-wide association studies (GWAS).

Variants in the human genome contribute to human phenotypes (traits and diseases). With the development of genotyping technology, GWAS have been widely used in biomedical research to find the genetic basis of differences in human phenotypes. With the power of SNP (single-nucleotide polymorphism) microarrays, study samples of thousands of people can reveal aspects of the genetic basis of human disease. To date, GWAS have revealed risk loci associated with important diseases such as Alzheimer’s, Parkinson’s, and multiple types of cancer, as well as many other traits and diseases. With advances in next-generation sequencing, GWAS will be taken to the next level, with better designs that target different ethnicities within the population. In addition, the amount of available genetic data has grown rapidly with the establishment of national-level biobanks such as UK Biobank and Iceland’s deCODE genetics, and with the genetic data now available for nationally representative surveys, including the National Longitudinal Study of Adolescent to Adult Health (Add Health) and the Health and Retirement Study (HRS), both included in the Nature Genetics article. This massive amount of genetic data allows researchers to look for associations between genetic variants and traits that were not necessarily thought to have strong genetic components. With these population-based data, researchers are turning to investigating how genetics relates to respondents’ risk behaviors and the early development of diseases.
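For readers curious about what a single GWAS association test looks like, here is a minimal sketch in Python. It is not the pipeline used in the Nature Genetics article; it only assumes the simplest textbook setup, in which a quantitative trait is regressed on one variant's allele dosage (0, 1, or 2 copies of an allele), with no covariates and a normal approximation for the p-value. The function name `snp_association` and the toy numbers are hypothetical.

```python
import math

def snp_association(dosages, phenotype):
    """Regress a quantitative phenotype on allele dosage for one variant.

    Illustrative only: real GWAS adjust for covariates such as ancestry,
    age, and sex, and repeat this kind of test across millions of variants.
    Returns (effect estimate, standard error, two-sided p-value).
    """
    n = len(dosages)
    mean_x = sum(dosages) / n
    mean_y = sum(phenotype) / n
    sxx = sum((x - mean_x) ** 2 for x in dosages)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(dosages, phenotype))

    beta = sxy / sxx                      # estimated effect per allele copy
    intercept = mean_y - beta * mean_x
    rss = sum((y - intercept - beta * x) ** 2
              for x, y in zip(dosages, phenotype))
    se = math.sqrt(rss / (n - 2) / sxx)   # standard error of the slope

    z = beta / se
    p = math.erfc(abs(z) / math.sqrt(2))  # normal approximation (assumption)
    return beta, se, p

# Hypothetical toy data: six people, their allele counts and a trait value.
toy_dosages = [0, 0, 1, 1, 2, 2]
toy_trait = [1.0, 1.2, 2.1, 1.9, 3.0, 3.2]
beta, se, p = snp_association(toy_dosages, toy_trait)
```

With very large samples, the standard error shrinks, which is why studies of this scale can detect variants with small effects.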

This particular study by these three SRC colleagues and their collaborators evaluated genetic data from multiple studies and biobanks, covering up to 1.2 million individuals. They performed state-of-the-art genetic analysis and meta-analysis to examine the etiology of tobacco and alcohol use. With data on this scale, over 400 loci associated with tobacco use and alcohol consumption were identified; a majority of these were not previously known to be associated with these traits. Identifying these genetic risk factors makes it possible to better estimate a person’s risk for substance addiction using genetic information. It also shows that social science data and methodologies can be effectively integrated with population-level GWAS methods to identify new genetic risk loci. Life history may influence the penetrance of addiction-associated risk alleles, and combining survey methodology with genetic data will help to identify these alleles. This could support better addiction prevention and promote overall social welfare.
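To give a sense of how results from many studies can be combined, the sketch below shows a fixed-effect, inverse-variance-weighted meta-analysis for a single variant. This is a common textbook approach and is offered as an assumption, not as a description of the exact method in the article. The function name `fixed_effect_meta` and the per-study numbers are hypothetical.

```python
import math

def fixed_effect_meta(betas, ses):
    """Combine per-study effect estimates for one variant.

    Each study is weighted by 1 / SE^2, so more precise (usually larger)
    studies count for more. Returns (pooled effect, pooled standard error,
    two-sided p-value from a normal approximation).
    """
    weights = [1.0 / se ** 2 for se in ses]
    total_weight = sum(weights)

    pooled_beta = sum(w * b for w, b in zip(weights, betas)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)

    z = pooled_beta / pooled_se
    p = math.erfc(abs(z) / math.sqrt(2))
    return pooled_beta, pooled_se, p

# Hypothetical per-study results for the same variant.
study_betas = [0.10, 0.20]
study_ses = [0.10, 0.10]
pooled_beta, pooled_se, pooled_p = fixed_effect_meta(study_betas, study_ses)
```

The pooled standard error is smaller than that of either input study, which illustrates why pooling many cohorts and biobanks can reveal loci that no single study could detect on its own.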


Liu, Mengzhen; … Faul, Jessica D.; … Smith, Jennifer A.; … Weir, David R.; … (2019). Association studies of up to 1.2 million individuals yield new insights into the genetic etiology of tobacco and alcohol use. Nature Genetics.