CRGGH Research

The Center for Research on Genomics and Global Health (CRGGH) uses genomic tools to understand the pathobiology of metabolic disorders, including obesity, hypertension, diabetes, dyslipidemia, and kidney diseases, in different human populations with particular attention to African Americans and Africans. Recent African origin populations, such as African Americans, provide unique opportunities to study how "old" genes interact with "new" environments in the evolution of common complex traits.

Overview

Taking advantage of the huge contrast in the distribution of risk factors at the genetic and environmental levels in contemporary African populations, the CRGGH is developing biological and statistical models to gain fundamental insights into disease etiology, differential distribution of disease, and variable drug response. The CRGGH is particularly interested in generating data that will allow novel inquiries into the evolutionary context of complex human diseases. For example, CRGGH investigators strive to understand how evolutionary forces such as pathogens, climate, and ancient food scarcity have shaped human genomes and how these genomic modifications predispose individuals and populations to chronic complex diseases such as diabetes and cardiovascular disorders.

Data generated in the CRGGH will continue to inform discussions surrounding complicated issues such as health disparities and whether the high rate of diseases like diabetes, hypertension, and obesity among African Americans and other minority groups in the United States is the result of exposure to higher levels of environmental risk factors, an increased genetic susceptibility, or an interaction between adverse environments and deleterious variant load.

To answer these questions, the CRGGH is developing multiple genetic epidemiology projects in the United States, China, and several countries in Africa. View a summary description of our research projects below.

In addition, we maintain productive relationships with our collaborators and participate in consortia.

Publications

View a list of publications led by or in collaboration with the Center for Research on Genomics and Global Health.

Our Lab

The role of the Center for Research on Genomics and Global Health (CRGGH) laboratory is to manage the center's large biorepository and to design and perform all key "wet-lab" experiments related to CRGGH activities. Located on the fifth floor of Building 50, Room 5531 (separate from the CRGGH office suite also on the NIH main campus), the laboratory is managed and maintained by Ayo Doumatey, Ph.D. and staffed by a research associate (Lin Lei) with expertise in chemistry and molecular biology as well as genetic and genomic techniques. The laboratory is also home to many of CRGGH's trainees.

While space is dedicated for CRGGH wet-lab activities, research projects also make use of a number of shared resources offered by NHGRI, including the Division of Intramural Research core instruments and services. Below are brief descriptions of just a few of the many cellular and molecular techniques currently employed by the CRGGH lab:

Bioassay/ Biomarker analysis

The CRGGH laboratory uses a number of technologies to measure the levels of selected biomolecules in serum (e.g., adipokines) or other biological fluids. The major methods include commercially available ELISA kits and the Bio-Plex® Suspension Array System (Bio-Rad).

Genotyping

Customized genotyping using a mid-throughput platform, namely the Sequenom® technology, which allows multiplexing of up to 32 SNPs in a single reaction across thousands of individual samples.
Drug metabolite array such as the Affymetrix DMET™ chip, which supports genotyping of about 1936 variations in 225 genes associated with the absorption, distribution, metabolism and excretion (ADME) of pharmaceutical drugs.

Gene expression

RT-PCR for genes of interest using Applied Biosystems chemistry on BIO-RAD instruments (MyiQ).
Whole genome gene expression array, such as the GeneChip® Human genome U133 array from Affymetrix.

DNA and RNA extraction from peripheral blood cells and multiple tissue sources

Automated extraction (EZ1 instrument and consumables from QIAGEN)
Manual extraction using commercially available kits.

Whole genome amplification (WGA)

The CRGGH laboratory also does a large number of sample preparations for clinical assays and ongoing collaborative activities.

Research Resources

Genome-wide Summary Statistics

We provide a text file (GZ) containing results of a variance QTL scan of 16,503,295 markers for body mass index in 22,805 African Americans. We report the p-values from the Fligner-Killeen test of homogeneity of variance.

Shriner D, Bentley AR, Doumatey AP, Zhou J, Chen G, Rotimi CN, Adeyemo AA. Three loci affecting variance of body mass index in African Americans and sub-Saharan Africans.
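
For readers unfamiliar with the Fligner-Killeen test, the following is a minimal sketch of the median-centered version of the statistic using only the Python standard library. It assumes exactly three groups (for example, the three genotype classes at one marker), so the chi-squared reference distribution has 2 degrees of freedom. The function name and the toy values are hypothetical; this is not the pipeline used to produce the released file.

```python
import math
from statistics import NormalDist, median


def fligner_killeen_3groups(groups):
    """Median-centered Fligner-Killeen statistic and p-value for exactly three groups."""
    if len(groups) != 3:
        raise ValueError("this sketch handles exactly three groups (2 degrees of freedom)")

    # Absolute deviation of each observation from its own group median.
    deviations = []  # (absolute deviation, group index)
    for g, values in enumerate(groups):
        m = median(values)
        deviations.extend((abs(v - m), g) for v in values)

    n_total = len(deviations)
    order = sorted(range(n_total), key=lambda i: deviations[i][0])

    # Pooled 1-based ranks; tied deviations share their mean rank.
    ranks = [0.0] * n_total
    i = 0
    while i < n_total:
        j = i
        while j + 1 < n_total and deviations[order[j + 1]][0] == deviations[order[i]][0]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1

    # Convert ranks to normal scores.
    nd = NormalDist()
    scores = [nd.inv_cdf(0.5 + r / (2 * (n_total + 1))) for r in ranks]

    grand_mean = sum(scores) / n_total
    variance = sum((s - grand_mean) ** 2 for s in scores) / (n_total - 1)

    # Weighted squared distance of each group's mean score from the grand mean.
    statistic = 0.0
    for g in range(3):
        group_scores = [s for s, (_, gi) in zip(scores, deviations) if gi == g]
        group_mean = sum(group_scores) / len(group_scores)
        statistic += len(group_scores) * (group_mean - grand_mean) ** 2
    statistic /= variance

    # With 2 degrees of freedom the chi-squared survival function is exp(-x / 2).
    p_value = math.exp(-statistic / 2)
    return statistic, p_value


# Toy trait values for three hypothetical genotype groups (illustrative only).
toy_groups = [[20, 21, 22, 23, 24], [20, 21, 22, 23, 24], [10, 16, 22, 28, 34]]
print(fligner_killeen_3groups(toy_groups))
```

A small p-value suggests the groups differ in spread rather than in location, which is the question a variance QTL scan asks at each marker.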

Epidemiologic Cohorts

The CRGGH has developed multiple genetic epidemiology projects in the United States, Africa (Nigeria, Ghana, Kenya, and Ethiopia), and China. As a result of the successful implementation of these multi-national and multi-institutional collaborations, we currently have clinical/phenotypic data on over 10,000 people and ~70,000 biological (plasma, serum, DNA, urine and tissue) samples.

Howard University Family Study (HUFS)

The Howard University Family Study is a genetic epidemiology study of African Americans from the Washington DC metropolitan area. Phase I of the HUFS aimed to enroll a randomly ascertained population-based cohort of 350 African American (AA) families with a minimum of five people per family. Families were not selected based on any phenotype, making it possible to investigate the genetic and environmental basis of multiple traits. Phase II enrolled unrelated African Americans (AAs) from the same communities to facilitate the conduct of genome-wide association studies. Data generated from this cohort include multiple cardiometabolic phenotypes, genotypes from the Affymetrix Genome-Wide Human SNP 6.0 array, exome array genotypes, targeted dense SNP genotypes on selected genomic regions, and whole exome sequence data.

The proteomics data were obtained using a shotgun approach.

MHO n16553 Data Resource (ZIP)

A description of the methods used to generate the data can be found in: Doumatey et al. "Pro-inflammatory and lipid biomarkers mediate the metabolically healthy obesity phenotype: A shotgun proteomics study." Obesity 2016 (accepted for publication).

Rows correspond to distinct peptides.
Columns correspond to peptide counts in different experimental samples.
20 experimental samples in all:

10 MHO cases: MHO-01 through MHO-10
10 MAO controls: MAO-01 through MAO-10

Sample type: serum
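
The layout described above (one row per peptide, one count column per sample) can be summarized per group as in the sketch below. The source does not state the file format inside the ZIP archive, so the tab delimiter and the `peptide` identifier column are assumptions, and the two-peptide table is synthetic, for illustration only.

```python
import csv
import io

# Sample labels as described above: 10 MHO cases and 10 MAO controls.
MHO_SAMPLES = [f"MHO-{i:02d}" for i in range(1, 11)]
MAO_SAMPLES = [f"MAO-{i:02d}" for i in range(1, 11)]


def group_totals(handle, id_column="peptide", delimiter="\t"):
    """Return {peptide: (total MHO count, total MAO count)} from a peptide-by-sample table."""
    totals = {}
    for row in csv.DictReader(handle, delimiter=delimiter):
        mho = sum(int(row[s]) for s in MHO_SAMPLES)
        mao = sum(int(row[s]) for s in MAO_SAMPLES)
        totals[row[id_column]] = (mho, mao)
    return totals


# Synthetic two-peptide table (not real data); the column layout is assumed.
header = ["peptide"] + MHO_SAMPLES + MAO_SAMPLES
rows = [
    ["PEPTIDE_A"] + ["2"] * 10 + ["1"] * 10,
    ["PEPTIDE_B"] + ["0"] * 10 + ["3"] * 10,
]
demo = "\n".join("\t".join(r) for r in [header] + rows)
print(group_totals(io.StringIO(demo)))
# prints {'PEPTIDE_A': (20, 10), 'PEPTIDE_B': (0, 30)}
```

To use it on the real resource, the delimiter and identifier column name would need to be adjusted to match the downloaded file.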

The Africa America Diabetes Mellitus (AADM) Study

The Africa America Diabetes Mellitus (AADM, pronounced "Adam") Study is the longest running genetic epidemiology study of type 2 diabetes (T2D) in Africa. The purpose of this project is to map T2D genes in West Africa, the geographical origin of most African Americans. The initial phase of the project enrolled 991 individuals (400 affected sib-pairs with T2D and 191 controls) from two centers in Ghana and three centers in Nigeria for genome-wide linkage analysis. Subsequent phases supported the recruitment of extended pedigrees of the affected sibling pairs as well as controls, resulting in a larger sample size (n > 3,600). The resources that have been generated on this sample include multiple cardiometabolic phenotypes, genotypes from the Affymetrix Axiom PanAFR array, exome array genotypes from the Affymetrix Exome 319 array, genome-wide and targeted microsatellite (STR) data, dense genotypes on selected genomic regions, and whole exome sequence data.

The Genetics of T2D and Related Complications in China

This genetic epidemiology study of T2D in China was designed to enroll well-characterized T2D cases and controls in Suizhou, China. Approximately 1,500 cases of T2D and 1,500 controls have been enrolled and characterized for our standard panel of clinical and anthropometric variables, laboratory assays on multiple biochemical parameters, and diabetes-related complications. Exome array data have been generated on this sample.

Software/Code and Other Resources

SCARVAsnp

SCARVAsnp is a C language program developed to detect rare variants within a given region. Briefly, it is implemented in four stages. First, all common variants in a pre-specified region (e.g., a gene) are evaluated individually. Second, a union procedure is used to combine all rare variants (RVs) in the index region, and the ratio of the log likelihood with one RV excluded to the log likelihood of a model with all the collapsed RVs is calculated. On the basis of previously reported simulation studies, a likelihood ratio ≥ 1.3 is considered statistically significant. Third, the direction of the association of the removed RV is determined by evaluating the change in λ values with the inclusion and exclusion of that RV. Lastly, significant common and rare variants, along with covariates, are included in a final regression model to evaluate the association between the trait and variants in that region. The program executable, readme, and sample files are available here (ZIP).

Chen G, Yuan A, Zhou Y, Bentley AR, Zhou J, Chen W, Shriner D, Adeyemo A, Rotimi CN. Simultaneous Analysis of Common and Rare Variants in Complex Traits: Application to SNPs (SCARVAsnp). Bioinform Biol Insights, 6:177-185. 2012. [PubMed]
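
The union (collapsing) step of the second stage, together with the leave-one-out exclusion it relies on, can be illustrated with the short sketch below. It assumes genotypes coded as minor-allele counts (0, 1, or 2) per individual; the function names, variant labels, and toy values are hypothetical. It is not the SCARVAsnp C implementation, and the likelihood calculations are omitted.

```python
def collapse_union(genotypes, rare_variants):
    """Return one 0/1 indicator per individual: 1 if any listed rare variant is carried."""
    n_individuals = len(next(iter(genotypes.values())))
    return [
        int(any(genotypes[v][i] > 0 for v in rare_variants))
        for i in range(n_individuals)
    ]


def leave_one_out_collapses(genotypes, rare_variants):
    """Collapse all rare variants, then collapse again with each one excluded in turn."""
    full = collapse_union(genotypes, rare_variants)
    reduced = {
        v: collapse_union(genotypes, [u for u in rare_variants if u != v])
        for v in rare_variants
    }
    return full, reduced


# Toy minor-allele counts for four individuals (illustrative values only).
genotypes = {
    "rv1": [0, 1, 0, 0],
    "rv2": [0, 0, 1, 0],
    "rv3": [0, 1, 0, 0],
}
full, reduced = leave_one_out_collapses(genotypes, ["rv1", "rv2", "rv3"])
print(full)            # [0, 1, 1, 0]
print(reduced["rv2"])  # [0, 1, 0, 0]
```

In this toy example, excluding `rv2` changes the collapsed indicator for the third individual, whereas excluding `rv1` changes nothing because the same individual also carries `rv3`. Comparing models fitted to the full and reduced indicators is what lets each rare variant's contribution be assessed.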

Corrected Tracy-Widom Test

This publication presents a correction to the Tracy-Widom test for population structure. A function implementing the corrected Tracy-Widom test (ZIP) authored by Daniel Shriner, Ph.D. (CRGGH) for the R environment is available.

Shriner D. Improved Eigenanalysis of Discrete Subpopulations and Admixture Using the Minimum Average Partial Test. Hum Hered, 73:73-83. 2012. [PubMed]

Ancestry

Shriner D, Tekola-Ayele F, Adeyemo A, Rotimi CN. Genome-wide genotype and sequence-based reconstruction of the 140,000 year history of modern human ancestry. Sci Rep 4:6055. 2014. [PubMed]

Ancestry_2014.bed
Ancestry_2014.bim
Ancestry_2014.fam
Ancestry_2014_cluster.txt

Download All (ZIP)
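
The .bed, .bim, and .fam extensions match those of a PLINK 1 binary fileset. Assuming that is the format (the page does not say so explicitly), the sketch below reads the two text tables and decodes the two-bit genotype calls using only the Python standard library. The function names are hypothetical.

```python
import math

# PLINK 1 .bed two-bit codes mapped to the count of the second .bim allele; None = missing.
CODE_TO_A2_COUNT = {0b00: 0, 0b01: None, 0b10: 1, 0b11: 2}
BED_MAGIC = bytes([0x6C, 0x1B, 0x01])  # final byte 0x01 marks the variant-major layout


def read_table(path):
    """Read a whitespace-delimited .bim or .fam file into a list of column lists."""
    with open(path) as handle:
        return [line.split() for line in handle if line.strip()]


def decode_bed(data, n_samples, n_variants):
    """Decode variant-major .bed bytes into one list of allele counts per variant."""
    if data[:3] != BED_MAGIC:
        raise ValueError("not a variant-major PLINK 1 .bed file")
    bytes_per_variant = math.ceil(n_samples / 4)
    if len(data) != 3 + bytes_per_variant * n_variants:
        raise ValueError("file size does not match the .bim/.fam dimensions")
    genotypes = []
    for v in range(n_variants):
        start = 3 + v * bytes_per_variant
        block = data[start:start + bytes_per_variant]
        calls = []
        for s in range(n_samples):
            # Four samples per byte, first sample in the two lowest-order bits.
            code = (block[s // 4] >> (2 * (s % 4))) & 0b11
            calls.append(CODE_TO_A2_COUNT[code])
        genotypes.append(calls)
    return genotypes


def load_fileset(prefix):
    """Load <prefix>.fam, <prefix>.bim, and <prefix>.bed; return (samples, variants, genotypes)."""
    fam = read_table(prefix + ".fam")
    bim = read_table(prefix + ".bim")
    with open(prefix + ".bed", "rb") as handle:
        genotypes = decode_bed(handle.read(), len(fam), len(bim))
    return fam, bim, genotypes
```

With the three files downloaded into the working directory, `load_fileset("Ancestry_2014")` would return the sample rows, the variant rows, and a variant-by-sample list of allele counts. For large filesets, a dedicated tool such as PLINK itself is a better choice than this pure-Python decoder.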

Projection Analysis

We provide a reference panel (ZIP) of ancestral allele frequencies for projection analysis as implemented by the software ADMIXTURE, version 1.3. The positions and marker names are consistent with the 1000 Genomes Project, phase 3 and the GRCh37 assembly. All frequencies refer to allele A2.

Baker JL, Rotimi CN, Shriner D. Human ancestry correlates with language and reveals that race is not an objective genomic classifier.

Last updated: March 4, 2024