
# ALDH2 genotype modulates the association between alcohol consumption and AST/ALT ratio among middle-aged Japanese men: a genome-wide G × E interaction analysis

### Study subjects

The design of the Tohoku Medical Megabank Community-Based Cohort (TMM CommCohort) study has been described previously22. Briefly, residents aged 20–75 years from Iwate and Miyagi, two Pacific coast prefectures in Northeast Japan, were recruited between May 2013 and March 2016. To control for unmeasured biases, individuals from Miyagi and Iwate were treated as separate sub-cohorts.

Physiological, urine, and blood tests were conducted at the time of enrolment. The levels of GGT, AST, and ALT were measured using standardized clinical laboratory techniques based on the standard protocol of the Japan Society of Clinical Chemistry (JSCC)23.

Medical history and lifestyle, including drinking habits, were documented using self-administered questionnaires. Current drinking status was recorded in four categories: “current drinker (drinking more than once in a month)”, “former drinker”, “never (or almost never) drinker”, and “never drinker because of his/her predisposition to rejecting alcohol.” In this study, we treated only “current drinker” as a drinker and all others (i.e. “former drinker,” “never (or almost never) drinker,” and “never drinker because of his/her predisposition to rejecting alcohol”) as non-drinkers. Drinking frequency (drinking opportunities per week) was reported in six categories: “less than 1 day/month”, “1–3 days/month”, “1–2 days/week”, “3–4 days/week”, “5–6 days/week”, and “every day.” We converted these answers into numeric values of 0, 0.5, 1.5, 3.5, 5.5, and 7 days/week, respectively.

Weekly alcohol consumption (WAC) was defined as the sum of the ethanol content (g) of each type of beverage consumed in a week. The ethanol content of each type of alcoholic beverage was taken as follows: 180 ml sake (rice wine) as 23 g, 180 ml shochu (white spirits) as 36 g, 180 ml chu-hai (cocktail using shochu) as 12.96 g, 633 ml beer as 23 g, 30 ml whisky as 10 g, and 100 ml wine as 12 g24. The daily alcohol consumption (DAC) was calculated by dividing WAC by 7 days. The subjects were stratified by DAC into five tiers, based on standard US drinks (14 g alcohol)25: tier 0 (DAC (drinks/day) < 0.1); tier 1 (0.1 ≤ DAC < 1); tier 2 (1 ≤ DAC < 2); tier 3 (2 ≤ DAC < 3); tier 4 (3 ≤ DAC). Alcohol consumption for non-drinkers was set to 0.
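The conversion from beverage intake to DAC and drinking tier described above can be sketched as follows. This is a minimal illustration and not the authors' code: the function names and the input format (number of reference servings of each beverage per week) are assumptions, since the text does not specify how per-occasion amounts were combined with drinking frequency.

```python
# Ethanol (g) per reference serving, as listed in the text.
ETHANOL_G_PER_SERVING = {
    "sake": 23.0,     # 180 ml
    "shochu": 36.0,   # 180 ml
    "chuhai": 12.96,  # 180 ml
    "beer": 23.0,     # 633 ml
    "whisky": 10.0,   # 30 ml
    "wine": 12.0,     # 100 ml
}

# Numeric days/week assigned to each questionnaire frequency category.
FREQUENCY_DAYS_PER_WEEK = {
    "less than 1 day/month": 0.0,
    "1-3 days/month": 0.5,
    "1-2 days/week": 1.5,
    "3-4 days/week": 3.5,
    "5-6 days/week": 5.5,
    "every day": 7.0,
}

US_STANDARD_DRINK_G = 14.0


def daily_alcohol_g(servings_per_week, is_current_drinker=True):
    """Return DAC in g/day from weekly servings (hypothetical input format)."""
    if not is_current_drinker:
        return 0.0  # non-drinkers are assigned zero consumption
    wac = sum(ETHANOL_G_PER_SERVING[name] * count
              for name, count in servings_per_week.items())
    return wac / 7.0


def dac_tier(dac_g_per_day):
    """Map DAC (g/day) to tiers 0-4 using 14 g standard US drinks."""
    drinks = dac_g_per_day / US_STANDARD_DRINK_G
    if drinks < 0.1:
        return 0
    if drinks < 1:
        return 1
    if drinks < 2:
        return 2
    if drinks < 3:
        return 3
    return 4
```

For instance, a current drinker reporting seven 633 ml beers per week has a WAC of 161 g, a DAC of 23 g/day (about 1.64 standard drinks), and falls in tier 2.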

The study was approved by the Institutional Review Board of Iwate Medical University and Tohoku University. All participants provided written informed consent. This study was conducted according to the principles expressed in the Declaration of Helsinki.

### Genotyping and genotype imputation

Genotyping and genotype imputation were performed as previously described21,26,27. Briefly, 9966 participants in the TMM CommCohort study, enrolled in 2013, were genotyped using a HumanOmniExpressExome BeadChip Array (Illumina Inc., San Diego, CA, USA). Subjects meeting any of the following criteria were excluded from analysis: low call rate (< 0.99), sex mismatch between questionnaire and genotype data, non-Japanese ancestry, or being one member of a close kinship pair (PI_HAT > 0.1875). Sex imputation and the identification of close kinship pairs were conducted using PLINK version 1.90b5.3. Variants with a low call rate (< 0.95), a low Hardy–Weinberg equilibrium exact test P-value (P < 1 × 10⁻⁶), or a low minor allele frequency (MAF; < 0.01) were also excluded. As a result, 1,127 individuals were removed, and 8839 subjects and 594,037 autosomal variants remained. After phasing with SHAPEIT28 version 2.r900, imputation was conducted with Minimac329 version 2.0.1 using the 1000 Genomes Project phase 3 reference panel30. Variants with low imputation quality (R² < 0.8) were excluded. Finally, the remaining 7,129,678 variants were used in subsequent analyses.
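The quality-control thresholds above can be summarised as simple predicates. The sketch below is illustrative only: the function and argument names are hypothetical, the actual filtering was done with PLINK and the imputation software, and the kinship rule is simplified to a check of the largest PI_HAT against already retained samples.

```python
def sample_passes_qc(call_rate, sex_matches, is_japanese, max_pi_hat):
    """Sample-level filter.

    max_pi_hat is assumed to be the largest PI_HAT between this sample
    and any sample already retained, so that only one member of a close
    kinship pair is dropped.
    """
    return (call_rate >= 0.99
            and sex_matches
            and is_japanese
            and max_pi_hat <= 0.1875)


def variant_passes_qc(call_rate, hwe_p, maf):
    """Variant-level filter applied to genotyped variants before imputation."""
    return call_rate >= 0.95 and hwe_p >= 1e-6 and maf >= 0.01


def imputed_variant_passes(r2):
    """Post-imputation filter on imputation quality (R squared)."""
    return r2 >= 0.8
```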

### Genome-wide interaction analysis and meta-analysis

Subjects with missing information on BMI, age, sex, alcohol consumption, or LT (AST, ALT, and GGT levels) were excluded. Additionally, subjects whose LT levels fell outside the range of mean ± 4 standard deviations (SD), or who had a liver illness such as hepatitis B, hepatitis C, liver cancer, or fatty liver disease, were excluded. As a result, 983 individuals were excluded, and 7856 individuals remained. For the linear regression, GGT, AST, and ALT were log-transformed.
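The outlier exclusion and log-transformation steps can be sketched as below. This is a minimal sketch under stated assumptions: the text does not say whether the ± 4 SD range was computed on the raw or log scale, whether the sample or population SD was used, or which log base was applied, so the raw scale, the sample SD, and the natural logarithm are assumed here. The function names are hypothetical.

```python
import math
import statistics


def within_four_sd(values, n_sd=4.0):
    """Keep values inside mean +/- n_sd sample standard deviations."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample SD (assumption)
    low, high = mean - n_sd * sd, mean + n_sd * sd
    return [v for v in values if low <= v <= high]


def log_transform(values):
    """Natural-log transform of strictly positive measurements (assumption)."""
    return [math.log(v) for v in values]
```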

We performed the polymorphism × environment interaction analysis separately in the enrolled Miyagi and Iwate residents. The interaction analysis was performed as previously described21. Briefly, we fitted linear regression models under a null hypothesis (H0), which lacked an interaction term, and an alternative hypothesis (H1), which included the interaction term, as follows:

$$\text{H0}:\; Y = \beta_{0} + \beta_{G} G + \beta_{E} E,$$

$$\text{H1}:\; Y = \beta_{0} + \beta_{G} G + \beta_{E} E + \beta_{GE}\, G \times E,$$

where Y is the LT (AST/ALT ratio, or log-transformed GGT, AST, or ALT), G is the genotype variable, E is the DAC (g/day) variable, β0 is the intercept, βG is the coefficient for variable G, βE is the coefficient for variable E, and βGE is the coefficient for the interaction between G and E. The interaction analysis was adjusted for age, sex, BMI, and population structure in the genotype dataset (top 5 principal components [PCs] calculated using PLINK). The significance of the interaction term (βGE) was evaluated using the 1-df likelihood ratio test20.
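A 1-df likelihood ratio test comparing H0 and H1 can be sketched for a single variant as follows. This is an illustration under assumptions, not the authors' pipeline: it uses ordinary least squares with Gaussian errors, for which the test statistic is n·ln(RSS0/RSS1), and obtains the 1-df chi-square tail probability as erfc(√(stat/2)). The names `ols_rss` and `interaction_lrt` are hypothetical; covariates such as age, sex, BMI, and PCs would be passed as extra columns.

```python
import math


def ols_rss(X, y):
    """Residual sum of squares of a least-squares fit (normal equations)."""
    p = len(X[0])
    # Augmented normal equations [X'X | X'y].
    A = [[sum(row[i] * row[j] for row in X) for j in range(p)]
         + [sum(row[i] * yi for row, yi in zip(X, y))]
         for i in range(p)]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(p):
        pivot = max(range(c, p), key=lambda r: abs(A[r][c]))
        A[c], A[pivot] = A[pivot], A[c]
        for r in range(p):
            if r != c:
                factor = A[r][c] / A[c][c]
                A[r] = [a - factor * b for a, b in zip(A[r], A[c])]
    beta = [A[i][p] / A[i][i] for i in range(p)]
    return sum((yi - sum(b * x for b, x in zip(beta, row))) ** 2
               for row, yi in zip(X, y))


def interaction_lrt(y, g, e, covariates=None):
    """Return (LRT statistic, P-value) for the G x E term of one variant."""
    n = len(y)
    covs = covariates if covariates is not None else [[] for _ in range(n)]
    x_null = [[1.0, g[i], e[i]] + list(covs[i]) for i in range(n)]   # H0
    x_alt = [row + [g[i] * e[i]] for i, row in enumerate(x_null)]    # H1
    rss0, rss1 = ols_rss(x_null, y), ols_rss(x_alt, y)
    stat = max(n * math.log(rss0 / rss1), 0.0)
    p_value = math.erfc(math.sqrt(stat / 2.0))  # chi-square, 1 df
    return stat, p_value
```

In a genome-wide scan this test would be repeated for every variant within each sub-cohort; a production analysis would use a numerically robust solver rather than the hand-written elimination shown here.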

The summary statistics of the genome-wide interaction analysis for each prefectural population were combined by inverse-variance-weighted meta-analysis using METAL (released on 2011-03-25)31. After genomic control correction, variants with Pmeta < 5 × 10⁻⁸ were considered genome-wide significant.
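The fixed-effect, inverse-variance-weighted combination of per-cohort estimates works as sketched below. This is a generic illustration rather than METAL's code: it assumes each cohort contributes an interaction estimate and its standard error, it omits the genomic control step, and the function name is hypothetical.

```python
import math


def inverse_variance_meta(betas, ses):
    """Combine per-cohort effect estimates with weights 1 / SE^2.

    Returns the pooled estimate, its standard error, and a two-sided
    P-value from the normal approximation.
    """
    weights = [1.0 / se ** 2 for se in ses]
    total = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / total
    se = math.sqrt(1.0 / total)
    z = beta / se
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return beta, se, p_value
```

With two cohorts that each estimate 0.1 with a standard error of 0.02, the pooled estimate remains 0.1, the pooled standard error shrinks to 0.02/√2, and the P-value falls below the genome-wide threshold of 5 × 10⁻⁸.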

### Replication analysis

For our replication study, we used the pre-imputed dataset released by the TMM32,33. Within this dataset, we used the subsets genotyped using the Omni2.5 SNP array (Illumina Inc., San Diego, CA, USA) and the customized genotyping array designed by the TMM on the Axiom platform (Thermo Fisher Scientific, Waltham, MA, USA), denoted as Japonica array version 2 (JPAv2). The genotype data were pre-phased using SHAPEIT version 2 r837 and imputed using IMPUTE2 version 2.2.2 and 2KJPN with an allele frequency panel of ~2000 Japanese individuals34,35,36. After applying the same quality control as for the main dataset, 4,935,024 and 5,686,147 variants in the JPAv2 and Omni2.5 datasets, respectively, were selected for further analysis.

Replication analysis was conducted using the same exclusion criteria as those for the main analysis. Additionally, we excluded the Miyagi population from the subjects genotyped by JPAv2 because of its small sample size (n = 678). All subjects in the dataset genotyped by Omni2.5 belonged to the Miyagi population. Ultimately, 2791 and 1597 individuals from the JPAv2 and Omni2.5 datasets, respectively, were selected for replication analysis.

### Power calculation

The power calculation was conducted as previously reported21. Briefly, we assumed that the residuals of age-, sex-, and BMI-adjusted LT followed the genetic model LT = βE·E + βG×E·(G × E), where variable E (alcohol consumption) was sampled from a normal distribution and variable G (genotype) was sampled according to an assumed minor allele frequency (20% or 50%). The model parameter βE was estimated from the dataset used in the present study, and βG×E was assumed to be 0.25- to 2.5-fold βE. We simulated data for E and G for the Iwate and Miyagi populations, performed inverse-variance-weighted meta-analysis, and recorded whether the interaction term achieved suggestive significance. This process was repeated for 1000 iterations to calculate the power for each parameter set.
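The simulation loop described above can be sketched as follows. Several details are assumptions, because the text does not give them: E is drawn from a standard normal distribution, the residual noise SD is a free parameter, genotypes are drawn as two independent allele draws, the interaction is tested with a Wald z-test instead of the likelihood ratio test, and the cohort sizes and suggestive-significance threshold (`alpha`) are supplied by the caller. All function and parameter names are hypothetical.

```python
import math
import random


def fit_interaction(e, g, y):
    """OLS for y = bE*E + bGE*(G*E), without intercept since y is a residual.

    Returns the interaction estimate and its standard error.
    """
    ge = [gi * ei for gi, ei in zip(g, e)]
    s11 = sum(a * a for a in e)
    s22 = sum(a * a for a in ge)
    s12 = sum(a * b for a, b in zip(e, ge))
    s1y = sum(a * b for a, b in zip(e, y))
    s2y = sum(a * b for a, b in zip(ge, y))
    det = s11 * s22 - s12 * s12
    b_e = (s22 * s1y - s12 * s2y) / det
    b_ge = (s11 * s2y - s12 * s1y) / det
    rss = sum((yi - b_e * ei - b_ge * xi) ** 2
              for yi, ei, xi in zip(y, e, ge))
    sigma2 = rss / (len(y) - 2)
    return b_ge, math.sqrt(sigma2 * s11 / det)


def simulate_power(beta_e, fold, maf, cohort_sizes, alpha,
                   n_iter=1000, noise_sd=1.0, seed=0):
    """Fraction of simulated meta-analyses whose interaction P-value < alpha."""
    rng = random.Random(seed)
    beta_ge = fold * beta_e  # interaction effect as a multiple of beta_e
    hits = 0
    for _ in range(n_iter):
        num = den = 0.0
        for n in cohort_sizes:  # e.g. one entry per prefecture
            e = [rng.gauss(0.0, 1.0) for _ in range(n)]
            g = [(rng.random() < maf) + (rng.random() < maf) for _ in range(n)]
            y = [beta_e * ei + beta_ge * gi * ei + rng.gauss(0.0, noise_sd)
                 for ei, gi in zip(e, g)]
            b, se = fit_interaction(e, g, y)
            num += b / se ** 2   # inverse-variance weighting
            den += 1.0 / se ** 2
        z = (num / den) / math.sqrt(1.0 / den)
        if math.erfc(abs(z) / math.sqrt(2.0)) < alpha:
            hits += 1
    return hits / n_iter
```

Calling `simulate_power` over a grid of `fold` values (0.25 to 2.5) and both minor allele frequencies would give one power estimate per parameter set, mirroring the design in the text.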

### Estimation of genetic correlation and LD score regression intercept

We estimated the genetic correlation and LD score regression intercept using LDSC37,38 version 1.0.1 and the pre-computed LD scores for East Asians provided by the program developer. The LD score regression intercept was calculated using summary statistics. To estimate the genetic correlation between the LT traits, summary data from a published GWAS conducted in a Japanese population were used19.

### Statistical analysis for polymorphism × environment interaction

The statistical analysis of the identified variants was conducted using R (version 3.5.1). To test for a trend in quantitative traits, the Jonckheere-Terpstra test was performed using the clinfun package (version 1.0.15). To calculate the adjusted LT, we added the mean LT to the residual of the linear regression adjusted for age, sex, BMI, and genotype, as described in a previous study21.
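The idea behind the Jonckheere-Terpstra trend test can be illustrated with the sketch below. It is not the clinfun implementation: it computes the J statistic by direct pairwise comparison and uses the large-sample normal approximation without a tie correction, which is an assumption made for brevity. The function name is hypothetical.

```python
import math


def jonckheere_terpstra(groups):
    """Trend test across ordered groups (e.g. drinking tiers 0 to 4).

    groups: list of lists of observations, ordered by the hypothesised trend.
    Returns the J statistic, the z-score, and a two-sided P-value.
    """
    j = 0.0
    for a in range(len(groups)):
        for b in range(a + 1, len(groups)):
            for x in groups[a]:
                for y in groups[b]:
                    # Count pairs ordered with the trend; ties count half.
                    j += 1.0 if x < y else 0.5 if x == y else 0.0
    sizes = [len(grp) for grp in groups]
    n = sum(sizes)
    mean = (n * n - sum(m * m for m in sizes)) / 4.0
    var = (n * n * (2 * n + 3)
           - sum(m * m * (2 * m + 3) for m in sizes)) / 72.0
    z = (j - mean) / math.sqrt(var)
    p_two_sided = math.erfc(abs(z) / math.sqrt(2.0))
    return j, z, p_two_sided
```

For three perfectly ordered groups `[1, 2, 3]`, `[4, 5, 6]`, and `[7, 8, 9]`, every between-group pair follows the trend, so J = 27 against an expected value of 13.5 and a standard deviation of 4.5, giving z = 3.0. The pairwise loop is quadratic in the sample size, so it is meant for illustration rather than cohort-scale data.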

### Data availability

The datasets analyzed in the current study are not publicly available for ethical reasons. However, they can be made available upon request after approval from the Ethical Committee of Iwate Medical University, the Ethical Committee of Tohoku University, and the Materials and Information Distribution Review Committee of the TMM Project.