Genomic analysis of lean individuals with NAFLD identifies monogenic disorders in a prospective cohort study

Background & Aims: Lean patients with non-alcoholic fatty liver disease (NAFLD) represent 10–20% of the affected population and may have heterogeneous drivers of disease. We have recently proposed the evaluation of patients with lean NAFLD without visceral adiposity for rare monogenic drivers of disease. Here, we aimed to validate this framework in a well-characterised cohort of patients with biopsy-proven NAFLD by performing whole exome sequencing.
Methods: This prospective study included 124 patients with biopsy-proven NAFLD and paired liver biopsies who underwent standardised research visits including advanced magnetic resonance imaging (MRI) assessment of liver fat and stiffness.
Results: Six patients with lean NAFLD were identified and underwent whole exome sequencing. Two lean patients (33%) were identified to have monogenic disorders. The lean patients with monogenic disorders had similar age, and anthropometric and MRI characteristics to lean patients without a monogenic disorder. Patient 1 harbours a rare homozygous pathogenic mutation in ALDOB (aldolase B) and was diagnosed with hereditary fructose intolerance. Patient 2 harbours a rare heterozygous mutation in apolipoprotein B (APOB). The pathogenicity of this APOB variant (p.Val1856CysfsTer2) was further validated in the UK Biobank and associated with lower circulating APOB levels (beta = -0.51 g/L, 95% CI -0.65 to -0.36 g/L, p = 1.4 × 10⁻¹¹) and higher liver fat on MRI (beta = +10.4%, 95% CI 4.3–16.5%, p = 8.8 × 10⁻⁴). Hence, patient 2 was diagnosed with heterozygous familial hypobetalipoproteinaemia.
Conclusions: In this cohort of well-characterised patients with lean NAFLD without visceral adiposity, 33% (2/6) had rare monogenic drivers of disease, highlighting the importance of genomic analysis in this NAFLD subtype.
Impact and Implications: Although most people with non-alcoholic fatty liver disease (NAFLD) are overweight or obese, a subset are lean and may have unique genetic mutations that cause their fatty liver disease. We show that 33% of study participants with NAFLD who were lean harboured unique mutations that cause their fatty liver, and that these mutations had effects beyond the liver. This study demonstrates the value of genetic assessment of NAFLD in lean individuals to identify distinct subtypes of disease.


Introduction
The prevalence of non-alcoholic fatty liver disease (NAFLD) continues to grow, affecting an estimated 60-80 million people in the United States, and a subset of patients will progress to advanced liver disease including cirrhosis and hepatocellular carcinoma. 1 Patients with NAFLD who are lean represent 10-20% of the total population, and observational studies have yielded conflicting results with regard to disease severity and prognosis, 2,3 which may be related to more heterogeneous drivers of disease.
A previous study demonstrating the clinical utility of genomic analysis in the diagnosis and management of adult patients with liver disease of unknown aetiology revealed previously unappreciated monogenic disorders in three non-obese patients with NAFLD. 4 This finding led us to propose a framework for genomic evaluation of lean patients with NAFLD who lack visceral adiposity, to identify rare genetic variants that may have therapeutic implications and elucidate additional pathogenic mechanisms. 5 Here, we performed whole exome sequencing (WES) on lean individuals from a well-phenotyped longitudinal cohort with biopsy-proven NAFLD to evaluate for rare, monogenic drivers of disease.

Patients and methods
This is a longitudinal study derived from a well-characterised prospective cohort of patients with biopsy-proven NAFLD and paired liver biopsies who underwent a standard research visit that included history, physical examination, biochemical testing, and paired liver biopsy assessment (using the Non-alcoholic Steatohepatitis Clinical Research Network histologic scoring system) at the University of California San Diego (UCSD) NAFLD Research Center from 2006 through 2019. All patients provided written informed consent before enrolling in the study, and the study was approved by the UCSD Institutional Review Board. At baseline, all patients underwent a standardised clinical evaluation including detailed history, anthropometric exam, and laboratory testing at the UCSD NAFLD Research Center. Patients ≥18 years of age with biopsy-proven NAFLD were included and were identified as lean NAFLD by BMI ≤25 kg/m² for non-Asians and ≤23 kg/m² for Asians. Germline DNA was extracted from blood samples using standard methods. Germline DNA was captured using xGen exome V2 exome enrichment probes (Integrated DNA Technologies, Coralville, Iowa) and sequenced using the Illumina NovaSeq platform (San Diego, California). The apolipoprotein B (APOB) rare variant validation was performed in the UK Biobank. All analyses were performed using R 3.6.0 (R Foundation for Statistical Computing, Vienna, Austria). Additional details are provided in the Supplementary information.
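The lean NAFLD definition above amounts to an ancestry-specific BMI cutoff. A minimal Python sketch of that rule follows; the function name and the boolean `is_asian` flag are hypothetical, and the comparison is read here as "at or below" the cutoff, which is an assumption about the source's inequality sign.

```python
def is_lean(bmi_kg_m2: float, is_asian: bool) -> bool:
    """Return True if BMI falls in the lean range used for this cohort.

    Assumption: the cutoff is inclusive (BMI <= 25 kg/m^2 for non-Asian
    and <= 23 kg/m^2 for Asian participants).
    """
    cutoff = 23.0 if is_asian else 25.0
    return bmi_kg_m2 <= cutoff


# Illustrative values only: the same BMI can be lean under one cutoff
# and not under the other.
print(is_lean(24.96, is_asian=False))  # lean under the non-Asian cutoff
print(is_lean(24.96, is_asian=True))   # not lean under the Asian cutoff
```

If the intended inequality is strict, only the comparison operator in the final line of the function needs to change.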

Results
Of 124 participants with biopsy-proven NAFLD who had longitudinal follow-up, 6 six patients with lean NAFLD, defined as a BMI ≤25 kg/m² for non-Asians and ≤23 kg/m² for Asians, were identified. Lean and non-lean participants were similar with regard to age, sex, diabetes status, laboratory parameters, liver histology, and magnetic resonance imaging proton density fat fraction (MRI-PDFF) and magnetic resonance elastography (MRE) (Table S1). There was no difference in longitudinal change in histology, MRI-PDFF, or MRE between patients with lean NAFLD and those with non-lean NAFLD (Table S2). Six participants with lean NAFLD underwent whole exome sequencing (WES) of germline DNA (Table S3). Using the WES analysis pipeline (Fig. S1), we identified a monogenic disorder in two of these adult lean individuals with NAFLD (Table S4). Lean patients with monogenic disorders were of similar age and BMI and had similar fasting insulin levels to lean patients without an identified monogenic disorder (Table 1). None of the patients in this study with biopsy-proven lean NAFLD harboured the protective rare variant in cell death-inducing DFFA-like effector B.
Diagnosis of hereditary fructose intolerance in patient 1
Patient 1 had a BMI of 21.3 kg/m², with biopsy-proven non-alcoholic steatohepatitis with stage 2 fibrosis and MRI-PDFF of 24% consistent with severe steatosis. She was found to harbour a rare homozygous missense variant (chr9:104189856; C>G; p.Ala150Pro) in ALDOB, which encodes aldolase B. Biallelic variants in this gene cause hereditary fructose intolerance. Aldolase B is the enzyme responsible for catalysing the cleavage of fructose 1,6-bisphosphate into glyceraldehyde 3-phosphate and dihydroxyacetone phosphate, and of fructose 1-phosphate into glyceraldehyde and dihydroxyacetone phosphate. Given the toxic metabolite accumulation as a result of the ingestion of fructose or sucrose, affected patients may present with hypoglycaemia, hepatic steatosis, and proximal renal tubulopathy. 7 This variant was predicted to be damaging by the additional in silico prediction models MetaSVM, SIFT, and PolyPhen-2 and has been reported as pathogenic in the ClinVar National Center for Biotechnology Information database. Experimental studies have shown that this missense mutation reduces substrate affinity and enzyme stability and activity within aldolase B. 8 This variant in homozygosity or compound heterozygosity has been described in individuals affected with hereditary fructose intolerance. This patient reported nausea, abdominal pain, and hypoglycaemia exacerbated by fruit intake, consistent with hereditary fructose intolerance. The patient had no family history of hereditary fructose intolerance in her first-degree relatives.
Diagnosis of FHBL in patient 2
Patient 2 had a BMI of 24.96 kg/m², with biopsy-proven non-alcoholic steatohepatitis with stage 3 fibrosis and MRI-PDFF of 14%. WES of germline DNA from patient 2 revealed a heterozygous frameshift variant (chr2:21234172, AAC>A; p.Val1856CysfsTer2) in APOB. APOB is the primary apolipoprotein of chylomicrons and VLDL, IDL, and LDL particles. 9 Familial hypobetalipoproteinaemia (FHBL) presents with low circulating lipid levels and increased hepatic steatosis. 4,5 This frameshift variant has been reported as pathogenic in the ClinVar National Center for Biotechnology Information (NCBI) database and previously associated with FHBL, but it has also been annotated as likely benign in the ClinVar NCBI database. Hence, we went back to the patient to perform genotype-phenotype correlation. 4,10 In addition to hepatic steatosis, the patient had low circulating lipid levels (LDL = 47 mg/dl, total cholesterol = 102 mg/dl, and triglycerides = 66 mg/dl). The patient's APOB level was evaluated and was low at 39 mg/dl. She had no family history of hypobetalipoproteinaemia in first-degree relatives. We next studied the first 200,643 exome-sequenced participants of the UK Biobank to better characterise the clinical significance of the p.Val1856CysfsTer2 variant in APOB. 11 Following a previously described genetic and sample quality control pipeline, 12 we identified 14 (0.007%) heterozygous carriers of p.Val1856CysfsTer2, two of whom returned for a follow-up imaging visit for MRI-derived liver fat measurement. Carriers of p.Val1856CysfsTer2 had lower APOB levels (beta = -0.51 g/L, 95% CI -0.65 to -0.36 g/L, p = 1.4 × 10⁻¹¹), higher liver fat (beta = +10.4%, 95% CI 4.3-16.5%, p = 8.8 × 10⁻⁴), and a trend toward higher alanine aminotransferase (ALT) (beta = +6.7 U/L, 95% CI -0.4 to 13.8 U/L, p = 0.07).
We next studied participants who carried loss-of-function transcript effect estimator (LOFTEE)-derived high-confidence predicted loss-of-function (LOFHC) variants in APOB, excluding p.Val1856CysfsTer2. We observed 280 heterozygote carriers of LOFHC variants in APOB (21 with liver imaging) across 104 variants, all with minor allele frequency less than 0.01%. Associations with ALT, APOB, and liver fat in these carriers were comparable with those of the p.Val1856CysfsTer2 variant (Fig. 1). In addition, UK Biobank participants with low serum APOB levels were more likely to harbour LOFHC variants in APOB and microsomal triglyceride transfer protein (MTTP) (Table S5). Finally, we evaluated the interaction between BMI and APOB variants combining p.Val1856CysfsTer2 with all other LOFHC variants in APOB and found a positive interaction term for both ALT (p = 0.008) and liver fat (p = 6.3 × 10⁻⁵), suggesting that higher BMI amplifies the impact of the studied rare APOB variants on pathologic liver traits. Altogether, genotype and phenotype findings, as well as external validation of the impact of this rare variant, support that this patient has autosomal dominant APOB-related FHBL.
Evaluation of known common variants associated with NAFLD
Given that four patients did not harbour a rare mutation, yet had lean NAFLD, we evaluated common variants associated with NAFLD and fibrosis (PNPLA3 rs738409:p.I148M, GCKR rs1260326:p.P446L, TM6SF2 rs58542926:C/T, and MBOAT7-TMC4 rs641738:C/T) and the protective variant HSD17B13 rs72613567:T/TA. Patient 1, who harboured the rare variant in ALDOB, was also heterozygous for the GCKR and MBOAT7 variants, but otherwise wild type for the evaluated single nucleotide polymorphisms. Patient 2, who harboured the rare variant in APOB, was homozygous for the risk variant in PNPLA3 and heterozygous for the variants in GCKR and MBOAT7. All six lean patients with NAFLD analysed in this cohort were wild type for both the risk allele in TM6SF2 and the protective variant in HSD17B13 (Table S6). Although patient 6 did not ultimately have a causative variant found on WES analysis, they were found to be homozygous for both the PNPLA3 and GCKR polymorphisms and heterozygous for the MBOAT7 polymorphism. Polygenic risk scores incorporating the five variants, calculated as previously described, 13,14 varied among the six patients from 0.063 to 0.725.

Discussion
This study supports the use of WES in the diagnosis and management of lean patients with NAFLD. Two out of six patients (33%) with NAFLD without visceral adiposity were discovered to harbour genetic diseases that explain the underlying pathogenesis of their hepatic steatosis. Furthermore, we validated the pathogenicity of the p.Val1856CysfsTer2 variant in APOB using MRI quantification of liver fat and APOB levels. In the UK Biobank, we found a significant BMI-rare variant interaction on ALT and liver fat, which suggests that adiposity may amplify the effect of rare variants on fatty liver. This finding parallels what has been previously demonstrated for common variants associated with NAFLD. 15
In a prior study demonstrating the clinical utility of genomic analysis in the diagnosis and management of adults with unexplained liver disease, three out of six non-obese patients with hepatic steatosis in the absence of metabolic syndrome were found to harbour monogenic disorders underlying the triglyceride accumulation seen in hepatocytes. 4 Subsequently, we have proposed the incorporation of genomic analysis in a variety of liver diseases that remain unexplained despite a comprehensive work-up, 10,16 and in a recent review, we proposed a framework for evaluating patients with lean NAFLD. Although patients with lean NAFLD with increased visceral adiposity likely resemble the broader population with NAFLD, those without visceral adiposity may harbour rare monogenic variants that lead to a phenotype mimicking NAFLD. This study applies the proposed framework 5 to a well-phenotyped cohort of patients with biopsy-proven NAFLD and demonstrates the utility of evaluating patients with lean NAFLD without visceral adiposity for monogenic disorders with a diagnostic yield of 33%. Leveraging the UK Biobank data, we confirmed the pathogenicity of the p.Val1856CysfsTer2 variant in APOB and demonstrated an interaction between rare variants in APOB and BMI on liver fat and ALT.
Heterozygous, rare, predicted loss-of-function variants in APOB have been described in other patients with cryptogenic cirrhosis and suggested to contribute to severe disease, including predisposition to hepatocellular carcinoma development. 17-21 In this study, the two patients with rare monogenic drivers of disease also had common variants in PNPLA3 and GCKR. Prior studies have demonstrated the need to consider the opposing impact of deleterious and protective variants and demonstrated a similar magnitude of opposing effects of variants in PNPLA3 and HSD17B13 on MRE. 22 When evaluating polygenic risk, consideration of the combination of common and rare variants may refine our understanding of the risk of NAFLD and fibrosis, as has been described in other diseases including cardiovascular disease and breast cancer. 23
This prospective, systematic assessment of patients with biopsy-proven lean NAFLD using WES adds new information about pathogenic and actionable rare variants in patients with lean NAFLD. Although the sample size is limited, this study involves well-phenotyped patients with detailed information on liver histology and advanced MRI, which differentiates it from large population-based studies in which most rare variant association studies of NAFLD have been performed. Furthermore, the detailed clinical evaluation allowed for confirmation of genotype-phenotype associations. External validation of the clinical significance of the rare variant in APOB in the UK Biobank is an additional strength of the study. APOB deficiency should be suspected in patients with NAFLD in the absence of hyperlipidaemia, in whom circulating APOB levels should be examined.
Unveiling the genetic aetiologies of disease in lean patients with NAFLD may lead to more targeted management, genetic screening of family members, and refined disease prognostication, and potentially uncover actionable pathways for drug development. Furthermore, uncovering the heterogeneous molecular drivers of NAFLD and fibrosis may improve future clinical trial design by avoiding enrolment of patients with a subtype of disease unlikely to benefit. 16 In conclusion, in this well-characterised cohort of patients with biopsy-proven NAFLD, 33% of patients with lean NAFLD without visceral adiposity harboured monogenic disorders associated with fatty liver, highlighting the value of genetic assessment of NAFLD to identify distinct subtypes of disease.
[Figure caption, beginning truncated in source] ... Val1856CysfsTer2 variant and all other APOB LOFHC variants corresponding to linear regression models adjusted for age, sex, and the first 10 principal components of genetic ancestry (liver fat models were additionally adjusted for MRI serial number). We combined all APOB LOFHC variants including p.Val1856CysfsTer2 to test for a BMI interaction and noted a significant interaction with both ALT (p = 0.008) and liver fat % (p = 6.3 × 10⁻⁵). ALT, alanine aminotransferase; APOB, apolipoprotein B; liver fat %, image-derived liver fat percentage; LOFHC, high-confidence predicted loss-of-function; MRI, magnetic resonance imaging.

Data availability statement
The datasets generated and/or analysed during the current study are available from the corresponding author on reasonable request in deidentified form.

Inclusion and Exclusion Criteria
Participants meeting any of the following criteria were excluded from the study: significant alcohol consumption (defined as ≥14 drinks/week for men or ≥7 drinks/week for women) within the previous 2-year period; evidence of active substance use. Alcohol intake history was obtained in a clinical setting and verified at the research clinic with the Alcohol Use Disorders Identification Test and the Skinner questionnaire. Other causes of liver disease and hepatic steatosis were ruled out systematically based on history and laboratory tests. Participants were instructed to fast for a minimum of eight hours before collection of laboratory tests.

Whole-exome sequencing and analysis
Germline DNA was extracted from blood samples using standard methods.
Germline DNA was captured using IDT xGen exome V2 exome enrichment probes and sequenced using the Illumina NovaSeq platform. Exome sequencing data were mapped and aligned to the reference human genome build 19 using Burrows-Wheeler Aligner.(1) Variants were called using GATK(2) and annotated using Annovar.(3) Variants that passed initial quality control were excluded if read depth of coverage was < 30 or if they fell within segmental duplications (Figure 1). Protein-altering variants were selected by removing synonymous and intronic/non-coding variants. Variants were selected for minor allele frequency (MAF) of < 0.01 for homozygous and compound heterozygous variants (recessive inheritance) or < 2 × 10⁻⁵ for heterozygous variants (dominant inheritance). MAF was determined using the genome aggregation database (gnomAD).(4) Variants were then prioritized based on predicted deleteriousness, using Combined Annotation Dependent Depletion (CADD)(5) score > 20 for missense variants and SpliceAI(6) score > 0.5 for splice-site variants. Remaining variants were flagged based on an internal list of 264 liver disease-related genes derived from Online Mendelian Inheritance in Man (OMIM) database entries, previously described.(7) Selected NAFLD-associated polymorphisms, namely PNPLA3 rs738409:p.I148M, GCKR rs1260326:p.P446L, TM6SF2 rs58542926:C/T, HSD17B13 rs72613567:T/TA, and MBOAT7-TMC4 rs641738:C/T, were extracted from WES data.
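The filtering cascade described above can be sketched as a single predicate over annotated variants. The Python below is an illustration only, not the authors' pipeline: the `Variant` record, its field names, and the consequence labels are hypothetical rather than Annovar output, and all example values are invented. Two simplifying assumptions are made: protein-altering classes other than missense and splice-site (for example frameshift) pass the deleteriousness step without a score, and compound heterozygosity (which needs per-gene pairing of variants) is not modelled, so every heterozygous call is held to the dominant threshold.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Variant:
    # Hypothetical record layout for illustration.
    gene: str
    consequence: str  # e.g. "missense", "splice_site", "frameshift", "synonymous"
    zygosity: str     # "hom" or "het"
    depth: int
    in_segmental_duplication: bool
    gnomad_maf: float
    cadd: Optional[float] = None
    spliceai: Optional[float] = None


NON_PROTEIN_ALTERING = {"synonymous", "intronic", "non_coding"}
MAF_RECESSIVE = 0.01   # homozygous (recessive model)
MAF_DOMINANT = 2e-5    # heterozygous (dominant model)


def passes_filters(v: Variant, liver_genes: set) -> bool:
    """Apply the depth, consequence, MAF, deleteriousness and gene-list steps."""
    if v.depth < 30 or v.in_segmental_duplication:
        return False
    if v.consequence in NON_PROTEIN_ALTERING:
        return False
    maf_cutoff = MAF_RECESSIVE if v.zygosity == "hom" else MAF_DOMINANT
    if v.gnomad_maf >= maf_cutoff:
        return False
    if v.consequence == "missense" and (v.cadd is None or v.cadd <= 20):
        return False
    if v.consequence == "splice_site" and (v.spliceai is None or v.spliceai <= 0.5):
        return False
    return v.gene in liver_genes


# Toy input: a two-gene stand-in for the 264-gene list, invented values.
liver_genes = {"ALDOB", "APOB"}
toy = [
    Variant("ALDOB", "missense", "hom", 80, False, 0.002, cadd=27.0),
    Variant("APOB", "frameshift", "het", 65, False, 1e-5),
    Variant("APOB", "synonymous", "het", 90, False, 1e-5),
    Variant("TTN", "missense", "het", 22, False, 1e-6, cadd=30.0),
]
kept = [f"{v.gene}:{v.consequence}" for v in toy if passes_filters(v, liver_genes)]
print(kept)  # ['ALDOB:missense', 'APOB:frameshift']
```

Ordering the cheap quality checks first mirrors the order in which the text lists the steps; the gene-list check comes last because the text describes it as a flag applied to the remaining variants.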

UK Biobank Cohort and Analysis Phenotypes
The UK Biobank is an observational study that enrolled over 500,000 individuals between the ages of 40 and 69 years between 2006 and 2010 (8). Alanine aminotransferase (UKB field 30620) and apolipoprotein B (UKB field 30640) measured at the time of enrollment were made available to researchers. Imaging-derived liver fat was derived in 36,703 participants of the UK Biobank as previously described (9). This analysis of data from the UK Biobank was approved by the Mass General Brigham institutional review board and was performed under UK Biobank application #7089.

APOB rare variant validation
We conducted rare variant association studies using the first 200,643 exomes from the UK Biobank.(10) An extensive quality control procedure was applied to these data prior to analysis as described elsewhere (11). Following quality control, 200,337 exomes were available for analysis. To identify rare (minor allele frequency < 0.1%) high-confidence predicted inactivating variants in APOB, we applied the previously validated Loss-Of-Function Transcript Effect Estimator (LOFTEE) algorithm implemented within the Ensembl Variant Effect Predictor (VEP) software program as a plugin, VEP version 96.0 (4). We refer to these variants as "LOFHC".
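As a rough illustration of the LOFHC selection described above, the sketch below filters already-parsed annotation records. It assumes each record is a dictionary carrying VEP's `SYMBOL` field and the `LoF` field added by the LOFTEE plugin (with `HC` marking high-confidence calls); the `maf` and `hgvsp` keys, the function name, and every record in the toy input are hypothetical.

```python
def select_lofhc(annotations, gene="APOB", max_maf=0.001, exclude=()):
    """Keep rare variants in `gene` that LOFTEE labels high-confidence LoF.

    max_maf=0.001 corresponds to the minor allele frequency < 0.1% cutoff.
    `exclude` lists protein-level names to drop, e.g. a variant analysed
    separately.
    """
    return [
        a for a in annotations
        if a["SYMBOL"] == gene
        and a.get("LoF") == "HC"
        and a["maf"] < max_maf
        and a["hgvsp"] not in exclude
    ]


# Toy records with invented frequencies and placeholder variant names.
toy = [
    {"SYMBOL": "APOB", "LoF": "HC", "maf": 0.00004, "hgvsp": "p.Val1856CysfsTer2"},
    {"SYMBOL": "APOB", "LoF": "HC", "maf": 0.00001, "hgvsp": "p.Toy1Ter"},
    {"SYMBOL": "APOB", "LoF": "LC", "maf": 0.00001, "hgvsp": "p.Toy2Ter"},
    {"SYMBOL": "APOB", "LoF": "HC", "maf": 0.02, "hgvsp": "p.Toy3Ter"},
    {"SYMBOL": "MTTP", "LoF": "HC", "maf": 0.00001, "hgvsp": "p.Toy4Ter"},
]
others = select_lofhc(toy, exclude={"p.Val1856CysfsTer2"})
print([a["hgvsp"] for a in others])  # ['p.Toy1Ter']
```

The `exclude` argument reflects the two analyses in the main text: one for p.Val1856CysfsTer2 alone and one for all other APOB LOFHC variants.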

Statistical analysis
All effect sizes are reported from linear regressions adjusted for age (at enrollment for ALT and APOB, at the time of imaging for liver fat %), sex, and the first 10 principal components of genetic ancestry (liver fat analyses were additionally adjusted for MRI serial number). Prior to analysis, one sample of a pair was randomly excluded if that pair had second-degree relative or closer kinship. Carrier counts in Figure 2 are reported following this exclusion and correspond to participants with the studied phenotype available. Analyses were repeated in low (< 25 kg/m²) and high (≥ 25 kg/m²) body mass index (BMI) subgroups. The interaction between APOB rare variant carrier status and BMI was tested using a linear regression including BMI and a carrier status by BMI interaction term along with the above covariates. All analyses were performed using R 3.6.0.
[Table footnote, beginning truncated in source] ...), 175,341 UK Biobank participants were available for the present analysis. Effect sizes and standard errors used to generate 95% confidence intervals were obtained from Firth logistic regression, while p-values were obtained from the SPA test, both as implemented in the R package SPAtest. Models were adjusted for age, sex, and the first ten principal components of genetic ancestry. Note that the number of carriers listed in the table is fewer than those reported in the legend because of (1) removal of related samples prior to analysis and (2) several rare variant carriers having missing apolipoprotein B.
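The carrier status by BMI interaction model described in the statistical analysis has the form outcome = b0 + b1·carrier + b2·BMI + b3·(carrier × BMI) + covariates, with b3 as the interaction term. The study's analyses were run in R; the Python sketch below only illustrates the structure of such a model on noise-free synthetic data, so the fitted coefficients simply recover the values used to generate it. The helper names and all numbers are invented, the ancestry principal components and MRI serial number are omitted for brevity, and no standard errors or p-values are computed.

```python
from itertools import product


def ols(X, y):
    """Ordinary least squares via the normal equations (X'X) b = X'y."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [
        [sum(r[i] * r[j] for r in X) for j in range(k)]
        + [sum(r[i] * yi for r, yi in zip(X, y))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(A[r][c]))
        A[c], A[p] = A[p], A[c]
        for r in range(k):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    return [A[i][k] / A[i][i] for i in range(k)]


def design_row(carrier, bmi, age, sex):
    # Columns: intercept, carrier, BMI, carrier x BMI interaction, age, sex.
    return [1.0, carrier, bmi, carrier * bmi, age, sex]


# Invented generating coefficients, in the column order above.
TRUE = [20.0, 2.0, 0.5, 0.3, 0.1, -1.0]
rows = [
    design_row(c, b, a, s)
    for c, b, a, s in product((0, 1), (20.0, 25.0, 30.0), (45.0, 60.0), (0, 1))
]
y = [sum(t * x for t, x in zip(TRUE, r)) for r in rows]

beta = ols(rows, y)
interaction = beta[3]
print(round(interaction, 6))  # 0.3, the carrier x BMI term used to simulate y
```

A positive interaction coefficient in this parameterisation means the carrier effect on the outcome grows with BMI, which is how the main text interprets its interaction result.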