If you don't remember your password, you can reset it by entering your email address and clicking the Reset Password button. You will then receive an email that contains a secure link for resetting your password
If the address matches a valid account an email will be sent to __email__ with instructions for resetting your password
Cardiovascular Disease Translational Research Programme, Yong Loo Lin School of Medicine, National University of Singapore, SingaporeMontreal Heart Institute, Montreal, Quebec, CanadaDepartment of Medicine, University of Montreal, Quebec, Canada
Department of Hepatopancreatobiliary and Transplant Surgery, National Cancer Centre Singapore and Singapore General Hospital, SingaporeAcademic Clinical Programme for Surgery, Duke-NUS Medical School, Singapore
Department of Hepatopancreatobiliary and Transplant Surgery, National Cancer Centre Singapore and Singapore General Hospital, SingaporeAcademic Clinical Programme for Surgery, Duke-NUS Medical School, Singapore
Department of Hepatopancreatobiliary and Transplant Surgery, National Cancer Centre Singapore and Singapore General Hospital, SingaporeAcademic Clinical Programme for Surgery, Duke-NUS Medical School, Singapore
Department of Hepatopancreatobiliary and Transplant Surgery, National Cancer Centre Singapore and Singapore General Hospital, SingaporeAcademic Clinical Programme for Surgery, Duke-NUS Medical School, Singapore
Cardiovascular Disease Translational Research Programme, Yong Loo Lin School of Medicine, National University of Singapore, SingaporeGenome Institute of Singapore, Agency for Science, Technology and Research (A*STAR), SingaporeCardiovascular Research Institute, Yong Loo Lin School of Medicine, National University of Singapore, Singapore
Department of Hepatopancreatobiliary and Transplant Surgery, National Cancer Centre Singapore and Singapore General Hospital, SingaporeAcademic Clinical Programme for Surgery, Duke-NUS Medical School, SingaporeProgram in Clinical and Translational Liver Cancer Research, Division of Medical Science, National Cancer Center Singapore, Singapore
Epigenetic dysregulation is prevalent in HCC, accompanied by de novo enhancers.
●
Differential enhancers and the associated gene expression changes are heterogeneous.
●
Differential enhancer genes included cellular proliferation and fetal liver markers.
●
Patients with tumour activated gain-in-tumour enhancer genes showed worse prognosis.
Abstract
Background & Aim
Lifestyle and environmental-related exposures are important risk factors for Hepatocellular Carcinoma (HCC), suggesting that epigenetic dysregulation significantly underpins HCC. We profiled 30 surgically resected tumours and the matched adjacent normal tissues to understand the aberrant epigenetic events associated with HCC.
Methods
We identified tumour differential enhancers and the associated genes by analysing H3K27ac chromatin immunoprecipitation-sequencing (ChIP-seq) and Hi-C/HiChIP data from the resected tumour samples of 30 early stage HCC patients. This epigenome dataset was analysed with previously reported genome and transcriptome data of the overlapping group of patients from the same cohort. We performed patient-specific differential expression testing using multi-region sequencing data to identify genes that undergo both enhancer and gene expression changes. Based on the genes selected, we identified two patient groups and performed a recurrence-free survival analysis.
Results
We observed large scale changes in the enhancer distribution between HCC tumours and the adjacent normal samples. Many of the gain-in-tumour enhancers showed corresponding up-regulation of the associated genes and vice versa, but much of the enhancer and gene expression changes were patient-specific. A subset of the upregulated genes was activated in a subgroup of patients’ tumours. Recurrence-free survival analysis revealed that the patients with a more robust upregulation of those genes showed a worse prognosis.
Discussion/ Conclusion
We report the genomic enhancer signature associated with differential prognosis in HCC. Findings that cohere with oncofetal reprogramming in HCC were underpinned by genome-wide enhancer rewiring. Our results present the epigenetic changes in HCC that offer the rational selection of epigenetic-driven gene targets for therapeutic intervention or disease prognostication in HCC.
Lay summary
Lifestyle and environmental-related exposures are the important risk factors of Hepatocellular Carcinoma (HCC), suggesting that tumour-associated epigenetic dysregulations may significantly underpin HCC. We profiled tumour tissues and their matched normal from 30 early stage HCC patients to study the dysregulated epigenetic changes associated with HCC. By also analysing the patients’ RNA-seq and clinical data, we found the signature genes -- with epigenetic and transcriptomic dysregulation – associated with worse prognosis. Our findings suggest systemic approaches are needed to consider the surrounding cellular environmental and epigenetic changes in HCC tumours.
The cancer genome has revealed important insights, as evident from the widely accessed cancer genome databases, including the Cancer Genome Atlas (http://tcga-data.nci.nih.gov/) and the International Cancer Genome Consortium (https://dcc.icgc.org/). The key driver mutations have thus far been identified for different types of cancer
. In addition to genomic disruptions, epigenetic dysregulation is also a key to tumorigenesis, with early efforts primarily focusing on DNA methylation
. Other genome-wide techniques of ChIP-seq and Hi-C now elaborate on enhancer dysregulation and reveal new insights on distinct cancer-type specific enhancer patterns
. Well-known risk factors include chronic infection with HBV or HCV viruses, aflatoxin-contaminated foods, heavy alcohol consumption, and type 2 diabetes
. However, major risk factors for liver cancer appear to be shifting, given a declining prevalence of HBV or HCV and a corresponding increase in the prevalence of excess body weight and diabetes as risk factors for HCC in many countries
. The latter factors support the view that epigenetic dysregulation may significantly underpin HCC and, thus, the importance of studying the epigenome profiles.
Changes in histone marks and gene expression in liver cancer cells have been previously described in vitro
. Immunohistochemical analysis of H3K27 modifications – acetylation or methylation – in HCC showed variability and also a correlation to the degree of cellular de-differentiation and disease prognosis
. However, the enhancer landscape using H3K27ac ChIP-seq and Hi-C has never been assessed using human HCC patient samples. Among the different regulatory elements and histone marks, active enhancers marked by H3K27-acetylation (H3K27ac) are highly cell-type and cell-state dependent
. Therefore, to investigate the individuality and heterogeneity of HCC-related enhancers, we profiled H3K27ac using HCC liver tissues from 30 patients and mapped distal interacting genes by chromatin loop analysis from contemporaneous Hi-C profiles. By correlating the epigenomes of HCC samples to their corresponding genomes and transcriptomes, we analysed the functional and clinical implications of the dysregulated enhancers. A set of signature genes with correlated epigenetic changes proved useful in stratifying patient prognosis.
Results
Multi-omics data collection from early resectable HCC patients
We performed epigenome profiling for genomic enhancers using H3K27ac ChIP-seq and mapped gene-enhancer interactions using Hi-C or H3K27ac HiChIP on tumour and adjacent normal (adj.normal) tissues of 30 surgically resected HCC patients. The patients were part of a larger cohort whose genome and transcriptome data were previously published
. Overall, we used bulk RNA-seq and clinical data from 90 patients, of which 14 had scRNA-seq data and 30 had epigenome profiles. An overview of the multi-omics data collected and the overlap of patient groups between different profiling subsets are summarised in the schematic diagram in Fig. 1A and Table S1.
Fig. 1Differential enhancers in HCC tumour. (A) Schematic diagram of the overall project design. (B) Schematic diagram of how differential enhancers were identified (refer to Methods). (C) Emission and transition probability matrix from ChromHMM. (D) H3K27ac signal profiles at the gained and lost enhancers. Red: normal, blue: tumour. (E) Heatmap of H3K27ac signal from normal and tumour samples at differential enhancers, normalised to the respective control samples. Heatmaps are centred at the midpoint of each enhancer, and extended up and down stream by 5kb. Gained-in-tumour (top) and lost-in-tumour (bottom) enhancer regions are shown. The same heatmaps for all samples can be found in Fig. S1.
Differential enhancer loci are prevalent in HCC tumour
We first constructed a set of consensus enhancer loci for HCC tumours (Fig. 1B). Chromatin state with the highest emission probability of 0.935 for H3K27ac signal (state E5) was chosen as the enhancer chromatin state (Fig. 1C). H3K27ac ChIP-seq signals were then quantified at each locus. The quantified count matrix was then used for differential enhancer testing (see Methods). We identified 1,650 gain-in-tumour enhancer loci that showed a stronger H3K27ac signal in the tumour, compared to adjacent normal (adj.normal) samples (adjusted p-value < 0.05; log2 fold change > 1). Similarly, 1,415 lost-in-tumour enhancer loci were identified (log2 fold change < -1). Fig. 1E shows H3K27ac coverage at differential enhancers for a subset of samples (Combined profiles in Fig. 1D; a heatmap for the other enhancer landscape in Fig. S1). About 23.4% (717/3065) of the differential enhancer loci overlapped with H3K27ac chromatin states of the normal liver by the ROADMAP epigenomics project
. The largest overlap was with the Weak Enhancer state (Table S2), indicating the emergence of HCC-specific and patient-specific enhancer loci present in our dataset. Loci for gain-in-tumour and lost-in-tumour enhancers are listed in Table S3 and Table S4, respectively.
Differential enhancers overlapped with various genomic elements
Overall, the consensus enhancer loci showed similar proportion of overlap to gene promoters (23.23%) and to distal intergenic regions (24.76%) (Fig. 2A). Gain-in-tumour enhancers, however, showed larger overlap to distal intergenic loci while lost-in-tumour enhancers occurred more at the gene promoters. Hence, gained H3K27ac marks accumulated more at distal enhancer loci, whereas lost H3K27ac were localised to promoter enhancer loci, indicating enhancer rearrangement in HCC.
Fig. 2Enhancer rewiring is partly due to genomic mutations. (A) Pie charts showing proportions of different genomic elements in all enhancers (left), gain-in-tumour enhancers (middle), and lost-in-tumour enhancers (right). (B) Types of enhancers and the total number detected. (C) Number of enhancer loci detected across the adj.normal (top) and tumour (bottom) tissues of different numbers of patients. (D) Stacked percentage barchart of types of changes in the TAD boundaries between different pairs of samples, analysed using TADCompare
. “NvsN” – adj.normal to adj.normal, “TvsT” – tumour to tumour. “inter-patient” – between samples of different patients, “intra-patient” – between samples of the same patient. (E) Heatmap of the type of mutations at mutation hotspots identified from MutEnricher
. Left heatmap shows mutation hotspots at the gain-in-tumour enhancer regions, while the right heatmap shows mutation hotspots at the lost-in-tumour enhancer regions. Each row represents a mutation hotspot, corresponding to an enhancer locus. Barplots next to each heatmap represents the number of patients from which the mutation hotspot was detected. A mutation hotspot was considered truncal (red) if it was detected in all tumours of a patient, while mutations occurring in certain tumour sectors of a patient were considered as branch mutations (black). (F) Enriched somatic mutations at the most frequent mutation hotspot detected in part E. Numbers in parenthesis represent the number of tumour sectors in which the mutation was detected from. The enhancer, “chr5:1295700” (denoted in grey bar), overlaps with the TERT promoter (denoted in blue bar) and is a gain-in-tumour enhancer.
Based on Hi-C and H3K27ac HiChIP data, gene-enhancer association was inferred by either (1) the direct occurrence of enhancers at a gene promoter or (2) distal enhancers in contact with a gene promoter, assuming that any changes at a distal enhancer would also affect the promoter in contact with the enhancer locus. There were more distal enhancers than promoter enhancers and strikingly many promoter enhancers were in loop contact with other promoters (Fig. 2B).
Changes at the gain-in-tumour enhancers were more prominent than those at the lost-in-tumour enhancers
The H3K27ac coverage heatmap showed that adj.normal H3K27ac enrichment was faint for gain-in-tumour enhancers while tumour H3K27ac enrichment for lost-in-tumour enhancers was, in contrast, clearly visible (Fig. 1E). The presence of de novo enhancer loci at gain-in-tumour enhancers was further supported by the frequency of unique enhancers observed in tumour tissues (tall bar at the zero column in Fig. 2C, top). On the other hand, only a very small number of enhancers were uniquely detected in adj.normal tissues (Fig. 2C, bottom). Moreover, more chromatin loops were detected across tumour samples (97,255 pairs) compared to 28,116 pairs in adj.normal. A higher number of detected loops in tumour samples also suggested the creation of de novo loops in HCC tumours. Altogether, these results indicate that H3K27ac mark changes are more prominent at gain-in-tumour enhancers with the creation of de novo enhancers and loops.
Few mutation hotspots existed under differential enhancers, and they were highly patient-specific
To investigate whether somatic mutations underlie changes in tumour H3K27ac enrichment, we looked for these at the differential enhancer loci. Whole Genome Sequencing (WGS) data of the 30 patients in the epigenome cohort were used in this analysis. Among all gain-in-tumour enhancers, only ∼20% (346 out of 1,650) were enriched with 450 unique mutation hotspots. Similarly, for lost-in-tumour enhancers, ∼18% (256 out of 1,415) were enriched with 323 unique mutation hotspots (Fig. S3). However, even though the identified mutation hotspots presented a strong statistical base from MutEnricher
, most of them were detected uniquely in only one or two patients and many branch mutations occurred uniquely in only subsets of tumour sectors for some patients (Fig. 2E, Fig. S2B). The high degree of individuality in the enhancers in HCC was also consistent with the high number of patient-specific tumour enhancers (Fig. 2C, bottom). One exception was the somatic mutation detected under a gain-in-tumour enhancer that overlapped with the TERT promoter enhancer (chr5:1,295,113, G>A) (Fig. 2F), which appeared to be the only true mutation hotspot under a differential enhancer. Even though the enhancer locus overlaps with the TERT promoter, the somatic mutation locus is upstream of the promoter. This enhancer also showed a significantly higher level of H3K27ac in tumour samples and many of our patients did not harbour the known TERT activation mutation. Altogether, our observations suggest genetic mutations are not the main drivers of genome-wide enhancer changes in HCC.
Tumour samples showed variable H3K27ac signal at the differential enhancers
We found that K-means clustering (based on H3K27ac signals at gain-in-tumour enhancers) separated the samples into at least 3 groups (Fig. 3C). Any more than the cluster size of 3 resulted in group samples containing one or two patients only. The three “enhancer clusters” (eClusters) A, B, and C showed increasing H3K27ac signals at gain-in-tumour enhancers, and, conversely, decreasing H3K27ac signals at lost-in-tumour enhancers (Fig. 3D). The heatmap of H3K27ac signals at gain-in-tumour and lost-in-tumour enhancers which grouped by eCluster, showed homogenous H3K27ac signals from adj.normal samples while tumour samples showed highly variable H3K27ac signals (Fig. 3A, Fig. 3B). Adj.normal samples from all patients formed the majority of samples in eCluster A. A few tumour samples that showed highly similar H3K27ac pattern to adj.normal samples were also clustered to eCluster A. We also noticed that patients in eCluster C, the more divergent group with H3K27ac signal distinctly different from adj.normal, showed a higher degree of de-differentiation, a higher rate of disease recurrence (Fig. 3E), and shortened time to recurrence (Fig. 3F).
Fig. 3Heterogeneity in the H3K27ac signal at differential enhaners and the associated gene expression changes. (A) Heatmap of H3K27ac ChIP-seq signal at the gain-in-tumour enhancers. Enhancers shown on the rows were ordered using hierarchical clustering. H3K27ac ChIP-seq samples are shown on the columns, ordered by the enhancer clusters identified by K-means clustering. (B) Heatmap of H3K27ac ChIP-seq signal at the lost-in-tumour enhancers. (C) PCA plot of the samples shown in A and B, grouped by the enhancer clusters. (D) Total H3K27ac signal at the differential enhancers (left - gain-in-tumour; right - lost-in-tumour), grouped by the enhancer clusters. (E) Proportion of eClusters assigned to adj.normal and tumour tissues of each patient (top and middle), and the recurrence status for the corresponding patients with tumour tissues sequenced (bottom). All bar plots are grouped by the Edmondson Grade of the patients’ tumour sample. (F) Boxplot of disease-free survival for patients shown in E, grouped by the enhancer clusters. (G) Gene Set Enrichment Analysis results for the genes associated with the gain-in-tumour enhancers (top) and lost-in-tumour enhancers (middle and bottom). (H) Proportion of differentially expressed genes for genes associated with the gain-in-tumour enhancers (top) and with the lost-in-tumour enhancers (bottom). Results obtained from the conventional differential expression testing. (I) Heatmap of categorised expression changes matrix from patient-specific differential expression testing. Top shows the gain-in-tumour associated genes while the bottom shows the lost-in-tumour associated genes. Colors of blue, light yellow, and red each represent the expression changes of up-regulation, no change, and down-regulation.
Enhancers in HCC showed high degree of inter-individual variations
Among the detected chromatin loops in tumour samples, only 0.06% (60 out of 97,255) were shared between all samples while many of the loops were uniquely detected in only one sample (75.0%, 72,960 out of 97,255). We also compared Topologically Associated Domain (TAD) boundaries between adj.normal and tumour samples from two patients. Inter-patient comparisons showed high TAD boundary differences compared to intra-patient comparisons (Fig. 2D). Tumour samples from different patients showed greater TAD boundary differences as compared to the adj.normal samples, indicating a higher degree of divergence created by tumourigenesis.
Notably, a similar observation was made for enhancer heterogeneity. We have performed H3K27ac ChIP-seq on at least two tumour sectors for 12 out of 30 patients. Three patients who had tumour sectors assigned to different eClusters showed intra-tumour heterogeneity (Fig. 3E). Among the 12 patients, 6 patients had 3 tumour sectors sequenced. For these 6 patients, we compared the enhancer loci detected in each sample to the merged consensus enhancer loci. This validated that many consensus enhancer loci were not detected in any of the 6 patients’ tumour samples, again indicating a high degree of inter-individual heterogeneity (Fig. S2).
Genes associated with differential enhancers indicated variable levels of de-differentiation and cellular proliferation
By associating enhancers to gene promoters with chromatin loops from Hi-C and HiChIP data, we identified 1,300 genes associated with gain-in-tumour enhancers and 1,360 genes associated with lost-in-tumour enhancers. There were many single gene promoter to multiple enhancer associations, and vice versa. Differential gene expression analysis was performed between tumour and adj.normal using the bulk RNA-seq data available from the same group of 29 patients. Although stronger H3K27ac signals in tumours did not always translate to a stronger expression of the associated gene, many gain-in-tumour enhancer-associated genes were indeed upregulated in tumour as compared to adj.normal (Fig. 3H). A similar observation was made for lost-in-tumour enhancer-associated genes. Enhancer changes therefore broadly correlated with transcriptional changes and motivated us to analyse the gene-set more closely.
Many differential enhancer-associated genes encoded for transcription factors (TFs, Fig. S5) may explain the genome-wide rearrangement of enhancers. Noticeably, more lost-in-tumour enhancer-associated genes overlapped with cell differentiation marker genes could indirectly support the process of de-differentiation in HCC. This observation was consistent with the enrichment of the HNF4A binding motif underlying gain-in-tumour enhancers (Fig. S4) as HNF4A is a known marker for hepatic progenitor cells
. Gain-in-tumour enhancer-associated genes showed strong enrichment for cell proliferation and regulation of metabolic and biosynthetic processes (Fig. 3G, top). Lost-in-tumour enhancer-associated genes were mainly those of signal transduction, biological adhesion, and epithelial-mesenchymal transition processes (Fig. 3G, middle and bottom). Notably, the patients whose tumour samples were assigned to eCluster 3 which showed the most divergent enhancer pattern from adj.normal, had the poorly differentiated Edmondson Grade 3 tumour and all showed the shortest time to recurrence, compared to the patients whose tumours belonged to eCluster 1 or 2 (Fig. 3E,F).
Both enhancer patterns and expression of the associated genes were heterogeneous between patients
Many associated genes showed overall expression changes concordant with the enhancer changes (Fig. 3H). However, given the inter-individual variability of H3K27ac signals at differential enhancer loci, we hypothesized that transcriptional changes of associated genes might also be patient-specific. We performed per-patient differential expression analysis by following the approach taken by Jeon et al.
. In this approach, the multi-region tumour samples were treated as biological replicates to perform differential expression testing between each patient’s tumour sectors and the matched adj.normal. Among the 30 patients with epigenome profiling, we selected 25 patients who had adj.normal tissue and at least 2 tumour sectors sequenced for RNA-seq. We assigned three categories – -1,0, and +1 – to denote downregulation, no change, and upregulation of the gene in each patient’s tumour. Fig. 3I shows the categorised expression changes in differential enhancer-associated genes for all 25 patients. Overall, differential expression results from conventional analysis (Fig. 3H) and that of the patient-specific analysis (Fig. 3I) were broadly similar. The patient-specific analysis, however, provided granularity and information for each patient on whether a specific gene was dysregulated in the patient’s tumour tissue. We then used the categorised differential expression values for patient stratification in the larger group of 90 patients.
Two examples of genes – SOX4 and GPC3 – in Fig. 4 exemplify the variability and patient specificity in enhancer patterns and gene expression changes. For SOX4, many enhancers linked by chromatin loops were unique to tumour tissues (Fig. 4A, bottom). Only one SOX4 distal enhancer showed a significant H3K27ac increase, and notably the H3K27ac signal was highly variable between patients (Fig. 4C). For some patients (for example, C001 and B011), H3K27ac was barely detectable, while in one patient (HEP262), there was markedly increased enrichment in tumour tissues. This is thus an example of a highly variable enhancer locus that exhibited a high degree of patient-specific heterogeneity. The variability was also seen in SOX4 gene expression. Patient-specific differential expression analyses revealed SOX4 to be upregulated in a subset of the 90 patients, where some of them even showed SOX4 downregulation in their tumour tissues (Fig. 4E). On the other hand, GPC3 is an example of a gene with more consistent dysregulation across patients. GPC3 promoter and distal enhancers showed significant and consistent H3K27ac in tumour samples (Fig. 4B) across all patients (Fig. 4D). GPC3 was also upregulated consistently in a larger group of patients (Fig. 4F).
Fig. 4Patient-dependent H3K27ac signal at the associated enhancers and expression changes of SOX4 (left) and GPC3 (right). (A) H3K27ac ChIP-seq signal at the enhancers associated with SOX4. Differential enhancer status of each enhancer is denoted in yellow. Chromatin loops were inferred from Hi-C/HiChIP libraries, and are colored based on the frequency of detection in Hi-C/HiChIP libraries. (B) Similar plot as A for GPC3. (C) Normalised H3K27ac coverage at the differential distal enhancer of SOX4, for the paired adj.normal (blue, bottom) and tumour (red, top) tissues of 6 patients. (D) Similar plot as C for the differential promoter enhancer of GPC3. (E) Per-patient differential expression testing results for SOX4 across 90 patients. Normalised gene counts for each adj.normal (blue) and tumour (pink) samples are shown on top, while the bottom plot shows the log2 fold change values for those detected as differentially expressed (blue bars: upregulated in tumour, green bars: downregulated in tumour, grey: not significantly different). In both plots, each column represents a patient. (F) Similar plot as E for GPC3.
showed a striking overlap between the gain-in-tumour enhancer-associated genes and marker genes for specific cell types (pancreas ductal cells, fetal liver hepatoblasts, and EpCAM-positive bile duct cells) (Fig. 5A). Linear regression analysis showed that some genes, such as GPC3, MSI1, PEG10, and ELOVL2, also showed a strong positive correlation to serum AFP levels in corresponding patients (Fig. 5D, Fig. S7). By contrast, other genes, such as UGT2B7 and PROX1, showed a negative correlation. The correlation was present only in tumour tissues, whereas gene expression across adj.normal tissues was largely unchanged. Variable gene expression in tumours may be partly associated with the extent of fetal-like transformation taking place in the liver.
Fig. 5Gain-in-tumour associated genes show enrichment for fetal liver genes. (A) GSEA result of gain-in-tumour enhancer-associated genes against cell type signature gene set C8 from MSigDB
. (B) Heatmap of H3K27ac ChIP-seq signal for the gain-in-tumour enhancers associated with the fetal liver-expressed genes, ordered by the enhancer cluster. (C) Hierarchical clustering of different liver tissues based on the amount of cells expressing fetal liver and gain-in-tumour enhancer-associated genes. Genes were further filtered based on the availability in the scRNAseq data. (D) Selected pool of genes which showed significant correlation between its expression level and the serum AFP level. More genes are shown in Fig. S6.
Among all the gain-in-tumour enhancer-associated genes, we looked at 46 fetal liver hepatoblast markers (Table S5). Gain-in-tumour enhancers associated with these 46 genes indeed showed increased H3K27ac enrichment in tumour tissues, especially those in eCluster C (Fig. 5B). Since these genes with corresponding enhancer changes were the hallmark genes for fetal liver, and some showed significant correlation to serum AFP levels, we termed this gene-set as “epigenetic oncofetal genes”. We hypothesize that when HCC cells are de-differentiated through the emergence of cancer stem cells or hepatic progenitor cells, “epigenetic oncofetal genes” are upregulated, whereas they are not normally expressed in a non-diseased liver. This hypothesis also concords with the preceding observation of HNF4 binding motif enrichment at gain-in-tumour enhancer loci. Moreover, gain-in-tumour enhancer-associated genes were related to processes of proliferation and de-differentiation.
We explored whether H3K27ac and corresponding gene expression variability in tumour tissues was due to a differing proportion of poorly differentiated cells or hepatic progenitor cells. Single-cell RNA-seq of 14 patients were analysed for the expression of the epigenetic oncofetal genes, as well as SOX4 and HNF4A. From the scRNA-seq, we focused on cells identified as either endothelial cells or hepatocytes. We fitted a mixture model to select genes that showed bimodal expression and used gene-specific thresholds to binarise the gene expression values. The proportion of cells expressing a gene of interest was then calculated by taking the ratio of cells expressing the gene to the total number of cells. Using hierarchical clustering, we clustered the samples that comprised 1 adult non-diseased liver tissue, 2 fetal liver tissues, and adj.normal and tumour tissues from 14 HCC patients based on the proportion of the cells expressing the epigenetic oncofetal genes (Fig. 5C). Clustering revealed two groups of samples, one that was non-diseased liver-like and the other fetal liver-like. All but one adj.normal sample was grouped into the non-diseased liver-like group, so we termed the first group as “adj.normal-like”. Many adj.normal-like liver tissues showed a low proportion of cells expressing epigenetic oncofetal genes. The other group, on the other hand, showed a distinctly higher proportion of cells expressing these epigenetic oncofetal genes. Genes such as GPC3 and PEG10 were expressed in the majority of cells in both fetal-liver tissues, and also in some patient tumours. In Sharma et al.
, the authors in fact selected P8 and P15 patient tumour tissues for validation and showed experimentally that these tumours exhibited oncofetal reprogramming of endothelial cells together with an immune escape from the surrounding T-cells. In our analysis here, both P8 and P15 patient tumours (P8_Tumor and P15_Tumor) clustered with fetal liver tissues (Fig. 5C). We, therefore, termed the second group as “oncofetal-like”. This subgroup of tumours showed a high number of cells expressing epigenetic oncofetal genes, with a similar expression pattern to that of fetal-liver tissues. The number of cells expressing SOX4 was variable across all tissues, but HNF4A was markedly positive in many cells from the oncofetal-like liver tissues. It is also noteworthy to mention that Domcke et al.
also reported that the genomic loci with chromatin accessibility specific to the fetal hepatoblast cells were enriched with the HNF4A motif. Altogether, our integrated analysis with scRNA-seq highlights that epigenetic oncofetal genes are indeed associated with HCC tumours that are fetal liver-like, with characteristics of oncofetal reprogramming.
Gain-in-tumour enhancer-associated genes that were patient subgroup-activated showed prognostic value
The patient-specific differential expression analyses revealed that tumour-related genes were dysregulated to varying degrees in different patients. Some genes, possibly related to common cancer pathways, were shared in a large group of patients, while other genes were only dysregulated in a small subgroup of patients. Our earlier work which investigated this observation in more detail had shown that such subgroup-activated genes were clinically relevant
. Since the list of signature genes that we identified based on enhancer changes harbored many proliferation, de-differentiation, and oncofetal reprogramming related genes, we hypothesized that subgroup-activated genes among them must present a strong correlation to recurrence. We hence applied these genes to patient stratification and disease-free survival analysis.
Given the earlier observations of the genes relating to de-differentiation and developing fetal-like characteristics, we hypothesized that the differential activation pattern of some of the gain-in-tumour enhancer-associated genes might be associated with different rates of HCC recurrence. We obtained the categorised expression changes matrix for 90 patients from their multi-sector bulk RNA-seq data and selected the differential enhancer-associated genes (Fig. S6A). These genes varied from being upregulated in just a few patients to more commonly across the patients (Fig. S6B). With the focus on gain-in-tumour enhancer-associated genes, we filtered out genes that were upregulated in only a small group of patients as these may be related to individual patient characteristics, rather than a shared tumour profile. We finally narrowed the gain-in-tumour enhancer-associated genes down to 185 genes that were upregulated in at least 30% of patients. The categorised expression changes matrix showed that even though these genes were upregulated in many patient tumours, some genes still showed downregulation in some patients (Fig. 6A top, Fig. S6A). We employed co-clustering and identified two groups of patients (“pClusters”) – PC0 and PC1 – based on the categorised expression changes matrix of the 197 genes (Fig. 6A, Table S7). PC1 patients showed more gene upregulation in their tumour tissues, compared to their adj.normal tissues. The recurrence-free survival analysis revealed that PC1 patients showed a significantly worse prognosis with a shorter time to recurrence (Fig. 6B). Median time to recurrence was 1107 days for PC0 patients and 659 days for PC1 patients. Comparing clinical parameters of PC0 and PC1 patients showed significant differences in the degrees of fibrosis and steatosis and hepatitis B status (Table 1). PC1 group consisted of more patients with chronic hepatitis B infection, while the PC0 group showed more patients with high levels of steatosis (Fig. 6C). Our earlier observation showed high Edmondson Grade patients to have high H3K27ac intensity at gain-in-tumour enhancers in their tumour tissues. Even though the Edmondson Grade was not significantly different between PC0 and PC1 (Table 1), a cumulative incidence plot between patient cluster and Edmondson Grades showed that PC1 patients had a higher rate of recurrence, regardless of the Edmondson Grade (Fig. 6D). This suggests that detectable molecular characteristics stratify those with higher recurrence rate, regardless of their histological assessment of cellular de-differentiation. This observation may be especially valuable for patients with Edmondson Grade 1 and 2, even though they are usually considered to have low recurrence risk in current clinical practice.
Fig. 6The patient-specific upregulation pattern of gain-in-tumour enhancer-associated genes shows prognostic values. (A) Heatmap of patient specific differential expression status for genes upregulated in at least 30% of the patients. Bottom heatmap shows the effect of co-clustering and the columns are colored by the assigned patient strata. (B) Disease-free survival plot and the risk table for the two patient clusters identified from co-clustering (Tarone-Ware test p-val: 0.038). (C) Distribution of clinical parameters that were significantly different between the two patient clusters. The full breakdown of the clinical tables is shown in Table 1. (D) The cumulative event plots for patients with different Edmondson Grades, grouped by the patient clusters. (E) Volcano plot of differential expression testing between PC0 and PC1 from RNA-seq data. (F) Heatmap of tumour-normal difference expression values of the epigenetic oncofetal genes from TCGA-LIHC RNA-seq data, grouped by the patient clusters identified from co-clustering. (G) Progression-free survival plot and the risk table for the two patient clusters identified from TCGA-LIHC data (Tarone-Ware test p-val: 0.035).
Table 1Clinicopathological table for patient clusters PC0 and PC1. Full clinical variables table is included in Table S6. Level of significance: p <0.05 (Chi-square test and Wilcoxon test).
To identify differences in patient tumour tissue profiles, we compared tumour tissues between PC0 and PC1 patients through differential expression testing of their RNA-seq data. Many upregulated genes included hallmark genes of glycolysis, MYC targets, and cell cycle-related genes (Fig. S8A). PC1 patient tumours also highly expressed genes related to liver cancer metastasis and subgroups of HCC with stem cell characteristics. Downregulated genes in PC1 patient tumour tissues included those related to the regulation of the immune system process, defense response, interferon-gamma response, and biological adhesion (Fig. S8B). Genes related to glucose metabolisms and cell cycles, such as G6PD and CDK1 were upregulated in PC1 patients, while genes related to immune response (IFNG and CXCL6) were downregulated (Fig. 6E). The results indicate that our signature genes were indeed able to identify patient subgroups with more proliferative tumour tissues and greater potential to metastasize, possibly with suppressed immune response and metabolic processes.
Further studies are required to investigate the molecular mechanism behind epigenetic dysregulation and HCC. One possible mechanism, however, could involve the differential recruitment of transcription factors. For the transcription factors with motifs enriched in gain-in-tumour enhancers (Fig. S4), we studied the expression level changes of the predicted target genes in each pClusters (PC0 and PC1) (Table S9). More patients in PC1 showed up-regulated target genes than those in PC0, indicating that the transcription factors may be the connection between the enhancer dysregulation and the downstream transcriptional changes, with clinical relevance in prognosis.
The epigenetic oncofetal genes identify patients with a more aggressive HCC subtype
We compared our 185-gene panel used in patient stratification to previously reported genes related to the HCC subtypes of different prognoses from Hoshida et al.
, and only one poor survival gene, ELOVL2, were found in our 185 genes. Despite the minimal overlap between the gene lists, the expression changes in the pattern of survival genes from Hoshida et al. were in broad agreement for our two identified patient clusters, PC0 and PC1 (Fig. S9). Many positive survival genes were downregulated across PC1 tumour tissues. The expression changes of poor survival genes were less distinguished between the two patient clusters. Importantly, we again noted that expression changes were variable and patient-specific.
To compare our tumour subtypes to previously reported subtypes by Boyault et al.
, we tabulated the somatic mutation and gene expression features of the tumours in each patient group (Table S8). Upregulated genes in PC1 tumours – which showed a worse prognosis – showed enrichment of the BOYAULT_LIVER_CANCER_SUBCLASS_G3_UP gene set (Figs. S8 and A). PC1 tumours also showed a higher percentage of TP53 mutations (Table S8), which was one of the mutational features of the G2 and G3 subclasses by Boyault et al.
had indeed reported that the G3 subclass showed the worst prognosis, even though the difference did not reach statistical significance. It shows that the subtypes we discovered may have a more selective prognostic value. About Hoshida’s subtypes
, we believe PC1 tumours correspond to Hoshida’s S2 subtype, which was highlighted with a high AFP level, poor differentiation, and poor survival. Similarly, PC1 tumours showed association to EPCAM positive cells (Fig. 5A), a high level of proliferation (Fig. 3G, top), a high level of serum AFP (Fig. 5D), and upregulation of MYC, PI3K-AKT, and E2F1 targets (Figs. S8 and A). All of them corresponds to the features of the S2 subtype described by Hoshida et al.
. Altogether, we believe our 185-gene panel has a higher specificity in identifying more aggressive HCC subtypes, and our findings show that the dysregulation of these genes is associated to the epigenetic changes that happen in the liver.
Finally, we validated our 185-gene panel using the TCGA-LIHC dataset by stratifying HCC patients into clinically-relevant groups. Since the TCGA-LIHC dataset does not have similar multi-region sampling data that was used in our stratification, we stratified TCGA patients based on tumour-normal differences calculated using the average expression values among the normal samples. Our gene panel separated TCGA-LIHC patients into 2 clusters, cluster A and cluster B, consisting of 100 and 224 patients, respectively (Fig. 6F). From the normalised gene expression heatmap (Fig. 6G), a set of genes showed noticeably higher expression in cluster A patients than cluster B patients. The top 6 differentially expressed genes are GPC3, ACSL4, PEG10, PRAP1, AKR1C2, and AKR1C3 (Fig. S10). The progression-free survival analysis between the two patient clusters showed that indeed the patients in cluster A (higher upregulation) showed worse prognosis (Fig. 6G), similar to our earlier results (Fig. 6B). The results show that our gene panel has the selection power for stratifying patients into differential prognosis groups, as proven in both our dataset as well as in the TCGA-LIHC dataset.
Discussion
Our systematic epigenome analysis of HCC reveals a highly variable enhancer distribution in HCC that is important, and to a large degree, patient-specific. Functional analysis of genes associated with differential enhancers and the clinical profile of patients with different enhancer signals showed that the different extent of cellular proliferation and de-differentiation of tumour cells appear to be key factors contributing to the heterogeneity.
There was minimal overlap between somatic mutation hotspots and differential enhancer loci, indicating that genetic perturbation appears not to be the main driver of HCC enhancer dysregulation. The potential cause for H3K27ac landscape changes may instead be metabolic changes in the tumour microenvironment. Our results showed that some of the most striking differences between PC0 and PC1 patients were the hallmarks of glycolysis and PI3K-AKT-MTOR signalling genes. These may reflect the Warburg effect in HCC tissues to varying degrees among patients. A shift in glucose metabolism has been reported in HCC
. Further studies are required to assess this possibility.
Oncofetal reprogramming of endothelial cells in HCC and the emergence of fetal-liver-like characteristics in the tumour microenvironment of HCC tissues have been described in detail by Sharma et al.
. We confirmed a set of epigenetic-driven oncofetal genes, which were expressed by a higher proportion of cells in oncofetal-like tissues, indicating that the development of fetal-liver-like features in some HCC liver tissues is likely accompanied by the de novo (gain-in-tumour) enhancers. As the epigenetic oncofetal genes are gain-in-tumour enhancer-associated genes known to fetal liver hepatoblast markers, our findings are concordant with the re-emergence of fetal liver marker genes in some HCC patients together with underpinning enhancer rewiring. Furthermore, we have shown that some of the epigenetic oncofetal genes were informative in identifying patient groups with differential prognosis.
Altogether, we present an integrative overview of the epigenomic and transcriptomic dysregulation in HCC, with an emphasis on patient-dependent heterogeneity with direct clinical relevance. Prior to developing HCC, patients often show progressive stages of various chronic liver diseases. Recent studies suggest that the liver undergoes dynamic and aberrant epigenetic changes accompanying metabolic changes
suggest the potential effectiveness of epigenetic drugs such as bromodomain inhibitors in chemoprevention. Our findings add to an ongoing approach that presents a paradigm shift in cancer research, from the convention of identifying specific targets for intervention, to more systemic approaches that consider the tumour and the surrounding cellular environment in developing treatment strategies. For widespread and systemic genome-wide dysregulation, epigenetic therapies that can target genome-wide loci could be beneficial. Indeed, a panel of genes appears to undergo coherent epigenetic and transcriptional dysregulation in a subgroup of HCC patients. Understanding the variability and developing methods to identify patient groups with different epigenomes will be important for the potential application of personalised cancer therapeutics.
Methods
Patient recruitment, sample preparation and sequencing
Thirty patients were recruited from three local hospitals (National Cancer Centre Singapore, Singapore General Hospital, and National University Hospital) collaborating under the auspices of the Asia-Pacific Hepatocellular Carcinoma (AHCC) trial group
. The inclusion and exclusion criteria used during the patient recruitment are shown in Table S10. This included 12 patients enrolled in the Translational and Clinical Research (TCR) Flagship Programme: Precision Medicine in Liver Cancer across an Asia-pacific NETwork (PLANet; NCT03267641), funded by the Singapore National Medical Research Council (NMRC). This study has been approved by Singhealth Centralised Institutional Review Board (2016/2626 and 2018/2112). Informed consent was taken from each patient before enrollment. These patients had undergone liver resection and grid sampling with multi-region sampling was performed on the tumours as previously described
while Hi-C and HiChIP were performed using the Arima Hi-C Protocol described in the Arima Hi-C Kit (Material Part Number: A410110 Document Part Number: A160162 v00). The details of the library preparation were provided in Supplementary Methods 1.0.
Sources of data from previously published work
Of the 90 patients who contributed transcriptome data, the genome and transcriptome data of 44 patients have been previously reported by Zhai et al.
. Loops were called with HiCCUPS at 10kB resolution. Contact domains were called at 50kB resolution, and differential Topologically Associated Domains (TAD) were analysed using TADCompare
was used to identify enhancer loci with significant differences in H3K27ac ChIP-seq coverage between the tumour and adj.normal samples (Supplementary Methods 3.0). Adjusted p-value less than 0.05 and the absolute value of log2 fold change greater than 1 were used to identify differential enhancers.
Promoter-enhancer association
We filtered enhancer loci based on either, (1) the enhancer locus overlaps with the promoter region (“promoter enhancer”), or (2) any of its interacting locus based on detected chromatin loops overlaps with the promoter region (“distal enhancer”). The promoter regions were inferred from the known transcription start site (TSS) loci (downloaded from UCSC genome browser Table Browser
for all known genes in hg38) to 500bp upstream. For the first category, we used bedtools intersect to identify enhancer loci occurring directly at the promoters. For the second category, we merged all loops called from Hi-C and HiChIP data in BEDPE, intersected first with the enhancer loci BED, then again with the promoter regions using bedtools pairToBed. The transcript ID assigned to each promoter locus was used to map the gene to the enhancer.
Visualisation of ChIP-seq and Hi-C data
Normalised ChIP-seq signal with the respective input control was obtained using deepTools bamCompare
in R was used to plot coverage and chromatin loops. Computematrix and plotHeatmap from deepTools were used to generate ChIP-seq heatmaps. For visualisation, the heatmaps were limited to those enhancer loci with a width below 10kbp. The midpoint of each enhancer locus was taken and extended by 5kbp up- and downstream for heatmap generation.
Pie charts in Fig. 2A were generated using the ChIPseeker package in R
. We first filtered cells in clusters identified as hepatocytes and endothelial cells. We binarised the normalised expression counts by fitting a bimodal distribution. We followed the previously published work for the overall methodology for bimodal gene detection
was used to fit the multimodal distribution. For each gene, the binarising threshold was determined by the expression value where the minimum point of density occurs. For each patient, we then calculated the percentage of cells expressing the gene of interest based on the binarised gene expression counts.
DNA binding motif analysis at differential enhancer loci
We performed DNA binding motif analysis at differential enhancer loci using the findMotifsGenome function in HOMER
, with the consensus enhancer loci as the background loci and the following parameters: “-size 800 -preparse -bg -noweight”. The hTFtarget database was used to infer the target genes corresponding to the motif
Patient stratification using per-patient differential expression values
Log fold-change matrix is obtained from per-patient analysis and simplified into a categorical matrix with 3 values -- -1, 0, or 1 -- representing the down-regulation, no change, or up-regulation of the gene in the tumour samples compared to the patient's normal samples respectively. Co-clustering of this categorical matrix is performed using the blockcluster R package, which stratifies the patients into 2 patient clusters.
. All p-values are obtained using the Tarone-Ware test. Disease-free survival analysis was done using recurrence-free survival days in our dataset, while progression-free interval (PFI) days were used for the TCGA dataset, as recommended by TCGA guidelines.
Data availability
The Hi-C/HiChIP, ChIP-seq and scRNA-seq data analysed or generated during the current study are available in the Gene Expression Omnibus (GEO) with accession numbers GSE212055, GSE212342, and GSE156337, respectively. The DNA and RNA sequencing data are deposited in the European Genome-Phenome Archive (EGA) under accession EGAD00001009041. All data supporting the findings of this study are available within the article and its Supplementary Materials, or from the corresponding author upon request.
Acknowledgements
We thank all patients involved in this study. We thank members of NCCS, SGH, NUH, GIS, and the Liver TCR group for the all the work and effort involved. This work was supported by Singapore National Medical Research Council Grants (TCR/015-NCC/2016, CSA-SI/0018/2017, and CIRG18may-0057).
Abbreviation
adj.normal
Adjacent normal
AHCC
Asia-Pacific Hepatocellular Carcinoma
ChIP-seq
Chromatin immunoprecipitation-sequencing
eClusters
Enhancer clusters
EGA
European Genome-Phenome Archive
GEO
Gene Expression Omnibus
H3K27ac
H3K27-acetylation
HBV
Hepatitis B virus
HCC
Hepatocellular Carcinoma
HCV
Hepatitis C virus
NMRC
National Medical Research Council
PC
Patient cluster
PLANet
Precision Medicine in Liver Cancer across an Asia-pacific NETwork
RNA-seq
RNA sequencing
scRNA
seq single-cell RNA-sequencing
TAD
Topologically Associated Domains
TCGA-LIHC
The Cancer Genome Atlas Liver Hepatocellular Carcinoma
TCR
Translational and Clinical Research
TSS
Transcription start site
Appendix A. Supplementary data
The following is/are the supplementary data to this article: