Research Paper Volume 15, Issue 8 pp 3141—3157

Profiling and integrated analysis of transcriptional addiction gene expression and prognostic value in hepatocellular carcinoma

Xiaowei Du1, , Hao Wang2, , Jing Xu2, , Yufei Zhang2, , Tingsong Chen2, , Gao Li2, ,

Received: January 6, 2023       Accepted: April 15, 2023       Published: April 22, 2023      

https://doi.org/10.18632/aging.204676

Copyright: © 2023 Du et al. This is an open access article distributed under the terms of the Creative Commons Attribution License (CC BY 3.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Abstract

Transcriptional dysregulation caused by genomic and epigenetic alterations in cancer is called “transcriptional addiction”. Transcriptional addiction is an important pathogenic factor of tumor malignancy. Hepatocellular carcinoma (HCC) genomes are highly heterogeneous, with many dysregulated genes. Our study analyzed the possibility that transcriptional addiction-related genes play a significant role in HCC. All data sources for conducting this study were public cancer databases and tissue microarrays. We identified 38 transcriptional addiction genes, and most were differentially expressed genes. Among patients of different groups, there were significant differences in overall survival rates. Both nomogram and risk score were independent predictors of HCC outcomes. Transcriptional addiction gene expression characteristics determine the sensitivity of patients to immunotherapy, cisplatin, and sorafenib. Besides, HDAC2 was identified as an oncogene, and its expression was correlated with patient survival time. Our study conclusively demonstrated that transcriptional addiction is crucial in HCC. We provided biomarkers for predicting the prognosis of HCC patients, which can more precisely guide the patient’s treatment.

Introduction

Primary liver cancer is the sixth most commonly diagnosed cancer worldwide [1]. China is a high-risk area for liver cancer, with the second-highest incidence rate among all malignant cancers [2]. Hepatocellular carcinoma (HCC) accounts for 75-85% of all liver cancers. There are several drug strategies for HCC, including chemotherapy, targeted therapy, and immunotherapy, but the drug response varies significantly among cancer patients [3, 4]. Considering the tumor heterogeneity and multiple risk factors, biomarkers for HCC should be individualized and diversified.

The dysregulation of transcription is crucial for tumorigenesis and development [5]. Transcriptional dysregulation caused by genomic and epigenetic alterations is called “transcriptional addiction” [6]. Transcriptional addiction in cancer cells can be overly dependent on various components of the transcriptional process [7]. Therefore, small molecule inhibitors targeting transcriptional processes may be more effective against these cancer cells [5]. Targeting transcriptional cyclin-dependent kinases and small molecule proteins is a promising treatment strategy that could contribute to the rational design of cancer combination therapies.

The oncogenic effect of transcriptional addiction has been demonstrated in several cancers. The efficacy of CDK7 inhibitors depends on the selective dependency of tumors on CDK7 kinase activity. For instance, CDK7 inhibitors could selectively target tumor cells and suppress the growth of breast tumors [7]. The repression of CDK7 may promote marked transcriptional dysregulation in ovarian cancer, which is not recognized as an actionable vulnerability [8]. Identifying transcriptional addiction genes may aid in developing HCC biomarkers and therapeutic strategies.

The present study systematically analyzed the expression levels and prognostic value of transcriptional addiction genes in HCC. Clustered subgroups and risk signatures provided predictable biomarkers of patients’ survival. Among these genes, HDAC2—an oncogene—was associated with patient survival. These findings substantiate the great potential of transcriptional addiction in HCC.

Results

Identification of differentially expressed genes

We identified a series of genes associated with cancer transcriptional addiction (n=38) through a focused literature review [639] (Table 1). We refer to these genes as transcriptional addiction genes.

Table 1. Literature review identified the transcriptional addiction genes.

Cancer typeGene list
Acute myelocytic leukemia [6, 9-15]DOT1L, BRD4, MEF2D, IRF8, KMT2A, ZMYND8, MEF2D, CDK9, ZFP64
T cell leukemia [6]CDK7, BRD4
Multiple myeloma [6]BRD4
Glioma [6, 16]PDGFRA, CDK7
Melanoma [7]YAP, TAZ
Neuroblastoma [17-19]TBX2, FOXM1, CDK7, CDK12,
Liposarcoma [20]FOSL2, MYC, RUNX1, SNAI2
Bone and soft tissue sarcomas [21]CDK7
Gastrointestinal Stromal Tumor [22]FOXF1, ETV1
Thyroid cancer [23-25]RUNX2, FOXC1, CDK7, PPP1R15A
Esophageal cancer [19, 26]HDAC1, HDAC2, TP63, SOX2, KLF5
Breast cancer [7, 18, 27-30]YAP, TAZ, BRD4, Skp2, EN1, TRPS1, CDK7, CDK12
lung cancer [18, 31-33]CDK7, SOX2, MYC, CDK12
Liver cancer [7]YAP, TAZ, SOX9
Gallbladder Cancer [34]CDK7
Pancreatic cancer [35]CDK7
Colon cancer [36]P53, CDK7
Bladder cancer [18, 37]CDK7, CDK12
Ovarian Cancer [8]CDK7
Prostate cancer [38, 39]CDK7, CDK9, CDK8, CDK19, CDK12, CDK13, MED1

We identified DEGs in HCC from three platforms. In the TCGA-LIHC set, 33 genes (86.8%) were differentially expressed in tumor tissues and paraneoplastic tissues (p < 0.05) (Figure 1A). We identified 33 and 26 DEGs in the ICGC-JP set and GEO-GSE14520 set, respectively (Figure 1B, 1C). The analysis results were consistent across the three platforms, with most genes overlapping (Figure 1D).

The differentially expressed genes (DEGs) were screened in different sets. (A) DEGs (n = 33) in TCGA-LIHC set. (B) DEGs (n = 33) in the ICGC-JP set. (C) DEGs (n = 26) in GEO-GSE14520 set. (D) Venn diagram for the overlapping genes.

Figure 1. The differentially expressed genes (DEGs) were screened in different sets. (A) DEGs (n = 33) in TCGA-LIHC set. (B) DEGs (n = 33) in the ICGC-JP set. (C) DEGs (n = 26) in GEO-GSE14520 set. (D) Venn diagram for the overlapping genes.

Transcription factor co-expression network and gene function analysis

There was a strong correlation (|R2| > 0.5 and p < 0.05) between the expression of 14 DEGs and tumor transcription factors (Figure 2A). The top three DEGs were KMT2A, MED1, and HDAC2. The expression of these 14 DEGs in cancer tissues was higher than that in paracancerous tissues. There were 314 transcription factors screened using the correlation analysis. We plotted the co-expression network using the Cytoscape v3.8.1 software (Figure 2B).

Transcription factor co-expression network and gene function analysis for TCGA-LIHC set. (A) The bar chart shows the number of transcription factors related to DEGs. (B) Co-expression network of 14 DEGs and 314 transcription factors. (C) Functional annotation for transcriptional addiction genes using GO term enrichment analysis, according to the top five biological processes, the top five cellular components, and the top five molecular functions. (D) The top 15 KEGG enrichment pathways of transcriptional addiction genes.

Figure 2. Transcription factor co-expression network and gene function analysis for TCGA-LIHC set. (A) The bar chart shows the number of transcription factors related to DEGs. (B) Co-expression network of 14 DEGs and 314 transcription factors. (C) Functional annotation for transcriptional addiction genes using GO term enrichment analysis, according to the top five biological processes, the top five cellular components, and the top five molecular functions. (D) The top 15 KEGG enrichment pathways of transcriptional addiction genes.

Functional enrichment analysis was performed based on 33 DEGs from the TCGA-LIHC set. The GO dataset enriched the regulation of stem cell differentiation, transcription initiation from RNA polymerase II promoter, and DNA−templated transcription (Figure 2C). As for the KEGG dataset, viral carcinogenesis, transcriptional misregulation in cancer, and microRNAs in cancer were enriched (Figure 2D). Transcription-related functions were enriched in both datasets.

Clustered subgroups and nomograms for predicting patient survival in the TCGA-LIHC set

For the best clustering effect, we chose k = 2 for the subsequent analyses (Figure 3A, 3B). The different clustering subgroups revealed a two-way distribution through PCA analyses (Figure 3C). According to the survival analysis, the patients in subgroup 1 exhibited higher one-, three-, and five-year overall survival rates than those in subgroup 2 (log-rank test p < 0.01) (Figure 3D).

Clustered subgroups of DEGs in TCGA-LIHC set. (A, B) K = 2 was chosen to construct the clustered subgroups. (C) Principal component analysis of transcriptional addiction genes to distinguish different clustered subgroups. (D) Survival analysis for the gene clusters based on 365 patients from TCGA-LIHC set.

Figure 3. Clustered subgroups of DEGs in TCGA-LIHC set. (A, B) K = 2 was chosen to construct the clustered subgroups. (C) Principal component analysis of transcriptional addiction genes to distinguish different clustered subgroups. (D) Survival analysis for the gene clusters based on 365 patients from TCGA-LIHC set.

The independent variables used to construct the nomogram include age, gender, grade, stage, and cluster (Figure 4A). The calibration curve and C-index confirm the high predictive accuracy and specificity of the nomogram (Figure 4B, 4C). In univariate/multivariate Cox regression analyses, the nomogram was an independent predictor of HCC outcome (HR > 1 and p < 0.001) (Figure 4D, 4E).

Nomogram for predicting patient survival in TCGA-LIHC set. (A) The nomogram combines the cluster and clinicopathological characteristics for predicting patient survival outcomes. (B) The calibration curve validates the sensitivity of the nomogram in 1, 3, and 5 years. (C) The C-index validates the specificity of the nomogram compared to the clinicopathological characteristics. (D, E) Univariate and multivariate regression analyses for the nomogram as an independent prognostic factor.

Figure 4. Nomogram for predicting patient survival in TCGA-LIHC set. (A) The nomogram combines the cluster and clinicopathological characteristics for predicting patient survival outcomes. (B) The calibration curve validates the sensitivity of the nomogram in 1, 3, and 5 years. (C) The C-index validates the specificity of the nomogram compared to the clinicopathological characteristics. (D, E) Univariate and multivariate regression analyses for the nomogram as an independent prognostic factor.

Identification of an 11-gene risk signature for predicting patient survival

The 11-gene risk signature was constructed using the training set (Table 2). Each patient’s risk score can be calculated using the following formula. Risk score = (-0.908 * expression value of EN1) + (0.214 * expression value of ETV1) + (0.238 * expression value of HDAC2) + (-0.058 * expression value of IRF8) + (0.017 * expression value of MYC) + (0.020 * expression value of SNAI2) + (0.030 * expression value of TAZ) + (-0.073 * expression value of TP53) + (-0.218 * expression value of TP63) + (-0.057 * expression value of YAP1) + (0.498 * expression value of ZFP64). There were two risk categories for patients: high-risk and low-risk. A cutoff point was determined based on the training set’s median risk score.

Table 2. Each genetic parameter of the risk signature.

mRNACoefficientHR95%CIP
EN1-0.9080.4030.178-0.9130.029
ETV10.2141.2391.063-1.4440.006
HDAC20.2381.2691.152-1.3970.001
IRF8-0.0580.9440.887-1.0040.068
MYC0.0171.0171.005-1.0300.008
SNAI20.0201.0200.997-1.0430.094
TAZ0.0301.0311.001-1.0610.043
TP53-0.0730.9300.879-0.9830.011
TP63-0.2180.8040.600-1.0770.143
YAP1-0.0570.9440.908-0.9820.004
ZFP640.4981.6451.295-2.0910.001

In all sets, there was a greater median overall survival time among the patients in the low-risk group compared to those in the high-risk group (p < 0.01) (Figure 5A5C). In the TCGA-LIHC and ICGC-JP cohorts, the area under the curve (AUC) for the risk score was higher than that for the other clinical factors (Figure 5D, 5E). In the GEO cohort, the AUC for the risk score demonstrated slightly lower performance (Figure 5F). The additional GEO verification set (GSE20140) is provided in the Supplementary Figure 1. This difference could be related to the patient’s geographical area, tumor stage, and other factors. The results of the univariate/multivariate Cox regression analyses were consistent across multiple cohorts, and they all supported that risk score was an independent risk predictor of patient survival (HR > 1 and p < 0.05) (Figure 6).

Predicting patient survival for the transcriptional addiction gene signature. (A) Survival analysis of different risk groups based on 365 patients from TCGA-LIHC set. (B) Survival analysis of risk groups based on 210 patients from ICGC-JP set. (C) Survival analysis of different risk groups based on 221 patients from GEO-GSE14520 set. (D) The time-dependent ROC curve of the risk score and clinicopathological characteristics in TCGA-LIHC set. (E) The time-dependent ROC curve of the risk score and clinicopathological characteristics in ICGC-JP set. (F) The time-dependent ROC curve of the risk score and clinicopathological characteristics in GEO-GSE14520 set.

Figure 5. Predicting patient survival for the transcriptional addiction gene signature. (A) Survival analysis of different risk groups based on 365 patients from TCGA-LIHC set. (B) Survival analysis of risk groups based on 210 patients from ICGC-JP set. (C) Survival analysis of different risk groups based on 221 patients from GEO-GSE14520 set. (D) The time-dependent ROC curve of the risk score and clinicopathological characteristics in TCGA-LIHC set. (E) The time-dependent ROC curve of the risk score and clinicopathological characteristics in ICGC-JP set. (F) The time-dependent ROC curve of the risk score and clinicopathological characteristics in GEO-GSE14520 set.

Independent prognostic value of transcriptional addiction gene signature. (A, B) Univariate and multivariate regression analyses for hazard ratio values of risk score and clinical characteristics in TCGA-LIHC set. (C, D) Univariate and multivariate regression analyses for hazard ratio values of risk score and clinical characters in ICGC-JP set. (E, F) Univariate and multivariate regression analyses for hazard ratio values of risk score and clinical characters in GEO-GSE14520 set.

Figure 6. Independent prognostic value of transcriptional addiction gene signature. (A, B) Univariate and multivariate regression analyses for hazard ratio values of risk score and clinical characteristics in TCGA-LIHC set. (C, D) Univariate and multivariate regression analyses for hazard ratio values of risk score and clinical characters in ICGC-JP set. (E, F) Univariate and multivariate regression analyses for hazard ratio values of risk score and clinical characters in GEO-GSE14520 set.

Correlation of risk signature with therapeutic drugs and immune status

There were differences in TIDE and exclusion scores between risk groups (p < 0.001) (Figure 7A, 7B). Thus, high-risk patients were more likely to benefit from immunotherapy. Similarly, chemotherapy and targeted therapy are more effective in the high-risk group, considering the lower IC50 values in this group (Figure 7C, 7D). Both DNA and RNA stemness scores were positively correlated with the risk score (p < 0.001) (Figure 7E, 7F). This result could provide a reference for targeted cancer stem cell therapy. Finally, we identified several immune cell contents and immune function scores associated with risk signature, including T cells CD8, macrophages M0, cytolytic activity, MHC class I, type I IFN response, and type II IFN response (Figure 7G, 7H).

Correlation of transcriptional addiction gene signature with therapeutic drug sensitivity and immune status in TCGA-LIHC set. (A) TIDE scores in different risk groups. (B) Exclusion scores in different risk groups. (C) Cisplatin sensitivity in different risk groups. (D) Sorafenib sensitivity in different risk groups. (E) Correlation between DNA stemness score and risk score. (F) Correlation between RNA stemness score and risk score. (G) Immune cell contents in different risk groups. (H) Immune function scores in different risk groups. *p p p

Figure 7. Correlation of transcriptional addiction gene signature with therapeutic drug sensitivity and immune status in TCGA-LIHC set. (A) TIDE scores in different risk groups. (B) Exclusion scores in different risk groups. (C) Cisplatin sensitivity in different risk groups. (D) Sorafenib sensitivity in different risk groups. (E) Correlation between DNA stemness score and risk score. (F) Correlation between RNA stemness score and risk score. (G) Immune cell contents in different risk groups. (H) Immune function scores in different risk groups. *p < 0.05; **p < 0.01; ***p < 0.001.

HDAC2 expression and clinical value in HCC

By bioinformatics mining, HDAC2 expression was higher in the cancer tissues than that in the adjacent tissues in the TCGA-LIHC and ICGC-JP sets (p < 0.001) (Figure 8A, 8B). In the HPA database, immunohistochemistry staining also identified higher levels of HDAC2 protein in HCC tissues compared with those in normal liver tissues (Figure 8C). Tissue microarrays likewise demonstrated high levels of HDAC2 protein in tumor tissues compared with those of paracancerous tissues and distal tissues (Figure 8D).

Expression levels of HDAC2 in different tissues. (A) HDAC2 expression in normal and tumor tissues in the TCGA-LIHC cohort. (B) HDAC2 expression in normal and tumor tissues in the ICGC-JP cohort. (C) The protein level of HDAC2 in normal and tumor tissues in HPA database. (D) The protein levels of HDAC2 in tumor tissue, paracancerous tissue, and distal tissue in the tissue microarray.

Figure 8. Expression levels of HDAC2 in different tissues. (A) HDAC2 expression in normal and tumor tissues in the TCGA-LIHC cohort. (B) HDAC2 expression in normal and tumor tissues in the ICGC-JP cohort. (C) The protein level of HDAC2 in normal and tumor tissues in HPA database. (D) The protein levels of HDAC2 in tumor tissue, paracancerous tissue, and distal tissue in the tissue microarray.

In the TCGA-LIHC and ICGC-JP sets, the median HDAC2 expression value was used for grouping patients. The patients with high HDAC2 expression had a shorter median overall survival time than low expression patients (p < 0.001) (Figure 9A, 9B). The HDCA2 protein level was scored for each sample of the tissue microarrays [40, 41] (Figure 9C). Consistent results were observed across all survival analyses. The patients with higher ICH scores had shorter survival times (Figure 9D, 9E). HDAC2—an oncogene—may affect the survival of HCC patients. Furthermore, HDAC2 was up-regulated in HCC recurrent and metastatic tissues (Figure 10A, 10B). HDAC2 scores also differed in samples with different tumor sizes and stages (Figure 10C, 10D). HDAC2 score could be an independent prognostic factor for HCC patients (Figure 10E, 10F).

Relationship between HDAC2 expression and patient survival. (A) Kaplan-Meier survival analysis of patients with different HDAC2 expressions in TCGA-LIHC cohort. (B) Kaplan-Meier survival analysis of patients with different HDAC2 expressions in ICGC-JP cohort. (C) Scored for each sample of the tissue microarray. (D, E) Kaplan-Meier survival analysis of patients with different tissue scores in the tissue microarrays.

Figure 9. Relationship between HDAC2 expression and patient survival. (A) Kaplan-Meier survival analysis of patients with different HDAC2 expressions in TCGA-LIHC cohort. (B) Kaplan-Meier survival analysis of patients with different HDAC2 expressions in ICGC-JP cohort. (C) Scored for each sample of the tissue microarray. (D, E) Kaplan-Meier survival analysis of patients with different tissue scores in the tissue microarrays.

Correlation of HDAC2 score with HCC pathological characteristics in the tissue microarray set. (A) HDAC2 score in tumor recurrence and non-recurrence groups. (B) HDAC2 score in tumor metastatic and non-metastatic groups. (C) HDAC2 score in different tumor size groups. (D) HDAC2 score in different tumor stage groups. (E, F) Univariate and multivariate regression analyses for HDAC2 score as an independent prognostic factor.

Figure 10. Correlation of HDAC2 score with HCC pathological characteristics in the tissue microarray set. (A) HDAC2 score in tumor recurrence and non-recurrence groups. (B) HDAC2 score in tumor metastatic and non-metastatic groups. (C) HDAC2 score in different tumor size groups. (D) HDAC2 score in different tumor stage groups. (E, F) Univariate and multivariate regression analyses for HDAC2 score as an independent prognostic factor.

Discussion

The mortality rate for liver cancer is the fourth highest among all cancers [1]. Because of the lack of early-stage symptoms, most patients do not benefit from surgery at the initial diagnosis [4, 42]. The conventional treatment for HCC is ineffective, and many exploratory clinical trials have failed to meet their endpoints [43]. The development of HCC is a complex process involving multiple gene regulatory networks at the molecular level. The conventional early warning markers primarily focus on tumor volume and clinical symptoms, with insufficient sensitivity and specificity. In the precision medicine era, studies on genetic characteristics may provide us with new directions. A growing number of studies have proposed new biomarkers for liver cancer, such as the cancer driver gene [44], autophagy [45], and ferroptosis-related gene signature [46]. The present study identified novel transcriptional addiction gene-related signatures and nomograms. The transcriptional addiction gene signature demonstrated better sensitivity and specificity compared with conventional clinical factors when validated across multiple platforms.

The HCC genome is highly heterogeneous, with transcriptional dysregulation of many genes. However, transcriptional addiction has received little attention in the field of liver cancer. Francesca et al. [7] found that the oncogenic properties of BRD4 were dependent on the YAP/TAZ transcriptional response. BET inhibitors could limit the liver growth caused by YAP transcription. There are also several other transcriptional addiction genes. For example, CDK7 is a famous transcriptional addiction-related gene that affects the progression, metastasis, and prognosis of many cancers [6, 16]. SOX2 transcription is associated with cell growth in lung and esophageal cancers [26, 31]. Transcriptional addiction may be a vital physiological and pathological mechanism in various processes, including tumorigenesis, development, and metastasis.

Our study examined the expression characteristics of transcriptional addiction genes in HCC. We identified several DEGs. The functional role of most genes has not been a concern in HCC. For instance, high expression of DOT1L may be associated with poor prognosis in HCC patients by mining public cancer databases. As a transcriptional addiction gene, DOT1L-mediated transcriptional regulatory mechanisms were involved in developing and maintaining MLL leukemia [13]. Yu et al. [47] reported that TRPS1 could facilitate the invasion and proliferation of HCC cells. Overexpression of TRPS1 was associated with tumor size and stage. However, the specific mechanism and prognostic value of TRPS1 in HCC are still unclear. We compiled a list of the transcriptional addiction-related genes to serve as a reference for subsequent studies.

It is common for patients with the same tumor stage and receiving the same treatment to have different outcomes. Targeting key transcriptional addiction genes may facilitate the development of biomarkers and therapeutic strategies. Wei et al. [48] reported that ZFP64 was up-regulated in tumor tissues of immunotherapy-insensitive HCC patients. Inhibiting PKCα/ZFP64/CSF1 axis could overcome anti-PD1 resistance in HCC. ETV1 may facilitate the invasion and metastasis of HCC by up-regulating metastasis-related genes; PTK2 and c-MET inhibitors could significantly inhibit this process [49]. Our study found that different groups of patients responded differently to cisplatin, sorafenib, and immunotherapy, which could guide drug selection for patients.

In combination with public cancer databases and tissue microarrays, this study investigates the role of HDAC2 in HCC. Aberrant expression and mutations of HDACs in cancer have been widely reported. HDAC10 regulates the stemness characteristics of lung cancer cells [50]. HDAC10 also serves as a new therapeutic target in ovarian and gastric cancers [51, 52]. Some selective HDAC6 inhibitors, such as ricolinostat and ACY-1215, have been applied in clinical trials [53]. As for HDAC2, the related studies mainly focus on colorectal cancer (CRC). Tang et al. [54] found that transcriptional regulation associated with the p300/YY1/miR-500a-5p/HDAC2 signaling axis was a critical step that affects the proliferation of CRC cells. Cui et al. [55] reported that HDAC2 promoted CRC cell proliferation by regulating GLI2 expression. HDAC2 may also facilitate multiple biological behaviors in hepatocellular carcinoma. Jin et al. [56] found that HDAC2 inhibition could enhance HCC radiosensitivity. Nam et al. [57] reported that HDAC2 was involved in HCC progression through feedback control of mTOR and AKT. HDAC2 may also affect the M2 macrophage migration and immune escape in HCC [58]. Our study demonstrated that high expression of HDAC2 could contribute to poor patient outcomes. HDAC2—an oncogene—contributes to metastatic and recurrent HCC.

Conclusions

Cancer develops and metastasizes via transcriptional addiction, which is common and critical. As demonstrated in our study, transcriptional addiction genes were implicated in HCC. Transcriptional addiction-related nomograms and signatures could provide significant predictions of survival time for patients. Gene expression characteristics determined the sensitivity of patients to therapeutic agents. HDAC2 was highly expressed in tumor tissues, and its level was associated with patients’ survival. Multiple databases confirmed our findings. Our study identified new biomarkers that can be used to predict survival and select suitable therapeutic drugs for HCC patients.

Materials and Methods

Data sources

This study’s data were obtained from three publicly available databases and three tissue microarrays. We obtained HCC RNA-seq and clinical information from The Cancer Genome Atlas (TCGA) (https://portal.gdc.cancer.gov/), International Cancer Genome Consortium (ICGC) (https://icgc.org/), and Gene Expression Omnibus (GEO) (https://www.ncbi.nlm.nih.gov/geo/) databases. The TCGA-LIHC set, ICGC-JP set, and GEO-GSE14520 set included 365, 210, and 221 HCC cases, respectively. Specific clinical information is provided in the Supplementary Table 1. The protein levels of transcriptional addiction genes in normal and tumor tissues were analyzed using the Human Protein Atlas (HPA) database. RNA-seq profiles were normalized using the “limma” and “sva” packages in R v4.2.1 software (https://www.r-project.org/).

Tissue microarray 1 (Outdo Biotech, Shanghai) contains tumor tissue, paracancerous tissue, and distal tissue from three HCC patients. The other tissue microarrays (Servicebio, Wuhan) contain tumor tissues from 152 HCC patients and their corresponding survival data. Tissue microarrays were used to verify the correlation between HDAC2 expression and patient survival.

Differentially expressed genes in different sets

The differentially expressed genes (DEGs) were identified by comparing transcriptional addiction gene expression in tumor and paracancerous tissues. The cutoff was set as p < 0.05. We visualized the overlapping genes from different datasets using Venn plots (https://bioinformatics.psb.ugent.be/webtools/Venn/).

Construction of cancer transcription factor co-expression network and gene function analysis in the TCGA-LIHC set

Cistrome Cancer (http://cistrome.org/) provides a list of cancer transcription factors. We analyzed the correlation between 318 cancer transcription factors and DEGs in the TCGA-LIHC set. The threshold value was p < 0.05 and |R2| > 0.5. The co-expression network of these genes was visualized using Cytoscape v3.8.1. Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) suggest a range of functional and pathway sets in which DEGs may influence the biological behavior of HCC. This process was performed using the “enrichplot” R package.

Construction and validation of clustered subgroups and nomogram in the TCGA-LIHC set

Gene clustering was performed using the “ConsensusClusterPlus” package according to the expression of DEGs in the TCGA-LIHC set. The distribution among those cases was mapped using principal component analysis (PCA). Survival analysis verifies whether there were differences in survival rates between subgroups of patients. Further, clinical data and clustered subgroups were integrated to construct a nomogram using the “nomogram” package. The calibration curve and concordance index (C-index) were used to verify the specificity and predictive accuracy of the nomogram. The nomogram combined with clinicopathological characteristics was included for univariate and multivariate regression analyses.

Construction and validation of transcriptional addiction gene signature

With the TCGA-LIHC set as the training set, we developed a risk signature using multivariate Cox proportional regression. The ICGC-JP and GEO sets were defined as test sets. All cases were divided into high-risk and low-risk groups based on the median risk score of the training set. Survival analysis, receiver operating characteristic (ROC) curve, and univariate/multivariate regression were used to analyze the prognostic value of risk score in HCC patients.

The tumor immune dysfunction and exclusion (TIDE) score (http://tide.dfci.harvard.edu/) reflects the sample’s sensitivity to immunotherapy. We analyzed the sensitivity of different risk samples to immunotherapy, cisplatin, and sorafenib treatment. We evaluated the correlation between risk score and stem cell content by calculating DNA and RNA stemness scores. Finally, we evaluated the immune cell contents and immune function scores of different samples through the CIBERSORT algorithm (https://cibersortx.stanford.edu/).

Tissue microarrays and immunohistochemistry detection

We extracted the HDAC2 expression profiles of all cases from the TCGA and ICGC datasets and analyzed the distribution of HDAC2 in tumor tissues and paracancerous tissues. Tissue microarrays and the HPA database were used to verify HDAC2 protein levels in various types of tissues. We performed immunohistochemistry (IHC) of HDAC2 on the tissue microarrays. We incubated the tissue microarrays with the primary antibody against HDAC2 (Servicebio, GB11371, Wuhan), followed by the secondary antibody (Servicebio, GB23303, Wuhan) and 3,3’-diaminobenzidine (DAB) IHC kit (DAKO, K5007, Denmark). The intensity of HDAC2 staining was scored as 0, 1, 2, and 3 for no, low, medium, and high staining, respectively. We pooled these data and analyzed the correlation between HDAC2 protein levels and survival time of HCC patients in TCGA, ICGC, and tissue microarray sets. Univariate and multivariate regression analyses were also performed.

Statistical analysis

Expression or score differences between groups were analyzed using the Wilcoxon rank-sum test. Spearman’s rank correlation coefficient was used to determine the correlation between risk and stemness scores. For survival analysis, the Kaplan-Meier and log-rank tests were used. The independent prognostic value of the nomogram, risk signature, and HDAC2 score was performed with univariate/multivariate regression analyses. All statistical analysis procedures were performed using R v4.2.1 software.

Availability of data and material

Publicly available cancer databases were obtained from The Cancer Genome Atlas (TCGA) (https://portal.gdc.cancer.gov/) and International Cancer Genome Consortium (ICGC) (https://icgc.org/) databases.

Consent for publication

All authors have read and approved the content and agree to submit for consideration for publication in the journal.

Author Contributions

Conceived and designed the study: GL, TSC. Collected the literature: XWD, HW. Wrote the manuscript: GL, XWD, JX, YFZ. Revised the manuscript: XWD, GL, JX, TSC, HW. All authors read and approved the final manuscript.

Conflicts of Interest

The authors declare no conflicts of interest.

Funding

This study was funded by the Talents Training Program of the Seventh People’s Hospital, Shanghai University of Traditional Chinese Medicine (Grant No. XX2022-06), the Shanghai Municipal Health Commission (Grant No.202040180, No.202040187), and the Key Discipline Program of the Shanghai Pudong New Area Health Commission (No. PWZxk2022-13).

References

View Full Text Download PDF


Copyright © 2025 Rapamycin Press LLC dba Impact Journals
Impact Journals is a registered trademark of Impact Journals, LLC