Risk stratification of lung adenocarcinoma using a nomogram combined with ferroptosis-related LncRNAs and subgroup analysis with immune and N6-methyladenosine modification

Background Determining the prognosis of lung adenocarcinoma (LUAD) is challenging. The present study aimed to identify prognostic ferroptosis-related long noncoding RNAs (FRLs) and construct a prognostic model. Moreover, differential analysis of immune and N6-methyladenosine (m6A)-related genes was systematically conducted. Methods A total of 504 patients selected from a dataset from The Cancer Genome Atlas were included. The patients with LUAD were randomly divided into a training group and a test group at a ratio of 1:1. Pearson correlation analysis and univariate Cox regression analysis were used to identify the prognostic FRLs. Then, a prognostic model was constructed from the optimized subset of prognostic FRLs based on the least absolute shrinkage and selection operator (LASSO) algorithm. Subsequently, the receiver operating characteristic (ROC) curve and survival analysis were used to evaluate the performance of the model. The risk score based on the prognostic model was analyzed using Cox regression analysis. Moreover, gene set enrichment analysis and differential analysis of immune- and m6A-related genes were conducted. Results After univariate Cox regression analysis and LASSO algorithm analysis, a total of 19 prognostic FRLs were selected to construct the final model to obtain the risk score. The area under the ROC curve of the prognostic model for 1-year, 3-year, and 5-year overall survival (OS) was 0.763, 0.745, and 0.778 in the training set and 0.716, 0.724, and 0.736 in the validation set, respectively. Moreover, the OS of the high-risk group was significantly worse than that of the low-risk group in the training group (P < 0.001) and in the test group (P < 0.001). After univariate and multivariate Cox regression analysis, the risk score [hazard ratio (HR) = 1.734; P < 0.001] and stage (HR = 1.557; P < 0.001) were both considered significant prognostic factors for LUAD. A nomogram was constructed based on clinical features and risk score. The expression of 34 checkpoint genes and 13 m6A-related genes varied significantly between the two risk groups. Conclusion This study constructed a prognostic model to effectively predict the OS of patients with LUAD, and these OS-related FRLs might serve as potential therapeutic targets of LUAD. Supplementary Information The online version contains supplementary material available at 10.1186/s12920-022-01164-5.

Background Lung cancer is a malignancy with high mortality and the second highest incidence worldwide in most countries, with approximately 235,760 new lung cancer cases diagnosed and approximately 131,880 related deaths in 2021 [1]. Generally, the five-year survival of patients with lung cancer ranges from 10 to 20% [2]. Lung adenocarcinoma (LUAD) is the subtype of lung cancer with the highest prevalence [3,4]. Despite advancements in surgery, radiotherapy, chemotherapy, and other advanced therapeutics, the prognosis remains poor due to the high heterogeneity of LUAD [5,6]. As such, research into the molecular mechanisms and pathogenesis of LUAD requires further research to develop promising therapies.
Ferroptosis is a specific type of cell death that is typically associated with the accumulation of iron and peroxidation and has been identified as a promising intervention in cancer therapeutics to trigger apoptosis of malignancies that are resistant to traditional methods [7][8][9]. Ferroptosis-related genes (FRGs) is genes that can promote, prevent or indicate the occurrence of ferroptosis, which annotated as drivers, suppressors or markers [10]. Some studies have even noticed the potential role of FRGs in lung cancer development and suppression, but the specific details remain elusive [11][12][13]. Long noncoding RNAs (lncRNAs) are a type of noncoding RNA sequence measuring approximately 200 nucleotides in length, and these sequences affect several processes, such as cell growth and apoptosis [14]. The regulation of lncRNAs in ferroptosis has been investigated and found to be related to different malignancies, including lung cancer [15,16]. However, there are only a few studies on the mechanism of how ferroptosis-related lncRNAs (FRLs) act on the occurrence and progression of LUAD. As such, studies investigating the involvement of FRLs in the development of LUAD may be of great value for identifying potential targets for guided treatment. Separately, N6-methyladenosine (m6A) modification is an important method of RNA posttranscriptional modification in most eukaryotic mRNAs and lncRNAs [17,18]. Many studies have indicated that m6A modification is related to the oncogenesis and progression of malignant tumors, including LUAD, and could regulate ferroptosis [19][20][21][22]. Therefore, it is essential to study these m6A modifications to comprehensively understand the involvement of FRLs in the development of LUAD.
With these important markers in mind, we sought to identify the prognostic value of FRLs using bioinformatics and a statistical analysis of data collected from patients with LUAD from The Cancer Genome Atlas (TCGA) dataset. In addition, a prognostic model will be created using an optimistic subset of FRLs to predict the long-term mortality of LUAD patients and used to construct a nomogram for guiding clinical judgment. Subgroup analysis along with immune and m6A modification will be used in conjunction to further determine the performance of the model and the expression of related genes within the study groups.

Datasets and FRGs
The study protocol was conducted in accordance with the Declaration of Helsinki. Informed consent was waived owing to the retrospective nature of this study. Transcriptome profiling data [fragments per kilobase of transcript per million mapped reads normalized] and the relevant clinical data were obtained from the Genomic Data Commons Data Portal (https:// portal. gdc. cancer. gov). The inclusion criteria were as follows: (1) LUAD confirmed by pathology; (2) patients with LUAD who had complete transcriptome profiling data; and (3) patients with LUAD who had complete survival data. The exclusion criteria were as follows: incomplete information of the transcriptome profiling data and clinical data. A total of 504 patients with LUAD were included in this study from the TCGA cohort. A total of 259 FRGs (382 annotations) were included in this study, which were downloaded from the FerrDb database (http:// www. zhoun an. org/ ferrdb) and could be divided into drivers, markers and suppressors. The annotation of the lncRNAs in the TCGA dataset was conducted by the annotation file of Genome Reference Consortium Human Build 38 (GRCh38), which was downloaded from the GENCODE website. A total of 14,056 lncR-NAs were identified based on recognizing the ensemble IDs of the genes in the TCGA datasets. The flow chart of study is shown in Fig. 1.

Ferroptosis-related lncRNA (FRL) selection and differential expression gene (DEG) analysis
FRLs were initially identified based on the expression of lncRNAs and FRGs by Pearson correlation analysis with |Pearson R|> 0.5 and P < 0.001. The different expression levels of FRGs and FRLs were analyzed by Wilcoxon tests between the normal group and the tumor group with log2fold change (FC) > 1 and false discovery rate (FDR) < 0.05. Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) analyses of these DEGs were performed. Then, the prognostic FRLs were determined by univariate Cox regression analysis from the significantly different expression of FRLs (P < 0.05).

Construction of prediction model and validation
The training and test sets of patients with LUAD were randomly classified at a ratio of 1:1. The final prognostic model was constructed using the optimized subset of prognostic FRLs chosen by the least absolute shrinkage and selection operator (LASSO) algorithm. The risk score (RS) was computed by the chosen lncRNA expression weighted by their corresponding coefficients. The prognostic model was then assessed by ROC analysis and a risk plot. Using the median RS, those with LUAD were separated into either high-or low-risk groups. The OS of patients with LUAD in the two-risk group was shown by the Kaplan-Meier (KM) method in the training and test sets and the comparison was conducted. The RS and clinical features of patients with LUAD were analyzed by univariate and multivariate Cox regression analysis. The differential analysis of each clinical group based on the two-risk group was conducted by the chi-square test. The development of the nomogram was conducted based on RS and clinical features and evaluated by ROC analysis.
Gene set enrichment analysis (GSEA) and differential immune analysis GSEA was conducted by GSEA software (version 4.1.0) in the two risk groups of LUAD. A p value < 0.05 and the method of signal2noise were selected to denote enrichment significance. Then, the TIMER [23], CIBERSORT [24], CIBERSORT-ABS, QUANTISEQ [25], Microenvironment Cell Populations-counter (MCPCOUN-TER) [26], XCELL [27] and Estimating the proportion of Immune and Cancer cells (EPIC) [28] tools were used to estimate the abundances of immune cells between the two risk groups. In addition, single sample GSEA was used to quantify the immune cells and the pathways between the two risk groups. The expression levels of immune checkpoint-related genes in the two groups were compared by the Wilcoxon test.

Statistical analysis
R statistical software (version 4.0.5) was used for all statistical analyses. Student's t test, Mann-Whitney U test or Kruskal-Wallis H-test was used for continuous variables. A chi-square test or Fisher's exact test was used for categorical variables. The "limma" package was used to analyze transcriptome profiling data. The "clusterProfiler", "org.Hs.eg.db", "enrichplot", and "ggplot2" packages were used to conduct the GO analysis and KEGG analysis and build bar plots and bubble plots. The "survival", "survminer", "caret", "glmnet" and "timeROC" packages were used to conduct the survival analysis, build the prognostic model and obtain the ROC curve. The "ggalluvial", "ggplot2", and "dplyr" packages were used to develop the Sankey diagram. The "pheatmap" package was used to develop heatmaps and risk plots. The "survival" and "regplot" packages were used to develop the nomogram.
The "GSVA", "GSEABase", "ggpubr", and "reshape2" packages were used to develop box plots for the immune function analysis and the differential expression of checkpoint-related genes and m6A-related genes between the two-risk group. A two-sided P < 0.05 was considered statistically significant.

Identification of FRLs in LUAD patients
A total of 1949 FRLs were determined by Pearson correlation analysis to be significant (with |Pearson R|> 0.5 and P < 0.001). The expression of 70 FRGs and 664 FRLs was significantly different between the normal group and the tumor group. With regard to biological processes, GO was used to enrich the DEGs by biological process (BP), cellular component (CC), and molecular function (MF), as shown in Fig. 2a-b. With regard to BP, these DEGs were mainly enriched in response to oxidative stress and cellular response to chemical stress. With regard to CC, these DEGs were mainly enriched in the NADPH oxidase complex. With regard to MF, these DEGs were mainly enriched in superoxide-generating NADPH oxidase activity. Furthermore, we used KEGG to conduct pathway analysis on these DEGs and found that the DEGs were mainly enriched in ferroptosis and fluid shear stress and atherosclerosis, as shown in Fig. 2c-d. Then, a total of 55 prognostic FRLs were selected by univariate Cox regression analysis (with P < 0.05).

Establishment of predictive model and survival analysis
The 504 patients were randomly divided into a training set and a validation set. There were no significant differences in clinical characteristics between the training and validation cohorts (Table 1). Then, 19 prognostic FRLs selected by the LASSO algorithm were used to calculate the RS, and the formula is shown below (Fig. 3a-b).  (Table 2). Then, a nomogram based on RS and stage was established, and an ROC curve was drawn (Fig. 3g, Additional file 1). According to the median value of the RS, the survival status was shown based on the two risk groups, and the OS of the high-risk group was significantly poorer than that of the low-risk group in both the training set (P < 0.001) and test set (P < 0.001), as shown in Fig. 4a-h. The optimized subset of prognostic FRLs in the coexpression network and the relationship with FRGs were visualized using the circle map and Sankey diagram ( Fig. 5a-b). The differential analysis of each clinical feature in the two risk groups is shown in the heatmap, and the stage of patients with LUAD was significantly different (Fig. 5c).

GSEA
GSEA revealed that the set of genes was enriched in the cell cycle, pyrimidine metabolism and RNA degradation for the high-risk group (Fig. 6a-c). In the low-risk group, the gene set was mainly enriched for asthma, intestinal immune network for IgA production and hematopoietic cell lineage (Fig. 6d-f).

Differential immune analysis of RS
A heatmap of immune infiltration based on the TIMER, CIBERSORT, CIBERSORT-ABS, QUANTISEQ, MCP-COUNTER, XCELL and EPIC tools is shown in Fig. 7a. The results of the comparative analysis of immune function between the two risk groups are shown by the boxplot (Fig. 7b). The expression of 34 checkpoint genes was significantly different between the two risk groups (Fig. 7c).

Differential expression analysis of m6A
The difference in the expression level of m6A-related genes between the two risk groups is shown in Fig. 7d. The expressions of 13 m6A-related genes were significantly different between two risk groups. The expression levels of VIRMA, HNRNPC, HNRNPA2B1, IGF2BP1, IGF2BP2, IGF2BP3 and LRPPRC were significantly higher in the high-risk group (P < 0.001). The expression level of YTHDC2 was significantly lower in the high-risk group (P < 0.001).

Discussion
After univariate Cox regression analysis and the LASSO algorithm, 19 prognostic FRLs were established in the model for predicting the OS of patients with LUAD. Those with LUAD were separated by the median value of the RS. The high-risk group showed significantly worse OS than the low-risk group (P < 0.001). Univariate and multivariate Cox regression revealed that a valuable significant risk factor was RS. The results of the differential immune analysis indicated that the RS was correlated with immune cell infiltration in our study. In addition, the differential expression of m6A-related genes in the two-risk group could indicate that FRLs were associated with these m6A-related genes in the prognosis of patients with LUAD. This study is the first to comprehensively analyze FRLs in the prognosis of patients with LUAD, immune cell infiltration and m6A modification.
Many studies have shown that lncRNAs can regulate ferroptosis in the oncogenesis and progression of malignant tumors [29][30][31]. A study by Zhang et al. showed that OIP5-AS1 could promote cell growth and inhibit ferroptosis in prostate cancer, which functioned as a competing source of endogenous RNAs (ceRNAs) for miR-128-3p to regulate the expression of SLC7A11 [29]. A study by Mao et al. showed that P53RRA could lead to ferroptosis by retaining p53 in the nucleus and serves as a tumor suppressor by displacing p53 from a cytosolic G3BP1 complex [30]. A study by Wu et al. showed that NEAT1 could modulate the expression of ACSL4 in NSCLC, which was a FRL in previous reports, and further affect the ferroptosis process [31]. These studies indicated that FRLs could contribute to the tumorigenesis and progression of malignant tumors, regulate ferroptosis and influence the invasion of tumors. However, few studies have examined how FRLs affect the progression of LUAD. In this study, the gene set of FRLs from patients with LUAD was mainly enriched in the cell cycle, pyrimidine metabolism and RNA degradation in the high-risk group, which had significantly worse OS than the low-risk group. These results indicated that FRLs could affect the progression of patients with LUAD and the OS of patients with LUAD.
In our study, 19 prognostic FRLs were confirmed to build the prediction model of patients with LUAD. A total of 8 out of 19 FRLs, which included LINC00582, MED4-AS1, AC090559.1, AL161618.1, FAM30A, MHENCR, KTN1-AS1 and AL606489.1, were previously investigated in tumors [32][33][34][35][36][37][38][39]. The results of both Wu et al. and Guo et al. showed that the expression of AC090559.1 was higher and the expression of AL606489.1 was lower in the low-risk group, and these two lncRNAs were both considered significant predictive factors for the OS of patients with LUAD [32,33]. These findings are in line with the results of our study. The study of Li et al. showed that KTN1-AS1 expression was upregulated in NSCLC tissue and was positively correlated with poor prognosis, which could be considered a ceRNA for miR-130a-5p to regulate NSCLC cell growth and apoptosis [34]. These findings are also in line with the results of our study. In addition, Lima et al. showed that the expression of FAM30A was positively associated with  FRLs. c: The figure shows that the patients in the two-risk group were significantly different in these subgroups of patients with stage of disease. The heatmap also showed the 19 FRL expressions in the two-risk group. *P < 0.05, **P < 0.01, ***P < 0.001 the level of antibody titers and the lncRNA might regulate immunoglobulin heavy locus gene segments [35]. Furthermore, several studies have shown that abnormal immune cells contribute to the progression of LUAD and regulate ferroptosis, and checkpoint inhibitor-based immunotherapies have improved the survival of patients with advanced malignancies [40][41][42][43][44]. In our study, significant differences in immune cell infiltration and function and the expression of immune checkpoints between the two risk groups were also confirmed to a certain extent. A parallel study by Yao et al. identified 7 FRLs that had diagnostic utility in OS for patients with LUAD, and the results showed that the AUCs of the risk model based on 7 FRLs for 1 year, 2 years, and 3 years were 0.711, 0.658, and 0.676 in the training set and 0.593, 0.577, and 0.525 in the validation set [45]. A net of 7 FRL differences was determined in their study, while a net of 19 FRL differences was determined in our study. What's more, stage III had a slightly higher point than stage IV in the nomogram of our study. The reason may be the differences between several sample sets and the TCGA cohort and insufficient sample size. However, overwhelming evidence suggests that the identified FRLs hold high diagnostic utility in the oncogenesis of LUAD, and further studies along this direction are needed. On a separate note, several studies have investigated the relationship between m6A modification and the prognosis of LUAD [22,46,47]. A study by Ma et al. demonstrated that YTHDC2, a m6A-related gene, could suppress HOXA13 by m6A modification to suppress SLC3A2 and further promote ferroptosis in LUAD [22]. In addition, a study by Xu et al. demonstrated that the risk model, which was considered an independent indicator for the prognosis of LUAD, was composed of 12 m6A-related lncRNAs, and the AUC of the risk model was 0.759 [46]. It is evident that modification of m6A was able to influence the progression of cancers. Our study showed that the expression levels of 13 m6A-related genes between the two-risk group were significantly different. The results indicated that m6A modification could play a potential role in FRLs in the prediction of the prognosis of patients with LUAD, and further research is needed to obtain more details about undying associations. While significant findings were reported above, limitations to these findings are described below. First, the use of independent LUAD cohorts or multiple databases would increase the diagnostic power and improve the nomogram over a single TCGA dataset. Second, in vitro and in vivo experiments should be used in conjunction to determine the mechanistic function of FRLs in oncogenesis.

Conclusions
In summary, this study determined the prognostic value of FRLs in LUAD patients and used these findings to generate a prognostic model to effectively predict the OS of patients with LUAD. Moreover, a nomogram for a predictive prognostic model was built, and the differential expression of m6A-related genes in the two-risk group was confirmed. As such, the FRLs identified herein may suggest therapeutic targets for the treatment of LUAD.