- Database
- Open access
Glioma-BioDP: database for visualization of molecular profiles to improve prognosis of brain cancer
BMC Medical Genomics volume 16, Article number: 168 (2023)
Abstract
Cancer researchers often seek user-friendly interactive tools for validation, exploration, analysis, and visualization of molecular profiles in cancer patient samples. To aid researchers working on both low- and high-grade gliomas, we developed Glioma-BioDP, a web tool for exploration and visualization of RNA and protein expression profiles of interest in these tumor types. Glioma-BioDP is a user-friendly application that includes expression data from both low- and high-grade glioma patient samples from The Cancer Genome Atlas and enables querying by mRNA and microRNA expression data from the Illumina HiSeq platform and by protein expression data from the RPPA platform. Glioma-BioDP provides an advanced query interface and enables users to explore the association of gene, protein, and miRNA expression with molecular and/or histological subtypes of gliomas, surgical resection status, and survival. The prognostic significance of the selected expression profiles can be explored and visualized using the interactive utilities provided. This tool may also enable validation and generation of new hypotheses about novel glioma therapies that aid in personalization of treatment for optimum outcomes.
Background
Gliomas are the most common brain cancers originating in the glial cells. Initially, adult diffuse gliomas were classified according to the microscopic resemblance of the tumor cells to normal glial cells [1]. In 2000, the World Health Organization (WHO) classified diffuse gliomas into the histological subtypes astrocytic tumors, oligodendrogliomas, and oligoastrocytomas [2]. These were then graded by their degree of malignancy. Oligoastrocytomas and oligodendrogliomas are graded as grade II or III, while astrocytomas are graded as grade II, III, or IV; grade IV tumors are known as glioblastomas (GBM), and the lower grades are referred to as lower grade gliomas (LGG) [3]. The current classification of gliomas is based on genomic alterations in addition to histopathology; these alterations are indicative of the aggressiveness of the tumor and of patient prognosis [4]. In particular, for lower grade gliomas (grades II and III), the mutation status of the isocitrate dehydrogenase 1 or 2 (IDH1/2) genes and 1p/19q codeletion are considered the key factors for molecular subtyping.
The availability of multi-omic cancer patient data from The Cancer Genome Atlas (TCGA) project has provided cancer researchers with unprecedented opportunities to explore and analyze molecular profiles in relation to patient survival, cancer stage, metastatic state, and other clinical factors [5]. However, downloading the bulk data and writing scripts to analyze and visualize them is a daunting task for most clinicians. Web tools like cBioPortal address these issues by incorporating interactive visualization and exploration of genetic profiles in different cancers [6]. However, cBioPortal offers limited exploration of cancer-subtype-specific changes. While the new version of cBioPortal allows users to visualize gene expression changes across glioma histological or molecular subtypes using boxplots, it is not possible to stratify samples by clinical parameters to check whether survival differs with gene expression within a specified glioma subtype. To aid cancer researchers working on gliomas, we developed Glioma-BioDP as an extended version of our previously published web tool GBM-BioDP [7]. Glioma-BioDP provides enhanced functions to query gene, protein, and miRNA expression profiles in relation to molecular subtypes, driver gene alteration status, histological subtypes, surgical resection status, and patient survival.
Construction and contents
Experimental and clinical data
In the current version of Glioma-BioDP, we collected RNA-seq (v2) and miRNA-seq data from TCGA via the Genomic Data Commons (GDC) data portal. We normalized the gene expression count data from the RNA-seq and miRNA-seq platforms into log counts per million using the R package edgeR [8]. The normalized data were used to generate all the statistics shown for the expression distribution characteristics. The raw RPPA protein expression data were downloaded from the TCPA portal (https://tcpaportal.org/tcpa/); we obtained both level 3 and level 4 data. All the protein expression data were processed and analyzed in a similar way. The gene expression heatmaps were generated using z-score normalized data for each gene across the samples. Samples and genes were clustered using unsupervised hierarchical clustering with Pearson’s correlation coefficient as the distance/similarity metric.
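The normalization steps described above can be illustrated with a minimal Python sketch. It is not the authors' pipeline, which uses edgeR in R; the function names, the `prior_count` value, and the toy counts are assumptions made for illustration, and edgeR's own log-CPM computation differs in detail (for example in how it handles the prior count and normalization factors). The sketch shows log counts per million, per-gene z-scores across samples, and a Pearson-correlation-based distance of the kind that could feed hierarchical clustering.

```python
import math
from statistics import mean, stdev


def log_cpm(counts, prior_count=0.5):
    """Simplified log2 counts per million.

    counts: dict mapping gene -> list of raw counts, one per sample.
    prior_count is an assumed small offset that avoids log2(0).
    """
    samples = range(len(next(iter(counts.values()))))
    # Library size = total counts observed in each sample.
    lib_sizes = [sum(row[j] for row in counts.values()) for j in samples]
    return {
        gene: [math.log2((row[j] + prior_count) / lib_sizes[j] * 1e6)
               for j in samples]
        for gene, row in counts.items()
    }


def z_score(values):
    """Standardize one gene across samples (mean 0, sample SD 1)."""
    m, s = mean(values), stdev(values)
    return [0.0 if s == 0 else (v - m) / s for v in values]


def pearson_distance(x, y):
    """Distance defined as 1 - Pearson correlation (0 = identical profile)."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    norm = math.sqrt(sum((a - mx) ** 2 for a in x)
                     * sum((b - my) ** 2 for b in y))
    # A constant profile has no defined correlation; treat it as uncorrelated.
    return 1.0 if norm == 0 else 1.0 - cov / norm


# Toy counts for two hypothetical genes in three samples (made-up numbers).
toy_counts = {"GENE_A": [10, 20, 30], "GENE_B": [90, 80, 70]}
toy_lcpm = log_cpm(toy_counts)
toy_z = {gene: z_score(vals) for gene, vals in toy_lcpm.items()}
print(pearson_distance(toy_z["GENE_A"], toy_z["GENE_B"]))
```

Using 1 minus the correlation as the distance means genes with opposite expression trends end up far apart, which is one common convention; the source does not state which linkage method or exact distance transform was used.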
Database construction
In the back end, we built a MySQL relational database. All the data presented by the Glioma-BioDP portal are retrieved from this in-house database, which hosts clinical annotation, subtype information, and gene, protein, and miRNA expression data. The database also stores all the essential metadata, including gene, miRNA, and sample information. The patient stratification shown in the portal for both TCGA-GBM and TCGA-LGG is based on the annotation reported by Verhaak and colleagues [9].
Glioma-BioDP is a PHP-based web application. The high-level runtime architecture is three-tiered, consistent with our previously released GBM-BioDP [7]. Processing is done in Python (http://www.python.org/) and visualization is developed using R (http://www.r-project.org/). The application is deployed on an Apache HTTP server (http://httpd.apache.org/) at the National Cancer Institute (NCI).
Utility and discussion
Modules
The Glioma-BioDP webtool has three modules: a) GBM, b) LGG, and c) GBM vs LGG, as shown in Fig. 1a. Within each module, sub-modules exist to explore expression profiles and clinical details. The sub-module “GENES” includes mRNA and protein level expression from the TCGA Illumina HiSeq and RPPA platforms, respectively. The sub-module “MIRNAs” contains expression data from the TCGA Illumina HiSeq platform. The modules are described in detail below.
Core features
1) GBM module: Its functionalities and options have been described in our previous publication [7].
2) LGG module: Provides query and visualization of expression profiles of genes, proteins, or miRNAs in LGG. The visualization panel offers the following patient stratification options:
   a) IDH wild type (IDHwt) vs. IDH mutant (IDHmut)
   b) 1p/19q codeletion status
   c) IDH and 1p/19q codeletion status
   d) Histological subtypes: astrocytoma, oligoastrocytoma, and oligodendroglioma
   e) Surgical resection status: gross total or sub-total
   f) Histology and surgical resection for each of the LGG subtypes
   For all the options above, Kaplan–Meier survival plots of expression profiles can be stratified by greater vs. less than mean expression, or by four quartile-based contrasts (1st vs 4th, 1-2 vs 3-4, 1st vs 2-4, 1-3 vs 4th).
   Gene or miRNA expression profiles can be compared between any of the above-mentioned strata using a density plot and a box plot. The p-values for the differences in expression levels between strata are calculated using t-tests. The prognostic significance of gene or miRNA expression within any of these strata is visualized with Kaplan–Meier survival curves stratified by the expression quartile contrasts described above.
3) GBM vs LGG module: Explores the prognostic significance of gene expression in GBM vs. LGG. Kaplan–Meier survival plots for the two tumor types are shown side by side. The patient samples can be stratified by greater vs. less than mean expression, or by the quartile contrasts of the queried gene or miRNA described above.
For all three modules, heatmaps for multiple query genes are shown to visualize how their expression clusters with respect to clinically relevant parameters such as molecular subtype and the prognostic index.
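The expression-based stratifications listed above (greater vs. less than mean, and the four quartile contrasts) can be sketched as follows. This is an illustrative Python sketch rather than the tool's actual code: the function names, the handling of ties, and the choice of the "inclusive" quantile method are assumptions, since the source does not describe these details.

```python
from statistics import mean, quantiles

# The four quartile contrasts named in the text: (low group, high group).
CONTRASTS = {
    "1st vs 4th": ({1}, {4}),
    "1-2 vs 3-4": ({1, 2}, {3, 4}),
    "1st vs 2-4": ({1}, {2, 3, 4}),
    "1-3 vs 4th": ({1, 2, 3}, {4}),
}


def split_by_mean(expr):
    """Split samples into (above mean, at or below mean).

    expr: dict mapping sample id -> expression value of the queried gene.
    """
    m = mean(list(expr.values()))
    high = sorted(s for s, v in expr.items() if v > m)
    low = sorted(s for s, v in expr.items() if v <= m)
    return high, low


def quartile_labels(expr):
    """Assign each sample to expression quartile 1 (lowest) to 4 (highest)."""
    q1, q2, q3 = quantiles(list(expr.values()), n=4, method="inclusive")
    # Each cut point exceeded moves the sample up one quartile.
    return {s: 1 + (v > q1) + (v > q2) + (v > q3) for s, v in expr.items()}


def contrast_groups(expr, name):
    """Return (low group, high group) sample ids for a named contrast."""
    low_set, high_set = CONTRASTS[name]
    labels = quartile_labels(expr)
    low = sorted(s for s, q in labels.items() if q in low_set)
    high = sorted(s for s, q in labels.items() if q in high_set)
    return low, high


# Hypothetical expression values for eight samples.
toy_expr = {f"s{i}": float(i) for i in range(1, 9)}
print(contrast_groups(toy_expr, "1st vs 4th"))
```

Note that contrasts such as "1st vs 4th" deliberately leave out the middle half of the cohort, so the two survival curves are drawn from fewer patients than a mean or median split.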
Workflow and applications of Glioma-BioDP
Glioma-BioDP lets users assess the expression pattern and prognostic potential of a gene of interest in a specific brain tumor type, i.e., glioblastoma (GBM), lower grade glioma (LGG), or both. Here, we briefly describe the analysis workflow using the clinically relevant gene PTEN as an example. The tumor suppressor gene PTEN plays important roles in the regulation of cell proliferation, apoptosis, and DNA damage repair [10]. Treatment of PTEN-deficient tumors with PI3K pathway inhibitors is being investigated for some cancer types [11]. Loss of PTEN expression has been indicated to be an early event in glioma, with mutations occurring in 5 to 40% of glioma cases.
To give a comprehensive view of the role of a selected gene (e.g., PTEN) in LGG subtypes, we have integrated different molecular data types, namely mRNA and protein expression, with stratification of tumors by molecular and histological subtypes. The complete workflow for the analysis of PTEN in molecular and histological subtypes using the LGG module is shown in Fig. 1a. Here, we selected an mRNA and protein expression-based query for the analysis of PTEN (steps 1–4). Glioma-BioDP allows the user to select any of the options provided on the platform. The user is then directed to a tabular display of the gene and protein expression pattern in LGG samples. Clicking on the plots takes the user to the graphical interface (see steps 5–6). Figure 1b shows the histogram distribution and box plots of the difference in PTEN mRNA expression between IDH-mut and IDH-wt. The three survival plots in Fig. 1b show the difference between IDH-mut and IDH-wt (p-value < 0.05) and, within each subset, whether survival differs between samples with greater or less than median expression. As with the mRNA expression-based query, a protein expression-based query can be performed to see differences in histograms, box plots, and survival between the molecular subtypes of LGG (Fig. 1c). Importantly, Glioma-BioDP allows users to rebuild their survival models using different parameters based on molecular features: IDH mutation and 1p/19q codeletion status, histological subtypes, surgical resection status, varying quartile ranges, etc. The KM plots resulting from stratification of samples by histological type and resection status for mRNA and protein expression-based queries are shown in Fig. 1d and e, respectively.
Case studies
The functionality and potential clinical relevance of the Glioma-BioDP tool are demonstrated through analyses of the following genes: PTEN, NES, TERT, MGMT, and EGFR. The clinical relevance of PTEN in gliomas is explained in the previous section. NES codes for nestin, an intermediate filament found in vascular endothelial cells that is upregulated in tumors to allow for increased angiogenesis at the tumor site [12]. TERT codes for telomerase reverse transcriptase, a protein key to the maintenance of telomeres whose expression is upregulated in a subset of gliomas, through promoter mutation or by other means, to facilitate tumor progression [13]. MGMT promoter methylation and the subsequent MGMT gene inactivation are common in malignant gliomas. Epigenetic MGMT promoter methylation has been shown to be associated with better clinical outcomes for patients treated with temozolomide (TMZ) and radiotherapy due to a decrease in tumor DNA repair [14]. This observation has great clinical relevance for selecting patients for chemotherapy and radiotherapy. EGFR amplification and mutation are a signature genetic abnormality in GBM [15] and may be explored as a therapeutic target. Using these examples, we describe the prognostic significance of each of these genes in LGG and GBM.
Prognostic value of the expression of PTEN in LGG subtypes
Expanding on the PTEN gene search in the LGG module, higher PTEN mRNA expression was associated with better survival in IDHwt but not in IDHmut LGGs (Fig. 2a, b, Kaplan–Meier analysis, p-value = 0.043 and 0.386 respectively, samples stratified by > median or < median of PTEN expression). Querying PTEN expression in 1p19q co-deleted vs non-co-deleted LGGs likewise showed that PTEN overexpression is associated with better survival in 1p19q non-co-deleted LGGs but not in 1p19q co-deleted LGGs (Fig. 2c, d, Kaplan–Meier analysis, p-value = 0.004 and 0.421 respectively, samples stratified by > median or < median of PTEN expression). Further, when querying by histological subtype of LGG, PTEN overexpression was associated with better survival in all histological subtypes: oligodendroglioma, astrocytoma, and oligoastrocytoma (Fig. 2e–g, Kaplan–Meier analysis, p-value = 0.03, 0.024 and 0.014 respectively, samples stratified by > median or < median of PTEN expression). Investigations are underway to find treatment strategies targeting PTEN-deficient cancers. The association of PTEN overexpression with improved survival in glioma subtypes with poor prognosis (IDHwt and 1p19q non-co-deleted) may provide a rationale for investigating the effects of these therapies in these glioma subtypes.
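For readers unfamiliar with how such survival comparisons are computed, the following minimal Python sketch implements a Kaplan–Meier product-limit estimate and a two-group log-rank test. It is not the tool's implementation (the source states that visualization is done in R), and the source does not say which test produced the reported p-values; the log-rank test is assumed here because it is the usual companion to Kaplan–Meier plots. The function names and the toy data are hypothetical.

```python
import math


def kaplan_meier(times, events):
    """Product-limit survival estimate.

    times: follow-up time per patient; events: 1 = death observed, 0 = censored.
    Returns [(t, S(t))] at each time where at least one death occurred.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = removed = 0
        # Gather everyone whose follow-up ends exactly at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            surv *= 1.0 - deaths / at_risk
            curve.append((t, surv))
        at_risk -= removed
    return curve


def logrank_p(times_a, events_a, times_b, events_b):
    """Two-sided log-rank p-value for two groups (chi-square, 1 df)."""
    all_times = list(times_a) + list(times_b)
    all_events = list(events_a) + list(events_b)
    event_times = sorted({t for t, e in zip(all_times, all_events) if e})
    obs_minus_exp = 0.0
    var = 0.0
    for t in event_times:
        n_a = sum(1 for x in times_a if x >= t)  # still at risk in group A
        n_b = sum(1 for x in times_b if x >= t)
        d_a = sum(1 for x, e in zip(times_a, events_a) if x == t and e)
        d_b = sum(1 for x, e in zip(times_b, events_b) if x == t and e)
        n, d = n_a + n_b, d_a + d_b
        obs_minus_exp += d_a - d * n_a / n
        if n > 1:
            var += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)
    if var == 0:
        return 1.0
    chi2 = obs_minus_exp ** 2 / var
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    return math.erfc(math.sqrt(chi2 / 2.0))


# Made-up follow-up data for a "low" and a "high" expression group.
low_times, low_events = [1, 2, 3], [1, 1, 1]
high_times, high_events = [10, 11, 12], [1, 1, 1]
print(kaplan_meier(low_times, low_events))
print(logrank_p(low_times, low_events, high_times, high_events))
```

In practice the two groups would come from an expression split (for example above vs. below the median of PTEN expression within IDHwt samples), and a dedicated survival library would normally be preferred over hand-written code.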
Prognostic value of the expression of genes NES, TERT and MGMT in LGG vs GBM
The webtool can showcase genes that are prognostic in LGG only, in GBM only, or in both tumor types. NES illustrates the Glioma-BioDP webtool’s ability to identify genes that have a significant effect on prognosis in LGG but not GBM. In Fig. 3a, b, elevated NES expression levels are shown to be associated with decreased survival times in LGG but not GBM (p-value: 0.003 vs. p-value: 0.998, 1st vs. 4th quartile analysis).
TERT is an example of a gene that has prognostic significance in both LGG and GBM. TERT expression levels are shown in Fig. 3c, d to have a similar effect on patient survival times in both GBM and LGG (p-value: 0.036 vs p-value: 0.043, 1st vs 4th quartile analysis).
MGMT shows the webtool’s ability to highlight genes that have a significant impact in glioblastoma (GBM) but not in lower grade glioma (LGG). The beneficial effect of MGMT promoter methylation and gene inactivation is corroborated by Glioma-BioDP in Fig. 3e, f, which show that decreased MGMT protein expression is associated with longer survival times in GBM but not in LGG (p-value: 0.309 in LGG vs p-value: 0.042 in GBM, below vs above median analysis).
Prognostic value of the expression of EGFR in molecular subtypes of LGG and GBM
To explore the prognostic effects of EGFR expression in molecular subtypes of LGG and GBM, we queried Glioma-BioDP. Querying EGFR expression in GBM subtypes in the GBM module showed that the mRNA and protein expression of EGFR is significantly higher in the classical subtype of GBM than in all other subtypes (Fig. 4a shows the mRNA expression boxplot, classical vs other subtypes p-value < 0.001). Although patient stratification by EGFR expression within each molecular subtype does not show a significant association with survival, in the proneural subtype there is a trend toward better survival with overexpression of EGFR (Fig. 4b, Kaplan–Meier analysis, p-value = 0.098, samples stratified by 1st quartile vs quartiles 2–4). Patient survival in the other molecular subtypes of GBM did not show any association with EGFR expression level (Fig. 4c–e). On the other hand, in LGG subtypes stratified by IDH mutation status (wild type IDH, or IDHwt, vs mutated IDH, or IDHmut), EGFR mRNA (TCGA RNA-Seq data) and protein (TCGA RPPA data) expression is significantly higher in IDHwt samples than in IDHmut samples, as visualized using density plots and box plots for protein expression (Fig. 4f, p-value < 0.001). The Kaplan–Meier survival plots show that overexpression of EGFR protein is associated with better survival in the IDHwt (Fig. 4g) but not in the IDHmut subtype of LGG.
Thus, the functionality of Glioma-BioDP is evident through the juxtaposition of these genes, whose effects on patient outcomes differ widely between GBM and LGG yet are similarly potent in one or both settings. Querying Glioma-BioDP also enables users to explore the mRNA and protein level profiles of their genes of interest in the context of the molecular and histological subtypes.
Conclusions
In the age of big data in cancer genomics, cancer researchers have an opportunity to use and explore patient genomic data from large tumor cohorts such as TCGA to improve their understanding of the genomic correlates of patient prognosis. However, the data need to be available in an easy-to-explore format with intuitive visualization so that cancer researchers can make use of them. Glioma-BioDP, as a user-friendly web tool, offers intuitive visualization and querying of gene and miRNA expression data in gliomas in the context of specific histological and molecular subtypes of these tumors. Extending our previously published tool GBM-BioDP, the new tool Glioma-BioDP enables exploration of the prognostic significance of transcriptomic and proteomic features from low grade to high grade gliomas in a subtype-specific manner.
In comparison to the previously published tool GliomaDB [16], our tool Glioma-BioDP offers more intuitive and useful visualizations by enabling users to compare gene or miRNA expression between different histological as well as molecular subtypes of gliomas. As described in our case studies, Glioma-BioDP gives users information on the survival of glioma patients as a function of the queried gene's expression in the context of the histological and/or molecular subtypes of gliomas. Including this information is a critical feature of Glioma-BioDP, as the subtypes are linked to varying patient prognosis in glioma, and the expression of a given gene may have different prognostic implications depending on the glioma subtype. One example of this functionality is the association of PTEN overexpression with better survival in IDHwt and 1p19q non-co-deleted LGGs. Another is that EGFR gene and protein level expression is associated with prognosis in glioblastomas (grade IV glioma), and even in the lower grade gliomas (grade II-III), EGFR protein level expression is associated with better prognosis in the IDHwt molecular subtype, which is the high-risk subtype compared to IDHmut.
In the upcoming version of this tool, we are integrating miRNA and protein expression data into the LGG and LGG vs GBM modules. In the near future, we plan to incorporate data from more omic platforms, such as mutation, copy number, and DNA methylation, which would expand the usability of this tool.
Availability of data and materials
Availability: Glioma-BioDP web tool with user manual is available from: https://glioma-biodp.nci.nih.gov
Contact: uma@mail.nih.gov.
References
Louis DN, Holland EC, Cairncross JG. Glioma classification: a molecular reappraisal. Am J Pathol. 2001;159(3):779–86. https://doi.org/10.1016/S0002-9440(10)61750-6.
Gonzales M. The 2000 World Health Organization classification of tumours of the nervous system. J Clin Neurosci. 2001;8(1):1–3. https://doi.org/10.1054/jocn.2000.0829.
Louis DN, Ohgaki H, Wiestler OD, et al. The 2007 WHO classification of tumours of the central nervous system. Acta Neuropathol. 2007;114(2):97–109. https://doi.org/10.1007/s00401-007-0243-4.
Louis DN, Perry A, Reifenberger G, et al. The 2016 World Health Organization Classification of Tumors of the Central Nervous System: a summary. Acta Neuropathol. 2016;131(6):803–20. https://doi.org/10.1007/s00401-016-1545-1.
Cancer Genome Atlas Research Network, Weinstein JN, Collisson EA, et al. The cancer genome atlas pan-cancer analysis project. Nat Genet. 2013;45(10):1113–20.
Cerami E, Gao J, Dogrusoz U, et al. The cBio cancer genomics portal: an open platform for exploring multidimensional cancer genomics data. Cancer Discov. 2012;2(5):401–4.
Celiku O, Johnson S, Zhao S, Camphausen K, Shankavaram U. Visualizing molecular profiles of glioblastoma with GBM-BioDP. PLoS One. 2014;9(7):e101239. https://doi.org/10.1371/journal.pone.0101239.
Robinson MD, McCarthy DJ, Smyth GK. edgeR: a Bioconductor package for differential expression analysis of digital gene expression data. Bioinformatics. 2010;26(1):139–40. https://doi.org/10.1093/bioinformatics/btp616.
Ceccarelli M, Barthel FP, Malta TM, et al. Molecular Profiling Reveals Biologically Discrete Subsets and Pathways of Progression in Diffuse Glioma. Cell. 2016;164(3):550–63. https://doi.org/10.1016/j.cell.2015.12.028.
Smith JS, Tachibana I, Passe SM, et al. PTEN mutation, EGFR amplification, and outcome in patients with anaplastic astrocytoma and glioblastoma multiforme. J Natl Cancer Inst. 2001;93(16):1246–56. https://doi.org/10.1093/jnci/93.16.1246.
Dillon LM, Miller TW. Therapeutic targeting of cancers with loss of PTEN function. Curr Drug Targets. 2014;15(1):65–79. https://doi.org/10.2174/1389450114666140106100909.
Maderna E, Salmaggi A, Calatozzolo C, Limido L, Pollo B. Nestin, PDGFRbeta, CXCL12 and VEGF in glioma patients: different profiles of (pro-angiogenic) molecule expression are related with tumor grade and may provide prognostic information. Cancer Biol Ther. 2007;6(7):1018–24. https://doi.org/10.4161/cbt.6.7.4362.
Brennan CW, Verhaak RG, McKenna A, et al. The somatic genomic landscape of glioblastoma. Cell. 2013;155(2):462–77. https://doi.org/10.1016/j.cell.2013.09.034.
Hegi ME, Diserens AC, Gorlia T, et al. MGMT gene silencing and benefit from temozolomide in glioblastoma. N Engl J Med. 2005;352(10):997–1003. https://doi.org/10.1056/NEJMoa043331.
Taylor TE, Furnari FB, Cavenee WK. Targeting EGFR for treatment of glioblastoma: molecular basis to overcome resistance. Curr Cancer Drug Targets. 2012;12(3):197–209. https://doi.org/10.2174/156800912799277557.
Yang Y, Sui Y, Xie B, Qu H, Fang X. GliomaDB: a web server for integrating glioma omics data and interactive analysis. Genom Proteom Bioinform. 2019;17(4):465–71. https://doi.org/10.1016/j.gpb.2018.03.008.
Acknowledgements
This work was funded by the Intramural Research Program of the National Institutes of Health, National Cancer Institute.
Availability and requirements
Project name: Glioma-BioDP.
Project home page: https://glioma-biodp.nci.nih.gov
Operating system(s): CentOS 7.
Other requirements: PHP 8.2 or higher.
License: No license needed.
Any restrictions to use by non-academics: open to all and no restrictions for non-academics.
Funding
Open Access: Funding for this article is provided by the Intramural Research Program of the National Institutes of Health, National Cancer Institute. Open Access funding provided by the National Institutes of Health (NIH).
Author information
Contributions
US and KC conceived the project. XD developed the web tool. SD performed background analysis of data. SD and HK performed case studies. US supervised the study. SD, XD, HK, EW, US and KC wrote the manuscript.
Ethics declarations
Ethics approval and consent to participate
Not applicable.
Consent for publication
Not applicable.
Competing interests
None declared.
Additional information
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
About this article
Cite this article
Deng, X., Das, S., Kaur, H. et al. Glioma-BioDP: database for visualization of molecular profiles to improve prognosis of brain cancer. BMC Med Genomics 16, 168 (2023). https://doi.org/10.1186/s12920-023-01593-w
DOI: https://doi.org/10.1186/s12920-023-01593-w