Gene regulatory network models to identify regulators of MITF
We developed a transcription factor (TF) MITF target gene regulatory network model. We wanted to identify TFs that best explain different MITF expression levels over a set of different cell lines. For this, we predicted the expression of the MITF gene in a sample j by minimizing the differences e
mitf,j between real MITF expression g
mitf,j and predicted expression \( {\tilde{g}}_{\mathrm{mitf},\mathrm{j}} \) realized by the constraints
$$ {g}_{\mathrm{mitf},\mathrm{j}}-{\tilde{g}}_{\mathrm{mitf},\mathrm{j}}-{e}_{\mathrm{mitf},\mathrm{j}}\le 0 $$
(1)
$$ -{g}_{\mathrm{mitf},\mathrm{j}}+{\tilde{g}}_{\mathrm{mitf},\mathrm{j}}-{e}_{\mathrm{mitf},\mathrm{j}}\le 0 $$
(2)
The predicted MITF expression was based on the linear equation
$$ {\tilde{g}}_{\mathrm{i},\mathrm{j}}={\beta}_0+{\displaystyle {\sum}_{\mathrm{t}=1}^T{\beta}_t\cdot {es}_{t,i}\cdot {eff}_{t,j}} $$
(3)
with an additive offset β0. T is the number of all TFs with available information on target genes, βt is the optimization parameter for TF t, and est,i was calculated from an integration of binding data in order to only account for TFs that are known or predicted to regulate MITF. est,i is equal to or greater than 1 if we had experimental and computational evidence of TF t binding to gene i using the databases Metacore™ (http://thomsonreuters.com/metacore/), ChEA [23], Encode (http://www.genome.gov/Encode/) and Transcription Factor Binding Affinity (TBA) [24] (details are given in the next section and at [22]). eff
t,j
is the effect of TF t in sample j and was calculated by the activity of a TF based on its cumulative effect on its target genes, i.e.
$$ {eff}_{tj}={act}_{tj}=\frac{{\displaystyle {\sum}_{\mathrm{i}=1}^n}{es}_{ti}\cdot {g}_{\mathrm{i}\mathrm{j}}}{{\displaystyle {\sum}_{\mathrm{i}=1}^n}{es}_{ti}} $$
(4)
With the use of a branch and cut based optimization program (Gurobi™ 5.5, http://www.gurobi.com/) to solve the Mixed Integer Linear Programming (MILP) problem, the β-coefficients were calculated in order to minimize the sum of differences between measured and predicted MITF expression for all samples (objective function). The MITF model was restricted to a defined number of regulators from the set of all putative regulators (19 TFs). We applied a bottom-up approach to identify the most important regulators of the model, starting with restricting the model to one regulator. Within each of the following runs, one additional regulator was added to the model. The optimizer selected independently in every run the best regulators in order to minimize the objective function. The prediction performance of each model was estimated by the correlation between real and predicted MITF expression in the test data (unseen data, not used for learning the model) based on a leave-one-out cross validation (LOO-CV). For details regarding the MILP model and activity definition see Schacht et al. [22].
Binding evidence
As described previously [22], we used several sources to assess TF binding information. From the database MetaCoreTM (http://thomsonreuters.com/metacore/) human TF-target gene interactions were selected, of both of the categories direct and indirect. Additionally, we used z-scores of the Total Binding Affinity (TBA) which are calculated TF binding profiles for the whole promoter based on position weight matrices [24, 25]. Moreover, human entries of the CHIP Enrichment Analysis (ChEA) database were used containing large data sets of high-throughput chromatin immunoprecipitation experiments [23]. At the date of analysis (July 2013) the ChEA database for man comprised of 83 transcription factors, 20,035 genes and 131,996 total entries. In addition, we used chromatin immunoprecipitation data from the ENCODE project (http://www.genome.gov/Encode/). We used binding information of cell lines for which the most comprehensive set of regulation information was available (Tier 1). Binding of a transcription factor to a target gene as listed in Encode, was scored as “1” or if absent, as “0”, respectively. Target genes occurring more than once, were combined in single rows containing consistent (intersecting) hits and transcription factors showing up multiple times were assembled into one column as the union of hits. Information on regulatory transcription factor/target gene interaction was considered reliable if (i) this pair was found in Metacore with the annotation “direct”, or if (ii) this pair was found in at least two of the datasets Metacore “indirect”, CheA, Encode and TBA with a value greater or equal to one. For these TF/target gene pairs, their putative regulatory interaction was denoted edge strength est,i between TF t and target gene i, and set to the number of occurrences of the specific TF/target gene combinations among the datasets CheA, Metacore “direct” activation, Metacore “direct inhibition”, Metacore “indirect activation” and Metacore “indirect inhibition”. TBA values greater or equal to one were added to the edge strength. For all TF/target gene pairs missing criteria (i) or (ii), the edge strength was set to zero, i.e. this TF/target gene pair was not considered by our prediction algorithms. The binding information of SOX5/MITF interaction was taken from Metacore where it was annotated as “direct inhibition”. In addition, the z-score of TBA of SOX5 binding to the human MITF promoter was strongly positive (z = 1.5, see Additional file 1: Figure S1).
Gene expression data
To identify prominent transcription factors of MITF with our regulatory network model, we used the gene expression profiles of 59 cancer cell lines from the National Cancer Institute (NCI-60 panel), which comprises 60 cancer cell lines from nine different cancer types (breast, central nervous system, colon, kidney, leukemia, lung, melanoma, ovary and prostate). The data were downloaded from CellMiner and based on an integration of five different microarray platforms (5-Platform, Affymetrix HG-U95, HG-U133, HG-U133 Plus 2.0, GH Exon 1.0 ST, and Agilent WHG) yielding a z-score for each gene of each sample (details, see [26]). Missing values were replaced by the mean expression values of the according genes. The cell line SF 539 was excluded from our analysis because of a large number (N = 10,404) of undefined entries. Subsequently, we continued the analysis of MITF’s TFs on a second, independent dataset, to see whether our findings are consistent and reproducible. Therefore, we used gene expression data from melanoma cells taken from a study by Hoek et al. [16, 17]. In brief, melanoma cells were released from tissue sections of melanoma metastases. Cells were cultured, total RNA was extracted, labeled and their transcriptome profiled using Affymetrix HG-U133 plus 2.0 oligonucleotide microarrays. Raw intensity signals were normalized employing Affymetrix MAS 5.0. Values below 0.01 were set to 0.01 and each value was divided by the 50th percentile of all values in that sample. Each expression value was divided by the median of its values in all samples. Finally, expression values were z-normalized for each gene. For our analysis, we used expression data from 33 samples from the Mannheim cohort of the study by Hoek and coworkers (subsequently denoted as the Mannheim cohort). Cell lines from this panel were also used for our in vitro experiments. For inferring clinical and expression data, we used skin cutaneous melanoma (SKCM) samples from the Cancer Genome Atlas (TCGA; http://cancergenome.nih.gov/). Clinical as well as MITF, SOX5 and SOX10 mRNA expression (RNA Seq V2 RSEM) data were downloaded from the cBio portal (http://www.cbioportal.org/). The SKCM expression data were z-normalized. For the comparison of expression levels between non-survived and survived subgroups Wilcoxon rank sum tests were applied, because the distribution of the expression levels was not normally distributed. All data sets used are publically available.
Cell culture
Five melanoma cell lines used in the Hoek and coworkers analysis [16, 17] MaMel-122, MaMel-86b, MaMel-61e and MaMel-79b (own laboratory) as well as A375 purchased from ATCC were cultured at 37 °C and 5 % CO2 in RPMI 1640 medium (Gibco, Carlsbad, CA, USA) + 10 % FCS in general without antibiotics. MaMel-122-pMITF-GFP was cultured in medium containing 0.5 μg/ml puromycin (Sigma-Aldrich, Steinheim, Germany). These cell lines were chosen because they exhibit substantial expression of MITF, SOX5 and SOX10.
siRNA transfection and qRT-PCR
To investigate the effects of SOX5 and SOX10 on MITF expression levels, Ambion® Silencer® Select Pre-designed (Inventoried) siRNAs (Life Technologies, Carlsbad, CA, USA) were utilized to knock down these transcription factors. For the knock-down of SOX5 or SOX10 siRNA s13303 (Antisense sequence, no overhangs: UCCUUUCACACCGUAAGUG) and siRNA s13308 (Antisense, no overhangs: UCCUUCUUCAGAUCGGGCU) were used, respectively.
For validation of siRNA mediated knock-down effects and to diminish off-target effects, defined, high complexity SOX5 and control siRNA pools consisting of 30 individual siRNAs each (siTOOLs Biotech, Planegg, Germany) [27] were included. Melanoma cells were seeded in 12-well plates and cultured for 24 h to reach 70–80 % confluency. Transfections were performed according to the manufacturer’s instructions (DharmaFect transfection reagent; GE Healthcare, Little Chalfont, United Kingdom) using 25 nM single siRNAs and 10 nM for siRNA pools. Forty eight h post transfection, cell pellets were collected and stored at -80 °C until RNA was isolated with the miRNeasy Mini Kit (Qiagen, Hilden, Germany) according to the manufacturer’s protocol. Reverse transcription was performed using the Transcriptor First Strand cDNA Synthesis Kit (Roche Applied Science, Mannheim, Germany). 500 ng of total RNA was reverse transcribed in a 20 μl reaction utilizing oligo(dT) primers. The cDNA was diluted (1:5) with PCR-quality water (Sigma-Aldrich, Steinheim, Germany) and 2 μl of the cDNA dilution was used for qRT-PCR in a 20 μl reaction using the TaqMan® Universal PCR Mastermix (Applied Biosystems, Foster City, CA, USA). The qRT-PCR was performed for MITF, SOX5 or SOX10 as the gene of interest (GOI) and glyceraldehyde-3-phosphate dehydrogenase (GAPDH) as the housekeeping gene (HK) using the TaqMan probes HS01117294_m1 MITF, HS00753050_s1 SOX5, HS00366918_m1 SOX10 (Life Technologies, Carlsbad, CA, USA) and HuGAPDH (Applied Biosystems, Foster City, CA, USA), respectively. The qRT-PCRs were run on an Applied Biosystems 7300 Real Time PCR system. For all samples, three technical replicates were performed for both MITF and GAPDH. Median Ct values for MITF and GAPDH were calculated based on three technical replicates for the different samples. ∆Ct values were calculated according to
$$ \varDelta Ct=C{t}_{MITF}-C{t}_{GAPDH} $$
(5)
to normalize the MITF level to the control (GAPDH). ∆∆Ct values were calculated according to
$$ \varDelta \varDelta Ct=C{t}_{transfected}-C{t}_{control} $$
(6)
to normalize the sample transfected with siRNA against SOX5 or SOX10 mRNA to the control condition. Finally, the fold change was calculated according to
$$ Foldchange={2}^{-\varDelta \varDelta Ct} $$
(7)
MITF-promoter reporter assay
Stable transfection of MaMel-122 cells was performed to generate a cell line that expresses the green fluorescence protein (GFP) gene downstream of the MITF promoter.
The human MITF promoter was amplified from the plasmid pMI, kindly provided by Dr. Ballotti [28], with the following primers:
The amplified promoter was cloned into pLenti CMV GFP Puro (pLenti CMV GFP Puro (658-5) was a gift from Eric Campeau; Addgene plasmid # 17448) [29]. For this purpose, the CMV promoter was cut from pLenti CMV GFP Puro with ClaI and XbaI and the MITF promoter was introduced at the same position. A plasmid map of the used vector MITFP-pLenti can be found in the supplement (Additional file 1: Figure S2). Functional validation of the vector was performed in primary human melanocytes in comparison to human fibroblasts (Additional file 1: Figure S3). Melanocytes and fibroblasts were isolated following standard protocols from skin remainings after operations such as foreskins after circumcisions of healthy donors. Successfully MITFP-pLenti transfected cells were positively selected using 0.5 μg/ml puromycin.
The generated cell line was denoted by MaMel-122-pMITF and was constantly kept under selective pressure. MaMel-122-pMITF was used to investigate the role of SOX10 and SOX5 in regulating MITF at the transcriptional level. siRNA transfection experiments were performed analogous to the qRT-PCR analyses. 1 · 105 cells were seeded in 24-well plates and cultured for 24 h. Then, the wells were transfected with either SOX10 siRNA, SOX5 siRNA, non-targeting control siRNA or a mixture of both, i.e. SOX10 and SOX5 siRNA. The final concentration of each siRNA per well was 25 nM. The cells were harvested 72 h after transfection. The cells were detached from each well with 50 μl trypsin and resuspended in 150 μl of medium. After centrifugation, the cell pellets were washed once with 200 μl PBS and three times with 1 ml ice cold FACS buffer. Finally, the pellets were dissolved in 200 μl FACS-buffer and fluorescence measurements were performed using a BD FACSCaliburTM (BD Biosciences) flow cytometer using channel FL-1 to detect GFP. Unstained MaMel-122 cells were included in each individual measurement as a negative control. The analysis of the flow cytometry data was conducted using FlowJo version 9.6.4 (http://www.flowjo.com/).
Proliferation assay
The effect of SOX5 on cell viability was assessed using CellTiter-Glo Luminescent Cell Viability Assay (Promega, Fitchburg, WI, USA) after transfection with 10 nM control or SOX5 siRNA pools. 1 x 104 cells (fast growing) and 2 x 104 cells (slow growing) cells were seeded per well in 96 well black/clear flat bottom plates (Corning, Corning NY). Viability was measured according to the manufactures instructions 24, 48 and 72 h after transfection. Three biological replicates were performed for each condition.
Invasion assay
Invasion assays were performed 48 h after transfection with SOX5 or control siRNA pool (10 nM) in 24 well plate format. Therefore, 5 x 104 cells resuspended in 50 μl serum-free medium (three technical replicates) were pipetted into the upper insert of a 96 well transwell plate (Corning, Corning NY) coated with 50 ng matrigel/well. The lower chambers were filled with 150 μl medium + 10 % FCS as a chemoattractant. After 24 h, invaded cells were detached from the membrane, washed, stained with calcein AM (Thermo Fisher Scientific, Waltham, MA) and analyzed with a fluorometer according to the manufactures protocol.
Statistical analysis
Statistical significance was calculated using the one-sided two-sample Student’s t-test and a Wilcoxon rank sum test was performed for non-normally distributed populations (comparing the distribution of SOX5 expression between different subgroups of SKCM data (survived vs. non-survived, thin vs. thick)). P-values less than 0.05 were considered statistically significant. For the Kaplan-Meier analysis the cutoffs for low and high expression were determined by a 10-fold cross-validation approach using the R-package maxstat [30]. The median cutoff was used to classify samples into low and high expression subgroups and nonparametric log-rank tests were used to assess significance. All statistical analyses were performed using R version 3.0.1 (http://www.r-project.org/) and Microsoft Excel 2013.
Prediction of Breslow thickness
A linear model consisting of the expression for the three transcription factors SOX5, MITF and SOX10 was optimized by minimizing the differences between measured Breslow thickness t
j
for sample j and predicted Breslow thickness \( \widetilde {t_{\mathrm{j}}} \) via minimizing the sum of error terms e
j
:
$$ {\displaystyle {\sum}_{j=1}^l\kern0.5em \mid {t}_j-{\tilde{t}}_j\mid \kern0.5em =}\kern0.5em {\displaystyle {\sum}_{j=1}^l{e}_j} $$
(8)
The Breslow thickness was predicted for melanoma patient sample j using the linear model
$$ {t}_j={\beta}_0+{\beta}_{SOX5}\cdot ef{f}_{SOX5,j}+{\beta}_{SOX10}\cdot ef{f}_{SOX10,j}+{\beta}_{MITF}\cdot ef{f}_{MITF,j} $$
(9)
with β
0
as an additive offset, β
TF
as the optimization parameter for the TF (SOX5/SOX10/MITF) and eff
TF,j
as the estimated effect of a TF in sample j. As effect eff
Tf,j
, we used the gene expression of the TF in sample j.