Acquisition of the tissue with CMT (T-CMT) and the tissue without CMT (T-control)
This study was approved by the Institutional Research Board of Ajou University Medical Center, Suwon, South Korea. Twenty-eight subjects (23 subjects with CMT and 5 subjects without CMT) were finally enrolled in this study. For the microarray or quantitative real-time PCR (QRT-PCR) and MRI analyses, 26 subjects with CMT originally participated. Myectomy was performed at the lower end of the SCM, about 1 centimeter above the clavicle. All 26 subjects who agreed to participate understood that the muscle blocks removed during surgery would be used for the analyses. Each muscle block from the 26 subjects was divided into T-CMT and T-control by the third author's visual assessment of gross appearance: the reddish part with a normal muscle appearance was considered T-control, and the whitish cord-like part was considered T-CMT. Because the muscle blocks of 4 subjects contained only T-CMT, those subjects were not enrolled. Another 4 subjects were excluded because of poor RNA quality. Therefore, 18 subjects were finally enrolled in these analyses. For the immunohistochemical (IHC) analysis, an additional 5 subjects with CMT were enrolled. Their muscle blocks were obtained by the same procedure described above. Independent normal SCM muscles for the IHC analysis were obtained from the Ajou Human Bio-Resource Bank, which provided the normal SCM muscle of 5 subjects who had undergone radical neck dissection for head and neck tumors.
RNA isolation from both the T-CMT and the T-control
The muscle tissue was stored in RNAlater RNA Stabilization Reagent (Applied Biosystems/Ambion, Austin, TX, USA) immediately after surgery to preserve the RNA. The total RNA was isolated from the tissue using TRIzol reagent (Invitrogen, Carlsbad, CA, USA) according to the manufacturer's instructions.
Genome-wide mRNA expression profiling using microarray
Seven paired microarray experiments (a total of 14 arrays) were performed using the Affymetrix GeneChip Human Gene 1.0 ST Array (Affymetrix, Santa Clara, CA, USA), which offers whole-transcript expression profiling. For each sample, 300 ng of total RNA was used as input into the Affymetrix procedure, as recommended by the manufacturer's protocol (http://www.affymetrix.com). The Robust Multiarray Averaging (RMA) method was used for microarray normalization and summarization. When multiple probes per gene were available, we averaged the values of the corresponding probes. We applied a quantile normalization method across samples.
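The two generic steps named above, averaging multiple probes per gene and quantile normalization across samples, can be illustrated with a minimal sketch. It is not the RMA implementation used in this study (RMA also includes background correction and median-polish summarization, which are omitted here). The function names, the probe-to-gene mapping, and the tie handling are assumptions made for illustration only.

```python
from collections import defaultdict
from statistics import mean


def average_probes_per_gene(probe_values, probe_to_gene):
    """Average, sample by sample, all probes that map to the same gene.

    probe_values:  dict probe_id -> list of expression values (one per sample)
    probe_to_gene: dict probe_id -> gene symbol (hypothetical annotation)
    """
    grouped = defaultdict(list)
    for probe_id, values in probe_values.items():
        grouped[probe_to_gene[probe_id]].append(values)
    # zip(*rows) walks the samples; mean() averages the probes of one gene
    return {gene: [mean(col) for col in zip(*rows)]
            for gene, rows in grouped.items()}


def quantile_normalize(matrix):
    """Quantile-normalize a genes x samples matrix (list of rows).

    Each sample (column) is forced onto the same reference distribution:
    the mean, across samples, of the values at each rank.
    Ties are broken by row order (a simplification, not averaged).
    """
    n_rows = len(matrix)
    columns = list(zip(*matrix))
    sorted_cols = [sorted(col) for col in columns]
    reference = [mean(sc[r] for sc in sorted_cols) for r in range(n_rows)]

    normalized_cols = []
    for col in columns:
        order = sorted(range(n_rows), key=lambda i: col[i])
        new_col = [0.0] * n_rows
        for rank, row_index in enumerate(order):
            new_col[row_index] = reference[rank]
        normalized_cols.append(new_col)
    # transpose back to genes x samples
    return [list(row) for row in zip(*normalized_cols)]
```

After normalization every sample has exactly the same set of values, only assigned to different genes, which is what makes expression levels comparable across arrays.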
Identification of the DEGs of CMT
The fold change of the expression level was calculated as follows:
where T-CMT* is the expression level of a gene in the T-CMT, and T-control† is the expression level of the same gene in the T-control. Genes were identified as DEGs when they met both of the following conditions: 1) a significant difference in expression between the T-CMT and the T-control (p < 0.05, Student's t test), and 2) a fold change of magnitude greater than 2 between the T-CMT and the T-control in more than half of the subjects (here, more than 3 subjects).
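The two-condition rule can be sketched as follows. This is only an illustration under stated assumptions, not the authors' code. It assumes that the fold change is a signed ratio on a linear (not log) scale, that the magnitude criterion ignores direction, and that the t-test p value is computed elsewhere and passed in. The function names and cut-off parameters are hypothetical.

```python
def signed_fold_change(cmt_level, control_level):
    """Signed fold change on a linear scale (an assumed convention).

    Returns the ratio cmt/control when expression is higher in T-CMT,
    and the negative reciprocal when it is lower (e.g. 0.25 -> -4.0).
    Both inputs must be positive.
    """
    ratio = cmt_level / control_level
    return ratio if ratio >= 1 else -1.0 / ratio


def is_deg(cmt_levels, control_levels, p_value, fc_cutoff=2.0, alpha=0.05):
    """Apply both conditions to one gene measured in paired samples.

    cmt_levels / control_levels: per-subject expression levels, same order.
    p_value: result of the t test, supplied by the caller (assumption).
    """
    folds = [signed_fold_change(c, k)
             for c, k in zip(cmt_levels, control_levels)]
    # condition 2: |fold change| > cutoff in more than half of the subjects
    n_large = sum(abs(f) > fc_cutoff for f in folds)
    enough_subjects = n_large > len(folds) / 2
    # condition 1: significant difference in expression
    return p_value < alpha and enough_subjects
```

With seven paired subjects, `len(folds) / 2` is 3.5, so at least four subjects must exceed the cut-off, which matches "more than 3".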
Examination of the discriminant power of the DEGs between the T-CMT and the T-control
Principal component analysis (PCA) was performed using MATLAB R2007a (MathWorks Inc., Natick, MA, USA). K-means clustering and hierarchical clustering were performed using the TM4 software [10].
Gene ontology enrichment analysis
We used the functional annotation tools DAVID (the Database for Annotation, Visualization and Integrated Discovery) [11] and GOEAST (the Gene Ontology Enrichment Analysis Software Toolkit) [12] to find enriched gene ontology terms among the identified DEGs.
Identification of the CMT-related protein network modules
We assembled the protein-protein interactions (PPIs) and protein-DNA interactions (PDIs) for Homo sapiens. For the PPIs, we used the data of Lee et al. [13] and the results of several recent genome-wide studies [14–17]. The PPIs consist of 80,970 interactions among 10,819 human proteins. For the PDIs, we extracted 1,539 interactions from the TRANSFAC database [18]. To discover the CMT-related PPIs or PDIs, the prepared PPIs and PDIs and the list of DEGs identified in this study were imported, with their fold changes, into Cytoscape (http://www.Cytoscape.org) [19]. MCODE was used to find the CMT-related protein network modules [20]. A network score was calculated based on the complexity and density of each sub-graph. A module with an MCODE score greater than 1 was considered significant. Post-filtering was performed to remove low-quality modules: through manual review, the part of each module that showed consistent expression and high connectivity was selected as the final module. Finally, modules that included at least one protein encoded by a DEG, together with other proteins whose association with CMT was previously known, were selected as CMT-related protein network modules. For the five selected CMT-related modules, network ontology analysis (NOA), published by Wang et al., which performs gene ontology analysis on network modules, was conducted to define the function of each module [21]. A GO term with a p-value less than 0.1 was considered significant.
Quantitative real-time PCR of DEGs
Genome-wide mRNA expression profiling by microarray was validated by QRT-PCR. We selected 8 of the 269 DEGs for validation. Seven were the DEGs that showed the largest differences in expression in terms of fold change and t-test p value in the microarray study. The other was S100A4, which was a component of one of the CMT-related protein network modules and played a key role as the first split point in the decision tree model discriminating T-CMT from T-control (Figure S1 in Additional file 1). The eight DEGs were as follows: thrombospondin 4 (THBS4), fibromodulin (FMOD), collagen, type XIV, alpha 1 (COL14A1), cathepsin K (CTSK), epidermal growth factor (EGF)-like repeats and discoidin I-like domains 3 (EDIL3), lysyl oxidase (LOX), secreted frizzled-related protein 4 (SFRP4), and S100 calcium binding protein A4 (S100A4). QRT-PCR was performed on 11 paired (T-CMT and T-control) muscle tissues (a total of 22 specimens) from 11 independent subjects. Glyceraldehyde-3-phosphate dehydrogenase (GAPDH) was used as an internal control. The R² of the linear regression analysis was used to evaluate the degree of correlation between the microarray and the QRT-PCR expression levels. See Table S1 in Additional file 1 for the primers used.
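For readers unfamiliar with the statistic, the coefficient of determination of a simple linear regression (one predictor, with intercept) equals the squared Pearson correlation between the two variables. The sketch below computes it that way. The function name is hypothetical, and the source does not state which software was used for this step.

```python
def r_squared(x, y):
    """R^2 of the least-squares line y = a + b*x.

    For simple linear regression with an intercept this equals the
    squared Pearson correlation coefficient between x and y.
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (>= 2)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    if sxx == 0 or syy == 0:
        raise ValueError("x and y must each vary")
    return (sxy * sxy) / (sxx * syy)
```

Here `x` would hold the microarray expression levels and `y` the QRT-PCR expression levels of the same gene across subjects. Values close to 1 indicate that the two platforms agree.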
Correlation between the expression levels of QRT-PCR and the color intensity of lesions on MRI
Eleven subjects with CMT were recruited for both the QRT-PCR study and the preoperative MRI study. The fold changes from the QRT-PCR described above were used as the gene expression levels. The difference in grey color intensity between the SCM with CMT (SCM-CMT) and the contralateral SCM (SCM-control) on the preoperative neck MRI was used as an indicator of radiological severity. Pathologically, CMT is interstitial fibrosis with or without aberrant tendon-like dense connective tissue. Fibrosis reduces the mobile proton (hydrogen ion) density, and so it appears as a darker grey color with a lower grey-scale value on both T1- and T2-weighted MR images. Thus, a darker grey color with a lower value indicates more fibrosis within the SCM-CMT. Using the region of interest (ROI) method, we measured the mean intensity of the SCM-CMT on the axial T1-weighted preoperative MR image that showed the lowest signal intensity. The mean intensity of the SCM-control was measured on the same axial T1-weighted image using the same ROI. The mean intensity of each SCM was divided by that of its corresponding SCM-control for normalization. Therefore, the difference between the SCM-CMT and the SCM-control was calculated as follows for the 11 independent subjects used in the QRT-PCR analysis:
Difference of grey color intensity = (the mean intensity of the SCM-control - the mean intensity of the SCM-CMT)/the mean intensity of the SCM-control.
Next, we compared the QRT-PCR expression levels of the 8 identified DEGs with the difference in grey color intensity for the 11 subjects. The R² of the linear regression analysis was used to evaluate the degree of correlation.
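As a worked illustration of the normalized grey-intensity difference defined above, the following sketch takes ROI pixel intensities for the two muscles and returns the difference. The function names and the pixel values in the usage note are hypothetical; only the formula itself comes from the text.

```python
def mean_intensity(roi_pixels):
    """Mean grey-scale intensity of the pixels inside one ROI."""
    if not roi_pixels:
        raise ValueError("ROI contains no pixels")
    return sum(roi_pixels) / len(roi_pixels)


def grey_intensity_difference(mean_cmt, mean_control):
    """(SCM-control mean - SCM-CMT mean) / SCM-control mean.

    A larger positive value means the affected muscle is darker than
    the contralateral muscle, i.e. more fibrosis by the stated reasoning.
    """
    if mean_control <= 0:
        raise ValueError("control intensity must be positive")
    return (mean_control - mean_cmt) / mean_control
```

For example, with hypothetical ROI means of 60 for the SCM-CMT and 100 for the SCM-control, the difference is (100 - 60) / 100 = 0.4. A value of 0 would mean that the two muscles have the same mean intensity.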
Immunohistochemical examination
For immunohistochemistry, 4-µm thick sections were cut from formalin-fixed, paraffin-embedded tissue blocks of the SCM muscles with CMT and the normal SCM. Sections were deparaffinized in xylene and rehydrated in graded alcohols, followed by antigen retrieval. The endogenous peroxidase activity was blocked with Hydrogen Peroxide Block (Thermo Fisher Scientific, Fremont, CA). Sections were then placed in an automated IHC stainer (Lab Vision Autostainer LV-1; Thermo Fisher Scientific, Fremont, CA) and incubated at 4℃ with primary antibodies against elastin, asporin, CHD3, tenascin, THBS4, and EDIL3 (Table S2 in Additional file 1). EDIL3 and ASPN were selected because they were the top 2 over-expressed DEGs. The remaining 4 proteins (ELN, CHD3, TNC, and THBS4) were selected from the CMT-related network modules. The primary antibodies were detected using the UltraVision LP Detection System (Thermo Fisher Scientific, Fremont, CA). The reaction products were developed with the Vector NovaRED® substrate kit for peroxidase (Vector Laboratories, Burlingame, CA) for 5 minutes, and hematoxylin counterstaining was then applied. The IHC staining was assessed in the CMT and normal SCM. The intensity of IHC staining was scored semi-quantitatively on a scale of 0 to 3: 0, no immune-expression; 1, weak immune-expression; 2, moderate immune-expression; 3, marked immune-expression. The Mann-Whitney test was used to test the significance of differences in IHC staining intensity between the CMT and normal SCM. P values less than 0.05 were considered statistically significant.
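To make the group comparison concrete, the sketch below computes the Mann-Whitney U statistic with midranks for tied scores and a two-sided exact permutation p value. The source does not say which variant of the test or which software was used, so the exact-permutation approach is an assumption; it is feasible here because two groups of 5 give only 252 possible splits. The function names and the scores in the usage note are hypothetical.

```python
from itertools import combinations


def midranks(values):
    """1-based ranks; tied values receive the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    return ranks


def mann_whitney_exact(group_a, group_b):
    """Return (U for group_a, two-sided exact permutation p value)."""
    pooled = list(group_a) + list(group_b)
    ranks = midranks(pooled)
    n_a, n_b = len(group_a), len(group_b)

    def u_statistic(indices):
        return sum(ranks[i] for i in indices) - n_a * (n_a + 1) / 2

    u_observed = u_statistic(range(n_a))
    center = n_a * n_b / 2  # expected U under the null hypothesis
    observed_deviation = abs(u_observed - center)

    total = extreme = 0
    # enumerate every way of assigning n_a of the pooled scores to group A
    for indices in combinations(range(len(pooled)), n_a):
        total += 1
        if abs(u_statistic(indices) - center) >= observed_deviation - 1e-12:
            extreme += 1
    return u_observed, extreme / total
```

With hypothetical scores of 3, 3, 2, 3, 2 for CMT and 0, 1, 0, 1, 1 for normal SCM, the groups are completely separated, U is 25, and the p value is 2/252 (about 0.008), below the 0.05 threshold.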