Intrinsic molecular stratification of human colorectal cancer. Unsupervised analysis and hierarchical clustering of global gene expression data from colorectal cancer cases identified two major "intrinsic" subclasses (cyan and magenta), distinguished by the first principal component (PC1) of the most variable genes. These two subclasses were clearly identified in both (a) the Moffitt Cancer Center (MCC) dataset (n = 326) and (b) the EXPO dataset (n = 269). PC1 was later found to be tightly correlated with an EMT signature derived from cell lines, which provides a biological explanation for the two intrinsic classes in both datasets; the subclasses were subsequently identified as epithelial and mesenchymal. In panels (a) and (b), mean-centered probe intensities are shown, rows represent samples, columns represent array probes, and probes are clustered using Pearson correlation-based distance and Ward linkage. Panel (c) shows a scatter plot of the EMT signature score against the PC1 score in the MCC dataset. Panel (d) shows a scatter plot of probe intensities for vimentin (VIM) and E-cadherin in a panel of 93 lung cancer cell lines. Cell lines exhibiting an epithelial-like phenotype are shown in green; those exhibiting a mesenchymal-like phenotype are shown in red.
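
The PC1 and EMT-score comparison described for panel (c) can be illustrated with a minimal sketch on synthetic data. Everything in it is an assumption made for illustration: the toy expression matrix, the helper names (`top_variable_columns`, `mean_center`, `first_pc_scores`, `pearson`), the choice of power iteration to obtain PC1, and the simple EMT score (mean of "mesenchymal" probes minus mean of "epithelial" probes). It is not the authors' pipeline, and the hierarchical clustering step (Pearson correlation-based distance, Ward linkage) is not reproduced here.

```python
import math
import random


def top_variable_columns(matrix, k):
    """Return the sorted indices of the k columns (probes) with highest variance."""
    n = len(matrix)
    variances = []
    for j, col in enumerate(zip(*matrix)):
        mu = sum(col) / n
        variances.append((sum((x - mu) ** 2 for x in col) / (n - 1), j))
    return sorted(j for _, j in sorted(variances, reverse=True)[:k])


def mean_center(matrix):
    """Subtract each column (probe) mean; rows are samples."""
    n = len(matrix)
    means = [sum(col) / n for col in zip(*matrix)]
    return [[x - m for x, m in zip(row, means)] for row in matrix]


def first_pc_scores(centered, iters=200, seed=0):
    """PC1 sample scores via power iteration on X^T X (X must be mean-centered)."""
    rng = random.Random(seed)
    p = len(centered[0])
    v = [rng.gauss(0.0, 1.0) for _ in range(p)]
    for _ in range(iters):
        scores = [sum(x * w for x, w in zip(row, v)) for row in centered]  # X v
        v = [sum(s * row[j] for s, row in zip(scores, centered)) for j in range(p)]  # X^T X v
        norm = math.sqrt(sum(w * w for w in v))
        v = [w / norm for w in v]
    return [sum(x * w for x, w in zip(row, v)) for row in centered]


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Synthetic toy data (assumption): 20 samples x 6 probes.
# Probes 0-2 mimic mesenchymal markers, 3-4 epithelial markers, 5 is noise.
rng = random.Random(42)
expression = []
for i in range(20):
    shift = -1.0 if i < 10 else 1.0  # first 10 "epithelial-like", last 10 "mesenchymal-like"
    row = [8 + 2 * shift + rng.gauss(0, 0.3) for _ in range(3)]
    row += [8 - 2 * shift + rng.gauss(0, 0.3) for _ in range(2)]
    row.append(8 + rng.gauss(0, 0.3))
    expression.append(row)

# Keep the most variable probes, mean-center them, and compute PC1 scores.
selected = top_variable_columns(expression, k=5)
subset = [[row[j] for j in selected] for row in expression]
pc1 = first_pc_scores(mean_center(subset))

# Hypothetical EMT score: mesenchymal probe mean minus epithelial probe mean.
emt_score = [sum(row[0:3]) / 3 - sum(row[3:5]) / 2 for row in expression]

# The sign of a principal component is arbitrary, so compare |r|.
r = pearson(pc1, emt_score)
subclass = [score > 0 for score in pc1]  # split samples by the sign of PC1
print(f"|r| between PC1 and EMT score: {abs(r):.3f}")
```

Because the sign of a principal component is arbitrary, only the magnitude of the correlation and the grouping of samples are meaningful; which side of zero corresponds to "mesenchymal" has to be read off the marker probes.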