  • Research article
  • Open access

Discriminating lymphomas and reactive lymphadenopathy in lymph node biopsies by gene expression profiling

Background

Diagnostic accuracy in lymphoma, a heterogeneous cancer, is essential for patient management. Several ancillary tests, including immunophenotyping and sometimes cytogenetics and PCR, are required to aid histological diagnosis. In this proof-of-principle study, gene expression microarray was evaluated as a single platform test in the differential diagnosis of common lymphoma subtypes and reactive lymphadenopathy (RL) in lymph node biopsies.
Methods

A total of 116 lymph node biopsies diagnosed as RL, classical Hodgkin lymphoma (cHL), diffuse large B cell lymphoma (DLBCL) or follicular lymphoma (FL) were assayed by mRNA microarray. Three supervised classification strategies (global multi-class, local binary-class and global binary-class classifications) using diagonal linear discriminant analysis were performed on training sets of array data, and the classification error rates were calculated by leave-one-out cross-validation. The independent error rate was then evaluated by testing the identified gene classifiers on an independent (test) set of array data.
Results

The binary classifications provided prediction accuracies, between a subtype of interest and the remaining samples, of 88.5%, 82.8%, 82.8% and 80.0% for FL, cHL, DLBCL and RL respectively. Identified gene classifiers include LIM domain only-2 (LMO2), chemokine (C-C motif) ligand 22 (CCL22) and cyclin-dependent kinase inhibitor-3 (CDKN3), specific for the FL, cHL and DLBCL subtypes respectively.
Conclusions

This study highlights the ability of gene expression profiling to distinguish lymphoma from reactive conditions and classify the major subtypes of lymphoma in a diagnostic setting. A cost-effective single platform "mini-chip" assay could, in principle, be developed to aid the quick diagnosis of lymph node biopsies with the potential to incorporate other pathological entities into such an assay.

Background

The increasing complexity of lymphoma diagnosis and classification is the result of a vastly improved understanding of its underlying molecular pathogenesis and resultant attempts to group subtypes of lymphoma in a clinically and biologically meaningful manner. From a practical perspective, this increasing complexity places great demands on the pathologist. Accurate diagnosis and classification of lymphoma in lymph node biopsies requires pathological evaluation utilising morphological analysis of an acceptable biopsy specimen, together with a series of supplementary tests including immunophenotyping by immunohistochemistry and flow cytometry, and increasingly cytogenetics, FISH and PCR data [1]. Such ancillary tests can be costly and time consuming, requiring specialised technicians and analytical experience from multiple divisions of a pathology laboratory. Of benefit would be a cost-effective, single platform ancillary test that provides a rapid standardised diagnosis of lymphoma and recognition of major subtypes, allowing more selective use of other ancillary tests during subsequent assessment by the pathologist.

Genome-wide gene expression profiling (GEP) is a novel approach to disease classification based on the molecular biology of the disease. This 'genetic fingerprint' data allows individual tissue samples to be identified and classified according to their distinct gene expression profiles. A significant body of research has employed GEP in lymphoma for a number of purposes, including distinguishing closely related lymphoma phenotypes such as diffuse large B-cell lymphoma (DLBCL) versus Burkitt lymphoma [2] and DLBCL versus primary mediastinal B-cell lymphoma [3], tracing malignant lymphoma phenotypes to a "normal cell of origin" [4] and identifying expression profiles linked to tumour prognosis [5]. However, there is a lack of publications that address the potential value of gene expression microarray in aiding the routine diagnosis and classification of lymphoma in tissue biopsies from individual cases suspected of lymphoma. The feasibility of GEP as a tool to classify tumour tissue has been examined for other cancers, such as breast [6], colon [7], prostate [8] and renal tumours [9].

Distinct from previous microarray studies of lymphoma, this single-institute study evaluated whether gene expression microarray as a single platform could be used to distinguish three major subtypes of lymphoma and non-malignant reactive lymphadenopathy (RL) in individual lymph node samples. To our knowledge, this study is a first attempt to apply such a strategy to lymph node specimens across different subtype diagnoses in a diagnostic setting. Our heterogeneous study set enabled the identification of gene signatures that are likely to be an accurate representation of each diagnostic type, given that each signature was determined by comparing one diagnostic type against the remaining cases using a binary classification approach.
Methods

Patient samples

Lymph node specimens from patients undergoing biopsy for suspected lymphoma were identified by the Department of Pathology at St Vincent's Hospital, Sydney. Samples were collected subject to written consent for this human research ethics approved study (H00/028/1). A portion of the fresh biopsy specimen (≥2 mm³) was collected in RNAlater solution (Ambion, Foster City, CA) and then cryopreserved in liquid nitrogen. A further five biopsy samples (two cHL, two DLBCL, one FL) were obtained from the Newcastle Mater Misericordiae Hospital. The diagnosis and classification of each specimen was made or reviewed by an expert haematopathologist (JT) practising at our centre [10], according to the WHO Classification of Tumours of Haematopoietic and Lymphoid tissue 2001 [11]. All samples were collected during 2001-2007 and analysed prior to the 2008 update of this classification system. Most cases (n = 16) of DLBCL were subclassified into germinal centre B (GCB) and non-GCB cell types (n = 5 and n = 11 respectively) by immunohistochemistry according to the Hans algorithm [12]. The 23 cases of RL included reactive hyperplasia (n = 16), reactive hyperplasia with progressive germinal centre transformation (n = 3) or granuloma related to toxoplasmosis (n = 1), dermatopathic lymphadenitis (n = 1) and normal (n = 2).

RNA and Microarray assays

Frozen lymph node tissue was homogenised in TRIzol reagent (Invitrogen, Victoria, Australia) using a pellet pestle, and total RNA was isolated using RNeasy micro-column purification (Qiagen, Doncaster, Australia). The integrity of total RNA was assessed by denaturing agarose gel electrophoresis (1% agarose, 221 mM formaldehyde, 20 mM MOPS, 5 mM sodium acetate, 1 mM EDTA). Only samples with distinct 28S and 18S RNA bands were assayed by microarray. The total RNA extracted from granulocyte colony-stimulating factor mobilised peripheral blood stem cell (PBSC) samples from 10 healthy individuals (collected with informed consent) was pooled and used as the reference RNA for microarray assays. Two-colour fluorescent probe synthesis was carried out on 2 μg of patient (Cy5 - Red) and reference (Cy3 - Green) RNA, which were then competitively hybridised to a microarray using the 3DNA 900MPX kit (Genisphere, Hatfield, PA) according to the manufacturer's protocol. A GenePix 4000A scanner and GenePix Pro 3.0 image analysis software (Molecular Devices, Sunnyvale, CA) were used to capture microarray images and quantify fluorescent signals from each feature. The microarrays used in this study were printed by the Adelaide Microarray Facility (Adelaide, Australia) using the Compugen library of 19000 70-mer oligonucleotides, which covers over 12000 generic human genes.

Preprocessing of array data

Data from GenePix result files were pre-processed by within-array print-tip Lowess normalisation. The quality of each array was assessed prior to analysis to ensure only arrays of sufficient quality were retained. A quality score was obtained using the QC CV scoring from arrayQuality [13]. The microarray data with clinical information have been deposited in NCBI's Gene Expression Omnibus (GEO) and are accessible through GEO Series accession number GSE23647. This study comprises two batches of arrays developed over two years. The batches were analysed and normalised separately to maintain the independence of the two datasets.
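The quantities involved in within-array normalisation of two-colour data can be illustrated with a minimal sketch. It computes the usual log-ratio M and average log-intensity A per spot and then centres M within each print-tip group. Note the simplification: subtracting a per-group median is only a stand-in for fitting a Lowess curve of M against A within each print-tip group, and is not the authors' implementation. The function names and the toy intensities are hypothetical.

```python
import math
from collections import defaultdict
from statistics import median

def log_ratios(red, green):
    """Return (M, A) per spot: M = log2(R/G), A = 0.5 * log2(R*G)."""
    m = [math.log2(r / g) for r, g in zip(red, green)]
    a = [0.5 * math.log2(r * g) for r, g in zip(red, green)]
    return m, a

def centre_by_print_tip(m, tips):
    """Subtract each print-tip group's median M.

    Simplified stand-in for print-tip Lowess, which would instead
    subtract a smooth curve fitted to M as a function of A.
    """
    groups = defaultdict(list)
    for value, tip in zip(m, tips):
        groups[tip].append(value)
    group_median = {tip: median(values) for tip, values in groups.items()}
    return [value - group_median[tip] for value, tip in zip(m, tips)]

# Toy data (invented): Cy5 (patient) and Cy3 (reference) intensities,
# with the print-tip group of each spot.
red = [200.0, 400.0, 800.0, 100.0, 50.0, 400.0]
green = [100.0, 100.0, 100.0, 100.0, 100.0, 100.0]
tips = [1, 1, 1, 2, 2, 2]

m, a = log_ratios(red, green)
m_norm = centre_by_print_tip(m, tips)
```

In practice an intensity-dependent fit matters because dye bias in two-colour arrays typically varies with spot intensity, which a single median per group cannot capture.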

GEP classification analysis

The statistical analysis was performed using the R statistical software version 2.8.1. Microarray gene expression data from 81 patients of batch-1 arrays (training set) were used to develop a diagnostic profile. An independent series consisting of a further 35 patients (batch-2 arrays - test set) was used to independently assess the classification accuracy of the profile. The patient information, diagnosis and number of samples used in the training and test datasets of this study are indicated in Table 1.

Table 1 Summary of the biopsies in each disease category examined by microarray

The ability of GEP to ascertain the correct diagnosis of each biopsy was assessed via diagonal linear discriminant analysis (DLDA), with classification error rates in the training set determined by leave-one-out cross-validation (LOOCV). The ratio of between sum of squares to within sum of squares (bss/wss) criterion was used for feature selection, which was performed within each CV fold. As a measure of discriminative power in two-class classification, the selective use of features ranked high in bss/wss enriches for potential biomarkers of interest. Sets of the top-ranked bss/wss genes, ranging in size from 10 to 500 (in increments of 10), were assessed within the classification development to identify the number of genes required to obtain a minimal (optimal) cross-validation error rate (see Additional file 1). The classification power of the determined optimal set of genes was then tested on the independent test set. Firstly, the results for each classification built from training datasets are expressed in terms of a classification accuracy rate (%), which represents the similarity between the pathological clinical diagnosis and the microarray diagnosis [14]. The accuracy rate of training datasets was determined by subtracting the LOOCV error rate (%) from 100%. A separate dataset was then used to obtain independent error and accuracy rates. A DLDA classification rule was constructed from the complete training set data using the optimal number of genes estimated via the LOOCV stage of the analysis. This classification rule was then used to classify the independent data. The independent test dataset accuracy rate denotes the percentage of samples in the test dataset that were correctly diagnosed using the molecular signatures identified from classification of the training dataset.
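The procedure described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the authors' R code: it assumes the standard form of the bss/wss ratio and a DLDA rule that assigns a sample to the class whose mean is nearest after scaling each gene by its pooled within-class variance. All function names and the toy expression values are hypothetical. The point it demonstrates is that gene ranking is repeated inside every leave-one-out fold, so the held-out sample never influences feature selection.

```python
from collections import defaultdict

def class_means(X, y):
    """Per-class mean of every gene; X is a list of samples x genes."""
    groups = defaultdict(list)
    for row, label in zip(X, y):
        groups[label].append(row)
    return {k: [sum(col) / len(rows) for col in zip(*rows)]
            for k, rows in groups.items()}

def bss_wss(X, y):
    """Between- to within-class sum-of-squares ratio for each gene."""
    means = class_means(X, y)
    counts = {k: y.count(k) for k in means}
    overall = [sum(col) / len(X) for col in zip(*X)]
    ratios = []
    for j in range(len(X[0])):
        bss = sum(counts[k] * (means[k][j] - overall[j]) ** 2 for k in means)
        wss = sum((row[j] - means[lab][j]) ** 2 for row, lab in zip(X, y))
        ratios.append(bss / wss if wss > 0 else float("inf"))
    return ratios

def fit_dlda(X, y, genes):
    """Class means plus pooled within-class variance for the chosen genes."""
    means = class_means(X, y)
    dof = len(X) - len(means)
    var = {}
    for j in genes:
        wss = sum((row[j] - means[lab][j]) ** 2 for row, lab in zip(X, y))
        var[j] = max(wss / dof, 1e-12)  # guard against zero variance
    return means, var, genes

def predict_dlda(model, x):
    """Assign x to the class with the smallest variance-scaled distance."""
    means, var, genes = model
    def score(k):
        return sum((x[j] - means[k][j]) ** 2 / var[j] for j in genes)
    return min(sorted(means), key=score)

def loocv_accuracy(X, y, n_genes):
    """LOOCV accuracy (%) with gene selection redone inside each fold."""
    correct = 0
    for i in range(len(X)):
        X_tr, y_tr = X[:i] + X[i + 1:], y[:i] + y[i + 1:]
        ratios = bss_wss(X_tr, y_tr)
        genes = sorted(range(len(ratios)), key=lambda j: -ratios[j])[:n_genes]
        correct += predict_dlda(fit_dlda(X_tr, y_tr, genes), X[i]) == y[i]
    return 100.0 * correct / len(X)

# Toy data (invented): 8 samples x 3 genes; only gene 0 separates the classes.
X = [[5.0, 1.0, 0.2], [5.2, 0.9, -0.1], [4.8, 1.1, 0.1], [5.1, 1.0, -0.2],
     [1.0, 1.0, 0.0], [1.2, 0.8, 0.3], [0.9, 1.2, -0.3], [1.1, 1.1, 0.1]]
y = ["FL"] * 4 + ["other"] * 4

ratios = bss_wss(X, y)
accuracy = loocv_accuracy(X, y, n_genes=1)
```

To mirror the search over classifier sizes, `loocv_accuracy` would be evaluated for each candidate number of genes (10, 20, ..., 500) and the size with the lowest cross-validation error retained before fitting the final rule on the complete training set.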

To examine the feasibility of GEP to classify RL versus lymphoma and subtypes of lymphoma, the expression data was analysed according to three different approaches:

  1. A global (all data) multi-class strategy was performed to classify the four main classes examined in this study (RL, cHL, FL and DLBCL) in a single step.

  2. A series of independent local (selected data) pair-wise (binary-class) comparisons of the four main classes examined was made. Comparisons included lymphoma versus RL, cHL versus NHL (inclusive of 3 and 17 cases respectively of rare T- and B cell lymphoma subtypes), and lastly, FL versus DLBCL, the two most prevalent forms of NHL in Caucasian populations [10]. These will be referred to as the 'local binary comparisons'.

  3. A number of 'global binary comparisons' were performed by pair-wise comparisons of samples from an individual subtype versus the remaining data.

For all classification strategies, both LOOCV and independent test set accuracy rates were determined as mentioned above.
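The difference between the local and global binary strategies lies in how class labels are built before classification, which the following sketch tries to make concrete. It is an illustration under stated assumptions: the helper names, the "other NHL" label and the toy diagnoses are hypothetical and do not come from the study data.

```python
def local_binary(labels, group_a, group_b, name_a, name_b):
    """Keep only samples in either group and relabel them as two classes.

    Returns (sample_index, new_label) pairs; samples in neither group
    are dropped, which is what makes the comparison 'local'.
    """
    selected = []
    for i, lab in enumerate(labels):
        if lab in group_a:
            selected.append((i, name_a))
        elif lab in group_b:
            selected.append((i, name_b))
    return selected

def global_binary(labels, target):
    """One subtype versus all remaining samples; no sample is dropped."""
    return [lab if lab == target else "rest" for lab in labels]

def accuracy_rate(predicted, truth):
    """Percentage of samples whose predicted label matches the diagnosis."""
    return 100.0 * sum(p == t for p, t in zip(predicted, truth)) / len(truth)

# Toy diagnoses (invented).
labels = ["RL", "cHL", "FL", "DLBCL", "FL", "other NHL"]

chl_vs_nhl = local_binary(labels, {"cHL"}, {"FL", "DLBCL", "other NHL"},
                          "cHL", "NHL")
fl_vs_rest = global_binary(labels, "FL")
```

The accuracy rate defined here equals 100% minus the error rate, matching the convention used for both the LOOCV and the independent test set results.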

Heat maps

Heat maps of the set of genes (classifiers) yielding the optimal LOOCV accuracy rates were produced. Clustering was performed for both samples and genes utilising hierarchical clustering with Euclidean distance as the dissimilarity function with complete linkage agglomeration. Dendrograms are displayed on the appropriate axis of the heat map.
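For readers unfamiliar with the clustering settings named above, the following is a small, unoptimised sketch of agglomerative clustering with Euclidean distance and complete linkage. It is meant only to show what the two choices mean; the function names and toy profiles are hypothetical, and a real analysis would use an established statistical package.

```python
import math

def euclidean(u, v):
    """Euclidean distance between two expression profiles."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def complete_linkage(rows):
    """Agglomerative clustering with complete linkage.

    Returns the merge history as (cluster_1, cluster_2, height) tuples,
    where clusters are lists of row indices; this is the information a
    dendrogram displays.
    """
    clusters = [[i] for i in range(len(rows))]
    merges = []
    while len(clusters) > 1:
        best = None
        for p in range(len(clusters)):
            for q in range(p + 1, len(clusters)):
                # Complete linkage: distance between the two farthest members.
                d = max(euclidean(rows[i], rows[j])
                        for i in clusters[p] for j in clusters[q])
                if best is None or d < best[0]:
                    best = (d, p, q)
        d, p, q = best
        merges.append((clusters[p], clusters[q], d))
        merged = clusters[p] + clusters[q]
        clusters = [c for k, c in enumerate(clusters) if k not in (p, q)]
        clusters.append(merged)
    return merges

# Toy profiles (invented): two tight pairs that are far from each other.
profiles = [[0.0, 0.0], [0.0, 1.0], [5.0, 0.0], [5.0, 1.0]]
merges = complete_linkage(profiles)
```

For a heat map, the same routine would be run twice: once on the sample profiles (columns) and once on the gene profiles (rows), giving one dendrogram per axis.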
Results

Microarray analysis of lymph node biopsies

Microarrays quantifying gene expression in lymph node biopsies (n = 142) suspected of lymphoma were assessed for quality using the arrayQuality package. Arrays with a CV QC score of >1 were identified as sub-optimal in quality and subsequently removed from analysis to result in a final total of 116 arrays analysed. A summary of the diagnosis of each of these biopsies and number of each subtype examined is shown in Table 1.
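The quality filter and the resulting exclusion rate can be reproduced with simple arithmetic. In the sketch below the threshold of 1 and the array counts (142 arrayed, 116 retained) come from the text; the function names, array identifiers and individual QC scores are hypothetical.

```python
def filter_arrays(qc_scores, threshold=1.0):
    """Split arrays into kept and excluded; a score above threshold is excluded."""
    kept = [name for name, score in qc_scores.items() if score <= threshold]
    excluded = [name for name, score in qc_scores.items() if score > threshold]
    return kept, excluded

def exclusion_rate(n_arrayed, n_retained):
    """Percentage of arrays removed by the quality filter."""
    return 100.0 * (n_arrayed - n_retained) / n_arrayed

# Toy QC scores (invented) to show the filter itself.
toy_scores = {"array_01": 0.4, "array_02": 1.3, "array_03": 0.9}
kept, excluded = filter_arrays(toy_scores)

# Counts reported in the text: 142 arrays assessed, 116 retained.
rate = exclusion_rate(142, 116)
```

With the reported counts, 26 of 142 arrays were removed, which corresponds to a technical exclusion rate of roughly 18%.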

A diagnostic profile for the global multi-class approach consisting of 50 genes was developed, and demonstrated an optimal LOOCV accuracy rate of 83.6% for the diagnosis of biopsies as RL, cHL, FL or DLBCL (Table 2). Most cases from the diagnostic classes examined in this study clustered distinctly together by hierarchical clustering, with the exception of DLBCL (Figure 1A). An accuracy rate of only 68.6% was achieved when the same profile of genes was tested on an independent test set (Table 2).

Table 2 The accuracy rates resulting from GEP classification of lymph node biopsies into selected subtypes.
Figure 1 Hierarchical clustering of lymph node samples comparing distinct subtypes of lymphoma. Heatmaps depict A. Multi-class analysis for cases of RL (grey, n = 16), cHL (black, n = 12), DLBCL (yellow, n = 8) and FL (magenta, n = 25); B. cHL (grey, n = 12) versus NHL (black, n = 53) and C. FL (grey, n = 25) versus DLBCL (black, n = 8). The columns represent the samples and the rows represent the selected genes. Each cell within the grid indicates the gene expression level for an individual sample, with colour used to depict intensity on a graduated red (high) to green (low) scale. See Additional files 2 and 3 for the lists of top 10 and 20 classifier genes distinguishing HL from NHL and FL from DLBCL.

Using local binary classification (lymphoma versus RL, cHL versus NHL, and FL versus DLBCL groups; Table 2), LOOCV accuracy rates at each independent comparison ranged from 84.8-89.2%. This high rate of accuracy was reflected in hierarchical clustering analysis, which showed only 5 samples (1 HL and 4 NHL) clustered incorrectly in the NHL versus HL comparison (Figure 1B), and a clear-cut separation between FL and DLBCL samples achieved using only 10 unique classifying genes (Figure 1C). Independent test set accuracy rates for these comparisons were all greater than 76% (Table 2). See Additional files 2 and 3 for the lists of top 10 and 20 classifier genes distinguishing HL from NHL and FL from DLBCL.

Given the higher independent test set accuracy rates when only two defined types of diagnoses (classes) were considered in each analysis, we next investigated whether a specific disease type could be distinguished from the remainder of all case types examined in this study. The gene classifiers identified from this type of comparison are likely to be more specific to the disease type itself than signature genes identified purely from a comparison of two subtypes. Samples were classified by comparing each subtype of lymphoma or RL to all remaining cases at the global level, and accuracy rates for this type of binary classification were determined. Based on a varying optimal number of genes, LOOCV accuracy rates of training sets were all above 82%, with test set accuracy rates of 88.5%, 82.8%, 82.8% and 80.0% for FL, cHL, DLBCL and RL respectively.

Identified gene classifiers of RL, cHL, DLBCL and FL

Strong classifiers of reactive node tissue included the lower expression of a cohort of immune-response related genes compared to tissue diagnosed with lymphoma (Table 3). Reduced expression in reactive node tissue was also observed for genes such as TATA box binding protein (TBP)-associated factor 140 kDa (TAF3) and LIM domain binding 2 (LDB2). Molecular classifiers identified for cases of cHL include the expression of a selection of chemokine ligands, and the transcription factor STAT1 (Table 4). The gene LIM domain only-2 (LMO2) was highly differentially expressed in FL compared to the remainder of cases examined (Table 5), whilst high expression of cyclin-dependent kinase inhibitor 3 (CDKN3) transcripts was associated with DLBCL (Table 6). A total of 31 genes within any of the four lists of genes identified from binary-class analyses are common to the 38 annotatable genes (of 50 probes) identified from our multi-class analysis. See Additional file 4 for the complete list of classifiers (genes) for all classifications. The clustering of these identified classifiers is not influenced by clinical covariates such as age or gender (Additional file 5).

Table 3 Top 20 annotated classifier genes of RL samples.
Table 4 The top 20 annotated classifier genes of cHL.
Table 5 The top 20 annotated classifier genes of FL.
Table 6 The top 20 annotated classifier genes of DLBCL.
Discussion

In the present study, we used GEP microarrays to analyse 116 lymph node biopsies to assess the feasibility of this technology as a diagnostic tool in a clinical setting. This study is preceded by a significant body of research on GEP of lymphoma that has focused on understanding the pathogenesis of individual subtypes of lymphoma and refining the diagnosis and prognosis of these subtypes. However, our aim was to examine the practical question of whether GEP could be used to classify lymph node samples into the major subtypes of lymphoma and also to distinguish them from reactive lymph nodes.

The ability of GEP to diagnose biopsies of reactive, cHL, DLBCL and FL origin was examined with three strategies: global multi-class, local binary-class and global binary-class classification. The global multi-class approach classified each sample into one of the four diagnostic types with limited accuracy; the accuracy of linear classification algorithms is known to decrease when more than two classes are considered simultaneously [15]. Our binary comparisons, which compared a particular diagnostic type with either another type (local) or with the remainder of all cases (global), resulted in high (>80%) accuracy rates for independent test sets, except when comparing FL to DLBCL (76.1%); DLBCL was the subtype most frequently misclassified. This limitation of GEP in classifying DLBCL may be related to the high degree of heterogeneity of the disease itself. Distinct molecular forms of DLBCL have been identified in other GEP studies [4, 16, 17], although this does not readily explain the misclassified cases of this study, which included both GCB and non-GCB DLBCL as judged by the Hans algorithm for immunohistochemistry. As the partial involvement of a tissue biopsy by lymphoma cannot be excluded, sampling error may also contribute to classification error rates. With regard to the comparison of RL with lymphoma, the two RL samples misclassified were both reactive hyperplasia. It should be noted that our reactive nodes were unselected, and as such not all would necessarily have been B-cell predominant reactions. Therefore the random sampling of reactive nodes, which have different compartments, may contribute to sampling error. The accuracy with which benign tissue is distinguished from malignant tissue may be improved by increasing the number of cases used to build the classification, especially since there is an imbalance in the number of reactive biopsies (23) compared to the number of cancerous cases (93).
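The effect of this imbalance on the interpretation of accuracy can be illustrated with a short worked calculation. It is not a figure reported in the study: it simply asks what accuracy a trivial rule that labels every biopsy as lymphoma would achieve across all 116 cases (the composition of the separate training and test sets would give slightly different values).

```latex
\[
\text{baseline accuracy}
  = \frac{n_{\text{lymphoma}}}{n_{\text{RL}} + n_{\text{lymphoma}}}
  = \frac{93}{23 + 93}
  = \frac{93}{116}
  \approx 80.2\%
\]
```

Under this assumption, accuracy rates for the lymphoma versus RL comparison are most informative when read against a baseline of about 80% rather than 50%, which is one reason a more balanced set of reactive cases would help.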

Application of our findings to clinical practice would require a much larger scale study to not only verify our identified genetic signature of particular types but also to assess the profile of uncommon lymphoma subtypes. We nonetheless feel that this work represents an important step in testing the principle of using GEP, based on simple and inexpensive arrays, as a diagnostic ancillary test for lymph node biopsy. We found that our laboratory practices were easily adapted to allow routine allocation of a portion of biopsy specimen for microarray as routine tests such as flow cytometry and cytogenetics, for diagnosis of lymphoma, also require fresh specimen (not formalin fixed). The development of new techniques such as quantitative nuclease protection assays on formalin-fixed, paraffin-embedded tissue blocks would overcome any difficulty in obtaining fresh tissue for microarray gene expression profiling and make GEP much more widely available even in small biopsies [18].

The 18% technical exclusion rate of samples arrayed in this study hampers the diagnostic utility of microarray. However, increased familiarity with the assay will reduce the exclusion rate, and in laboratories with a limited caseload, referral to a centralised service may be preferable. Given the substantial improvement of microarray technology since the initiation of this study, the use of newer genome-wide microarray platforms such as Illumina bead arrays would also improve the utility of this technology and contribute to reducing the technical exclusion rate seen in this study. Incorporation of microRNA array data [19] may also be appropriate, especially given the reported stability of microRNA expression [20].

In our study, 13 of the 40 classifier genes identified from a specific (local) comparison of cHL with NHL were also strong classifiers when cHL was globally compared to both NHL and reactive samples. This indicates that our classification strategy encompasses unique gene sets that can classify across more than two types of pathological conditions. Although some gene classifiers identified in our study were common to other reported GEP studies, the absence of some previously identified key classifiers may be due to variable probe make-up across different microarray platforms or to differences in the type of diagnostic classes used in our classification compared to most published GEP studies [21]. Our distinct global binary comparisons are likely to have identified gene signatures that represent the particular diagnostic type in question, as it was compared to a mixture of lymphoma subtypes and non-cancerous samples.

The fact that high expression of CD7, CCL17 and STAT1 has been reported to be associated with cHL supports the reliability of the microarray data presented in this study [22-24]. As Hodgkin and Reed-Sternberg cells account on average for only 1% of the mixed cell types present in HL infiltrates, it is likely that the expression of some of the HL classifiers is derived from the stromal cell population. This should not influence the applicability of lymph node GEP to the diagnosis of HL, given that this stromal reaction is likely to be similar across different HL samples and that stromal gene expression profiles have been reported to predict the outcome of HL [25]. Similarly for FL, the reduced expression of CD163, a macrophage marker, that we detected may reflect a low number of macrophages present in the node microenvironment in many cases of FL. The importance of this information is not diminished, as increased reactive macrophages in a rare subset of FL have been reported to be associated with poorer survival [26]. LMO2, another strong molecular classifier identified for FL, has been reported to be expressed in approximately 50% of FL [27]. However, it is better known as a key gene expressed in the GCB cell type of DLBCL [4] and as a strong predictor of superior outcome in DLBCL [28]. Given the importance of LMO2 expression in DLBCL, its absence from our list of top 20 classifying genes of DLBCL may be due to the fact that only 5 cases examined (26%) are of GCB cell origin by immunohistochemistry. Instead, we have identified the gene cyclin-dependent kinase inhibitor 3 (CDKN3), a known marker of the ABC-like DLBCL [29], to be more highly expressed in our DLBCL samples than in the other diagnostic types examined in this study.

The lower expression of several immunoglobulin genes in reactive node tissue may reflect differences between the cellular makeup of the microenvironment of normal lymph node tissue and that of tissue involved by lymphoma. Consistent with the phenotype of non-cancerous tissue, we detected reduced expression of the potentially oncogenic gene TAF3, a negative regulator of the tumour suppressor p53 [30].
Conclusions

This proof-of-principle study has shown that microarray as a single platform assay can achieve reasonable diagnostic accuracy, with the ability to differentiate lymphoma from non-cancerous reactive lymphadenopathy and also to classify three common subtypes of lymphoma. Molecular classifiers identified to be characteristic of each subtype examined could be combined onto a cost-effective custom 'mini' microarray to screen lymph node biopsies for expression profiles to assist the diagnosis of four common outcomes (reactive, cHL, FL or DLBCL) in a relatively quick and inexpensive manner. Indeed, GEP-based diagnostic and prediction tools for other diseases are already commercially available. Examples include the MammaPrint array service for patients with breast cancer and the CupPrint array service, both provided by Agendia. The increased sensitivity of RNA extraction techniques has made GEP much more widely available, even for paraffin sections, small fine needle aspirates and core biopsies, which are routinely used in a pathology lab. The application of GEP may streamline current diagnostic tests by allowing more selective use of costly and time-consuming ancillary tests such as immunohistochemistry, immunophenotyping by flow cytometry, cytogenetics, FISH and PCR in every case of lymphoma. Its use may provide an objective diagnostic test that could be standardised across pathology laboratories.
References

  1. Swerdlow SH, Campo E, Harris NL, Jaffe ES, Pileri SA, Stein H, Thiele J, Vardiman JW, (Eds): WHO Classification of Tumours of Haematopoietic and Lymphoid tissue. 2008, Lyon: IARC

  2. Dave SS, Fu K, Wright GW, Lam LT, Kluin P, Boerma EJ, Greiner TC, Weisenburger DD, Rosenwald A, Ott G, et al: Molecular diagnosis of Burkitt's lymphoma. N Engl J Med. 2006, 354: 2431-2442. 10.1056/NEJMoa055759.

  3. Savage KJ, Monti S, Kutok JL, Cattoretti G, Neuberg D, De Leval L, Kurtin P, Dal Cin P, Ladd C, Feuerhake F, et al: The molecular signature of mediastinal large B-cell lymphoma differs from that of other diffuse large B-cell lymphomas and shares features with classical Hodgkin lymphoma. Blood. 2003, 102: 3871-3879. 10.1182/blood-2003-06-1841.

  4. Alizadeh AA, Eisen MB, Davis RE, Ma C, Lossos IS, Rosenwald A, Boldrick JC, Sabet H, Tran T, Yu X, et al: Distinct types of diffuse large B-cell lymphoma identified by gene expression profiling. Nature. 2000, 403: 503-511. 10.1038/35000501.

  5. Rosenwald A, Wright G, Chan WC, Connors JM, Campo E, Fisher RI, Gascoyne RD, Muller-Hermelink HK, Smeland EB, Giltnane JM, et al: The use of molecular profiling to predict survival after chemotherapy for diffuse large-B-cell lymphoma. N Engl J Med. 2002, 346: 1937-1947. 10.1056/NEJMoa012914.

  6. van't Veer LJ, Dai H, van de Vijver MJ, He YD, Hart AA, Mao M, Peterse HL, van der Kooy K, Marton MJ, Witteveen AT, et al: Gene expression profiling predicts clinical outcome of breast cancer. Nature. 2002, 415: 530-536.

  7. Alon U, Barkai N, Notterman DA, Gish K, Ybarra S, Mack D, Levine AJ: Broad patterns of gene expression revealed by clustering analysis of tumor and normal colon tissues probed by oligonucleotide arrays. Proc Natl Acad Sci USA. 1999, 96: 6745-6750. 10.1073/pnas.96.12.6745.

  8. Welsh JB, Sapinoso LM, Su AI, Kern SG, Wang-Rodriguez J, Moskaluk CA, Frierson HF, Hampton GM: Analysis of gene expression identifies candidate markers and pharmacological targets in prostate cancer. Cancer Res. 2001, 61: 5974-5978.

  9. Schuetz AN, Yin-Goen Q, Amin MB, Moreno CS, Cohen C, Hornsby CD, Yang WL, Petros JA, Issa MM, Pattaras JG, et al: Molecular classification of renal tumors by gene expression profiling. J Mol Diagn. 2005, 7: 206-218. 10.1016/S1525-1578(10)60547-8.

  10. Turner JJ, Hughes AM, Kricker A, Milliken S, Grulich A, Kaldor J, Armstrong B: Use of the WHO lymphoma classification in a population-based epidemiological study. Ann Oncol. 2004, 15: 631-637. 10.1093/annonc/mdh140.

  11. Jaffe ES, Harris NL, Stein H, Vardiman JW, (Eds): World Health Organization Classification of Tumours. Pathology and Genetics of Tumours of Haematopoietic and Lymphoid Tissues. 2001, Lyon: IARC Press

  12. Hans CP, Weisenburger DD, Greiner TC, Gascoyne RD, Delabie J, Ott G, Muller-Hermelink HK, Campo E, Braziel RM, Jaffe ES, et al: Confirmation of the molecular classification of diffuse large B-cell lymphoma by immunohistochemistry using a tissue microarray. Blood. 2004, 103: 275-282. 10.1182/blood-2003-05-1545.

  13. Yang YH, Paquet AC: Preprocessing Two-Colour Spotted Arrays. Bioinformatics and Computational Biology Solutions using R and Bioconductor. Edited by: Gentleman R, Carey V, Dudoit S, Irizarry RA, Huber W. 2005, New York: Springer, 49-70.

  14. Belov L, Mulligan SP, Barber N, Woolfson A, Scott M, Stoner K, Chrisp JS, Sewell WA, Bradstock KF, Bendall L, et al: Analysis of human leukaemias and lymphomas using extensive immunophenotypes from an antibody microarray. Br J Haematol. 2006, 135: 184-197. 10.1111/j.1365-2141.2006.06266.x.

  15. Somorjai RL, Dolenko B, Baumgartner R: Class prediction and discovery using gene microarray and proteomics mass spectroscopy data: curses, caveats, cautions. Bioinformatics. 2003, 19: 1484-1491. 10.1093/bioinformatics/btg182.

  16. Monti S, Savage KJ, Kutok JL, Feuerhake F, Kurtin P, Mihm M, Wu B, Pasqualucci L, Neuberg D, Aguiar RC, et al: Molecular profiling of diffuse large B-cell lymphoma identifies robust subtypes including one characterized by host inflammatory response. Blood. 2005, 105: 1851-1861. 10.1182/blood-2004-07-2947.

  17. Shipp MA, Ross KN, Tamayo P, Weng AP, Kutok JL, Aguiar RC, Gaasenbeek M, Angelo M, Reich M, Pinkus GS, et al: Diffuse large B-cell lymphoma outcome prediction by gene-expression profiling and supervised machine learning. Nat Med. 2002, 8: 68-74. 10.1038/nm0102-68.

  18. Rimsza LM, Leblanc ML, Unger JM, Miller TP, Grogan TM, Persky DO, Martel RR, Sabalos CM, Seligmann B, Braziel RM, et al: Gene expression predicts overall survival in paraffin-embedded tissues of diffuse large B-cell lymphoma treated with R-CHOP. Blood. 2008, 112: 3425-3433. 10.1182/blood-2008-02-137372.

  19. Lawrie CH, Chi J, Taylor S, Tramonti D, Ballabio E, Palazzo S, Saunders NJ, Pezzella F, Boultwood J, Wainscoat JS, Hatton CS: Expression of microRNAs in diffuse large B cell lymphoma is associated with immunophenotype, survival and transformation from follicular lymphoma. J Cell Mol Med. 2009, 13: 1248-1260. 10.1111/j.1582-4934.2008.00628.x.

  20. Jung M, Schaefer A, Steiner I, Kempkensteffen C, Stephan C, Erbersdobler A, Jung K: Robust MicroRNA Stability in Degraded RNA Preparations from Human Tissue and Cell Samples. Clin Chem. 2010, 56: 10.1373/clinchem.2009.141580.

  21. Ein-Dor L, Kela I, Getz G, Givol D, Domany E: Outcome signature genes in breast cancer: is there a unique set?. Bioinformatics. 2005, 21: 171-178. 10.1093/bioinformatics/bth469.


  22. Seegmiller AC, Karandikar NJ, Kroft SH, McKenna RW, Xu Y: Overexpression of CD7 in classical Hodgkin lymphoma-infiltrating T lymphocytes. Cytometry B Clin Cytom. 2009, 76: 169-174.


  23. Niens M, Visser L, Nolte IM, van der Steege G, Diepstra A, Cordano P, Jarrett RF, Te Meerman GJ, Poppema S, van den Berg A: Serum chemokine levels in Hodgkin lymphoma patients: highly increased levels of CCL17 and CCL22. Br J Haematol. 2008, 140: 527-536. 10.1111/j.1365-2141.2007.06964.x.


  24. Schmitz R, Stanelle J, Hansmann ML, Kuppers R: Pathogenesis of classical and lymphocyte-predominant Hodgkin lymphoma. Annu Rev Pathol. 2009, 4: 151-174. 10.1146/annurev.pathol.4.110807.092209.


  25. Sanchez-Aguilera A, Montalban C, de la Cueva P, Sanchez-Verde L, Morente MM, Garcia-Cosio M, Garcia-Larana J, Bellas C, Provencio M, Romagosa V, et al: Tumor microenvironment and mitotic checkpoint are key factors in the outcome of classic Hodgkin lymphoma. Blood. 2006, 108: 662-668. 10.1182/blood-2005-12-5125.


  26. Farinha P, Masoudi H, Skinnider BF, Shumansky K, Spinelli JJ, Gill K, Klasa R, Voss N, Connors JM, Gascoyne RD: Analysis of multiple biomarkers shows that lymphoma-associated macrophage (LAM) content is an independent predictor of survival in follicular lymphoma (FL). Blood. 2005, 106: 2169-2174. 10.1182/blood-2005-04-1565.


  27. Natkunam Y, Zhao S, Mason DY, Chen J, Taidi B, Jones M, Hammer AS, Hamilton Dutoit S, Lossos IS, Levy R: The oncoprotein LMO2 is expressed in normal germinal-center B cells and in human B-cell lymphomas. Blood. 2007, 109: 1636-1642. 10.1182/blood-2006-08-039024.


  28. Lossos IS, Czerwinski DK, Alizadeh AA, Wechser MA, Tibshirani R, Botstein D, Levy R: Prediction of survival in diffuse large-B-cell lymphoma based on the expression of six genes. N Engl J Med. 2004, 350: 1828-1837. 10.1056/NEJMoa032520.


  29. Blenk S, Engelmann J, Weniger M, Schultz J, Dittrich M, Rosenwald A, Muller-Hermelink HK, Muller T, Dandekar T: Germinal Center B Cell-Like (GCB) and Activated B Cell-Like (ABC) Type of Diffuse Large B Cell Lymphoma (DLBCL): Analysis of Molecular Predictors, Signatures, Cell Cycle State and Patient Survival. Cancer Inform. 2007, 3: 399-420.


  30. Bereczki O, Ujfaludi Z, Pardi N, Nagy Z, Tora L, Boros IM, Balint E: TATA binding protein associated factor 3 (TAF3) interacts with p53 and inhibits its function. BMC Mol Biol. 2008, 9: 57. 10.1186/1471-2199-9-57.




We thank Dr Lisa Lincz from the Hunter Haematology Research Group (Newcastle Mater Misericordiae Hospital) for providing some of the tissue samples included in this study. We also thank Dr Sam Milliken and Dr Michael Buckland for their insightful comments on the manuscript. This study was supported in part by Australian grants from The Sydney Foundation for Medical Research (DM), St Vincent's Hospital Haematology Research Fund, the Arrow Bone Marrow Transplant Foundation (ML and TJM), the National Health and Medical Research Council postgraduate award (AB) and the Australian Postgraduate Award (THL and AC).

Author information


Corresponding author

Correspondence to David DF Ma.

Additional information

Competing interests

The authors declare that they have no competing interests.

Authors' contributions

THL, TJM, and ML carried out the microarray experiments and interpreted the results. AC and JYHY carried out the computational and statistical analysis. AB participated in the histopathology review and interpretation of data. JT provided expert histopathology review and interpretation of data. DDFM participated in study design and provided guidance in the analysis and interpretation of data. All authors contributed to and approved the manuscript.

Electronic supplementary material


Additional file 1: Determination of optimal error rates. A PDF file describing how the optimal error rates for each classification were obtained. (PDF 33 KB)

Additional file 2: Classifier genes that distinguish cHL from NHL. A PDF file containing one table. (PDF 82 KB)

Additional file 3: Classifier genes that distinguish FL from DLBCL. A PDF file containing one table. (PDF 77 KB)

Additional file 4: Complete list of all classifier genes. An Excel file containing seven spreadsheets. (XLS 130 KB)


Additional file 5: Dendrogram analysis of clinical covariates. A PDF file showing, by dendrogram analysis, that the classifiers are not biased towards or against clinical covariates. (PDF 379 KB)


Rights and permissions

Open Access This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.


About this article

Cite this article

Loi, T.H., Campain, A., Bryant, A. et al. Discriminating lymphomas and reactive lymphadenopathy in lymph node biopsies by gene expression profiling. BMC Med Genomics 4, 27 (2011).
