Microarray analysis of peripheral blood lymphocytes from ALS patients and the SAFE detection of the KEGG ALS pathway

Background Sporadic amyotrophic lateral sclerosis (sALS) is a motor neuron disease with poorly understood etiology. Results of gene expression profiling studies of whole blood from ALS patients have not been validated and are difficult to relate to ALS pathogenesis because gene expression profiles depend on the relative abundance of the different cell types present in whole blood. We conducted microarray analyses using Agilent Human Whole Genome 4 × 44k Arrays on a more homogeneous cell population, namely purified peripheral blood lymphocytes (PBLs), from ALS patients and healthy controls to identify molecular signatures possibly relevant to ALS pathogenesis. Methods Differentially expressed genes were determined by LIMMA (Linear Models for MicroArray) and SAM (Significance Analysis of Microarrays) analyses. The SAFE (Significance Analysis of Function and Expression) procedure was used to identify molecular pathway perturbations. Proteasome inhibition assays were conducted on cultured peripheral blood mononuclear cells (PBMCs) from ALS patients to confirm alteration of the Ubiquitin/Proteasome System (UPS). Results For the first time, using SAFE in a global gene ontology analysis (gene set size 5-100), we show significant perturbation of the KEGG (Kyoto Encyclopedia of Genes and Genomes) ALS pathway of motor neuron degeneration in PBLs from ALS patients. This was the only KEGG disease pathway significantly upregulated among 25, and contributing genes, including SOD1, represented 54% of the encoded proteins or protein complexes of the KEGG ALS pathway. Further SAFE analysis, including gene set sizes >100, showed that only neurodegenerative diseases (4 out of 34 disease pathways) including ALS were significantly upregulated. Changes in UBR2 expression correlated inversely with time since onset of disease and directly with ALSFRS-R, implying that UBR2 was increased early in the course of ALS. 
Cultured PBMCs from ALS patients accumulated more ubiquitinated proteins than PBMCs from healthy controls in a serum-dependent manner confirming changes in this pathway. Conclusions Our study indicates that PBLs from sALS patients are strong responders to systemic signals or local signals acquired by cell trafficking, representing changes in gene expression similar to those present in brain and spinal cord of sALS patients. PBLs may provide a useful means to study ALS pathogenesis.


Background
Amyotrophic lateral sclerosis (ALS) is a progressive neurodegenerative disease that causes muscle weakness and wasting through the loss of motor neurons in the brain and spinal cord; ubiquitinated inclusions are characteristically found post mortem in the brain and spinal cord of ALS patients [1]. Several genome-wide association studies (GWAS) have shown evidence of genetic heterogeneity underlying disease susceptibility [2]. Single nucleotide polymorphisms were found in the ITPR2 (inositol 1,4,5-triphosphate receptor, type 2) [3], FGGY (FGGY carbohydrate kinase domain containing) [4], and DPP6 (dipeptidyl-peptidase 6) [5] genes, with variable strength of association with ALS and limited replication. None of these genes has been proven relevant to the pathogenesis of ALS.
More recently, mutations were found in the UNC13A (unc-13 homolog A) gene [6] and in the 9p21 chromosomal locus [7]. To overcome challenges in the interpretation of results from GWAS and data from the world of "omics" in general, ALS researchers are actively engaged in integrative global bioinformatics and the creation of ALS models for development of new ALS therapies (Euro-MOTOR project) [8]. Despite genetic heterogeneity underlying disease susceptibility, the clinical manifestations of the ALS phenotype are relatively homogeneous, suggesting that at the cellular and molecular levels there may be a convergence of a limited number of pathways that could lead to the ALS phenotype.
Gene expression profiling studies using microarrays and/or real-time quantitative RT-PCR have been conducted on various tissues from rodent models of ALS, such as muscle or brain tissues, lumbar spinal anterior horn tissues, spinal cord motor neurons isolated by laser capture microdissection (LCM), whole blood or peripheral blood mononuclear cells (PBMCs). Similar studies were performed on spinal cord tissues or LCM-isolated motor neurons obtained post mortem from ALS patients. Roughly 1000 unique genes were found differentially expressed, but only ~5% were differentially expressed in the same direction in more than one study [9], indicating little reproducibility. Poor reproducibility may be due to the use of different gene expression profiling methods or platforms, tissue of different origin, methods used for biological sample preparation, time of tissue collection at pre-symptomatic or symptomatic stage, and use of a particular batch of rodents or human cohort. Rather, one may find greater commonalities at the level of pathway alteration, with regard to apoptosis regulation, calcium regulation, oxidative stress and mitochondrial function, ER stress and unfolded protein response (UPR), UPS and autophagy, RNA processing, DNA metabolism, axonal transport, integrity of the neuromuscular junction, muscle atrophy, and direct/indirect interactions with astrocytes, microglia and T-cells. Within these biological processes, genes of importance are those with mutations or polymorphisms shown to confer susceptibility to or cause ALS, or genes playing a critical role in the pathways that involve susceptibility genes.
A number of studies have sought blood biomarkers that may be useful to detect early signs of ALS, assess disease progression, monitor treatment effects, or track down the cause(s) of the disease, in a minimally-invasive fashion in ALS patients. Using qRT-PCR, Lin et al. (2009) have shown subtle transcriptional down-regulation of mitochondrial electron-transfer chain genes in whole blood from ALS patients [10]. Saris et al. (2009) have identified co-expressed gene modules (clusters) in total blood from sporadic ALS (sALS) patients [11].
These findings resulted from subtle differential expression of 2300 probe-encoded genes and were related to biological/disease categories such as post-translational modification, infection mechanism, inflammatory disease, neurological disorder, and skeletal and muscular disorder. Gagliardi et al. (2010) showed increased SOD1 mRNA expression in spinal cord, brain stem and lymphocytes of sALS patients [12]. Zhang et al. (2011) identified gene expression profiles of short-term cultured PBMCs from ALS patients, demonstrating the activation of monocytes/macrophages via the LPS/TLR4 neuroinflammatory pathway [13]. Lincecum et al. (2010) demonstrated the activation in ALS pathogenesis of a co-stimulatory pathway bridging the activation of T-cell responses and the amplification of the innate immune response, based on gene expression profiles obtained from whole blood of the G93A SOD1 mouse model and ALS patients [14]. Circulating white blood cells might acquire certain properties from long-distance signals mediated by small metabolites or macromolecules circulating in peripheral blood. They might also acquire novel properties from trafficking at sites of neurodegeneration associated, to a variable degree, with rupture of the blood-brain barrier or blood-spinal cord barrier in early and late ALS. Further investigation in this area of ALS research is critically needed [15].
In the current work, we analyzed RNA extracted from PBLs of ALS patients and control subjects, thereby reducing some of the complexity of mixed expression patterns generated by RNA from reticulocytes, granulocytes, monocytes, thrombocytes and plasma, normally present in whole blood. Indeed, gene expression profiles of blood-derived samples are strongly dependent on the predominant constituent cell type(s) [16,17]. Analyses of mRNA expression data by LIMMA [18], SAM [19] and SAFE [20], revealed alterations of the ubiquitin/proteasome system (UPS). Using proteasome inhibition assays, parallel changes of UPS activity at the protein level were determined in subcultured PBMCs (mainly composed of lymphocytes) from ALS patients, by Western blot analysis.

Isolation of peripheral blood lymphocytes from ALS patients and controls
From 2007 until March 2008, blood samples to be used for microarray analysis were collected at Carolinas Neuromuscular/ALS-MDA Center with approval by the IRB at Carolinas Medical Center. Informed consent was obtained from all participants in this study. ALS diagnosis was determined according to the El Escorial Criteria for "definite" ALS after exclusion of other conditions [21]. Disease onset was defined as time of initial weakness, dysarthria or dysphagia. Blood samples (~18 mL) were drawn from sporadic definite ALS patients and healthy control (HC) subjects by venipuncture into tubes suitable for either serum or lymphocyte isolation. The healthy controls (HCs) consisted of 9 white females (mean age 51.4 ± 11 (standard deviation) years) and 2 white males (aged 64 and 65 years). The sALS patients consisted of one black male (aged 49 years), one black female (aged 69 years), 5 white females (mean age 59 ± 20 years), and 4 white males (mean age 47 ± 9 years). Table 1 presents the clinical characteristics of the enrolled patients and healthy controls subjected to microarray analysis. PBMCs were isolated by the Histopaque™-1077 density gradient centrifugation method. Using this procedure, yields were generally 1-2 × 10^6 PBMCs per mL of blood. Lymphocytes were further enriched to over 90% purity from the PBMC fraction by subsequent PERCOLL gradient centrifugation [22]. Blood samples were processed immediately upon receipt in the laboratory, within 30 minutes of the blood draw.

RNA extraction, amplification, and dual mode reference design microarrays
The common reference design [23] was used for sample assignment in the dual color mode of expression assay on the Agilent Human Whole Genome 4 × 44k Microarrays to analyze ~40000 transcripts. In these microarray experiments, each of the 22 RNA samples (HC and sALS) was co-hybridized with RNA from the HC reference pool, which was constituted with equal amounts of each of the 11 RNA samples from healthy controls. Total RNA, stored in TRIzol (Invitrogen) at -80°C, was extracted from the lymphocyte samples at Cogenics, Inc. (Morrisville, NC) by standard procedures. The quantity and A260/280 ratio of each total RNA sample were determined by spectrophotometry, and the size distribution was assessed using an Agilent Bioanalyzer. Fifty nanograms of total RNA was converted into labelled cRNA with nucleotides coupled to a fluorescent dye (either Cy3 or Cy5) using the Quick Amp Kit (Agilent Technologies, Palo Alto, CA) following the manufacturer's protocol. The A260/280 ratio and yield of each of the cRNAs were determined, and a quality assessment was done using an Agilent Bioanalyzer. Equal amounts of Cy3- and Cy5-labeled cRNA (825 ng) from two different samples were hybridized to Agilent Human Whole Genome 4 × 44k Microarrays. The hybridized array was washed and scanned, and data were extracted from the scanned image using Feature Extraction version 10.2 (Agilent Technologies). The non-normalized and normalized microarray datasets have been deposited in the NCBI Gene Expression Omnibus [24] as series GSE28253.

Statistical analyses and SAFE data mining
Raw data .txt files in Agilent format were converted to .MEV files using ExpressConverter™ v2.1 of the TM4 Microarray Suite (TIGR Genomics, Rockville, MD). Background-subtracted raw data were normalized using the MIDAS pipeline (TM4, TIGR Genomics, Rockville, MD) according to Sioson et al. (2006) with the following steps: total intensity normalization, LocFit (LOWESS), standard deviation regularization and low intensity trim [25]. Filtering stringencies requiring that the integrated signal intensities (ISI) for each of the Cy3 and Cy5 channels exceed the Cy3 and Cy5 background by more than two standard deviations (ISI = 7000) generated the dataset DS7000 [7199 probes, 5540 unique genes]. DS7000 was subjected to LIMMA [18] and SAM [19] analyses using the TMeV v4.5.1 program (TM4, TIGR Genomics, Rockville, MD) to determine differentially expressed genes. The false discovery rate (FDR) was 1.17% (delta = 0.90) for DS7000. SAFE analysis was performed with Bioconductor 2.5 according to Barry, Nobel, and Wright (2005) [20] to identify gene sets demonstrating different expression levels between classes of comparison. Default settings for local (t-test) and global (Wilcoxon) statistics were used. Comparisons were based on gene ontology databases for biological processes, molecular functions, cellular components, and the protein families (Pfam) and KEGG databases.
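The two-level logic of a SAFE-style test (a per-gene local statistic, a per-gene-set global statistic, and resampling to assess significance) can be illustrated with a minimal sketch. This is not the Bioconductor implementation used in the study: the function names are hypothetical, a Welch-form t statistic stands in for the default local t-test, ties in ranks are ignored, and class-label permutation is used here in place of the bootstrap resampling described in the Results.

```python
import random
from statistics import mean, variance


def welch_t(a, b):
    """Local statistic for one gene: t-like statistic, group a versus group b."""
    se = (variance(a) / len(a) + variance(b) / len(b)) ** 0.5
    return (mean(a) - mean(b)) / se if se > 0 else 0.0


def rank_sum(local_stats, gene_set):
    """Global statistic: Wilcoxon rank sum of the gene set's local statistics."""
    order = sorted(local_stats, key=local_stats.get)      # ascending by statistic
    ranks = {gene: i + 1 for i, gene in enumerate(order)}  # ties not averaged (sketch)
    return sum(ranks[gene] for gene in gene_set)


def safe_like_pvalue(expr, labels, gene_set, n_perm=500, seed=0):
    """One-sided (upregulation) empirical p-value for a single gene set.

    expr:   dict mapping gene -> list of expression values (one per sample)
    labels: list of 1 (patient) / 0 (control), one per sample
    """
    rng = random.Random(seed)

    def global_stat(lab):
        local = {}
        for gene, values in expr.items():
            cases = [v for v, l in zip(values, lab) if l == 1]
            controls = [v for v, l in zip(values, lab) if l == 0]
            local[gene] = welch_t(cases, controls)
        return rank_sum(local, gene_set)

    observed = global_stat(labels)
    permuted = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(permuted)  # break the gene-to-class association
        if global_stat(permuted) >= observed:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)
```

Calling `safe_like_pvalue(expr, labels, gene_set)` returns the observed rank sum and its empirical p-value for one gene set; correcting across many gene sets (for example, by an FDR procedure) would be a separate step not shown here.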

Total ubiquitination and proteasome inhibition assays with PBMCs from ALS patients and healthy controls
Freshly isolated PBMCs (composed of ~80% lymphocytes and ~20% monocytes per flow cytometry [...] Membranes were subsequently incubated with goat anti-mouse, human-absorbed, HRP-conjugated secondary antibody sc-2055 (1/10000 dilution; Santa Cruz) for 30 minutes and assayed using the Super Signal Pico chemiluminescence detection system (Thermo Fisher Scientific). Subsequent reprobing with anti-beta-actin antibody sc-81178 (Santa Cruz) was performed after stripping membranes of bound antibodies in stripping buffer (62.5 mM Tris-HCl, 2% SDS, and 100 mM 2-mercaptoethanol [pH 6.7]) at 56°C for 20 minutes. ECL films and a LAS3000 imaging system (Fuji) were used for detection of the chemiluminescence. In addition to reprobing of the membranes for beta-actin, silver staining of the polyacrylamide gels after electrotransfer, using SilverSNAP stain (Thermo Fisher Scientific), was used to confirm loading homogeneity.

Semi-quantitative analysis of the Western blot data
Raw images were processed in the ImageJ program (Dr. Wayne Rasband, wayne@codon.nih.gov, National Institute of Mental Health, Bethesda, Maryland, USA). The accumulated HMW ubiquitinated protein forms were delineated by a rectangular area, for which the background-subtracted integrated density was measured. The integrated density was then measured for an area of the same size below the accumulated forms, at a level of the blotting membrane demonstrating consistent staining throughout the lanes, thereby providing a contrast reference area per lane. A signal-to-noise (S/N) ratio for the accumulated forms was then calculated independently of the detection of beta-actin, which was achieved by stripping and reprobing the Western blot membranes.
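As an illustration of this per-lane calculation, the sketch below assumes that the S/N ratio is simply the integrated density of the HMW ubiquitinated region divided by that of the equal-sized reference area in the same lane. The exact formula is not stated in the text, and the function name and density values are hypothetical.

```python
def signal_to_noise(band_density, reference_density):
    """S/N for one lane: HMW-region integrated density over the
    integrated density of the same-sized reference area (assumed formula)."""
    if reference_density <= 0:
        raise ValueError("reference density must be positive")
    return band_density / reference_density


# Hypothetical background-subtracted integrated densities (arbitrary units):
# lane label -> (HMW region, reference area)
lanes = {
    "HC + MG132": (5200.0, 1300.0),
    "ALS + MG132": (9100.0, 1300.0),
}
for lane, (band, reference) in lanes.items():
    print(lane, signal_to_noise(band, reference))
```

Because each ratio is computed within a lane, it does not depend on the beta-actin signal obtained after stripping and reprobing.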

Results
We studied gene expression profiles of lymphocytes isolated from 11 patients diagnosed with definite sporadic ALS (sALS) and 11 healthy control subjects. Clinical characteristics for this cohort are described in Table 1. Figure 1 summarizes results from microarray data normalization and LIMMA, SAM and SAFE analyses.

LIMMA and SAM analyses
Differentially expressed genes between ALS patients and healthy controls were determined using LIMMA (p LIMMA < 0.001, q LIMMA ≤ 10%) and SAM (q SAM ≤ 1%) in the TM4/TMeV v4.5.1 program [25] for the dataset DS7000 (Figure 1). A significant overlap was found by comparing LIMMA and SAM results (Figure 1, Additional File 1). Table 2 presents the 24 most differentially expressed genes (q SAM = 0%, p LIMMA < 0.0005, q LIMMA ≤ 10%). Five genes (C12orf35, DYNLT1, IRS2, SKIV2L2, and TARDBP) were significant at high stringency (q SAM = 0%, p LIMMA < 0.001, q LIMMA < 5%). C12orf35, DYNLT1, SKIV2L2, and TARDBP were upregulated with a fold change (FC) of 1.5-1.8, while IRS2 was downregulated two-fold. DYNLT1 (dynein, light chain, Tctex-type 1) encodes a component of the dynein-dynactin complex, which includes dynactin (DCTN1), mutations of which have been implicated in ALS [26], while TARDBP encodes TAR DNA binding protein 43 (TDP-43), mutations of which may cause ALS [27]. TIA-1, a marker of stress granules that colocalizes with TDP-43 inclusions in frontotemporal lobar degeneration (FTLD-U) and ALS [28], is also upregulated (FC = 1.4, q SAM = 0%, p LIMMA < 0.005, q LIMMA ~10%; Additional File 1). SKIV2L2 encodes a DEAD-box RNA helicase that is part of the exosome and spliceosome complexes [29,30], and it is known, for example, that some DEAD-box RNA helicases interact with FUS/TLS to control pre-mRNA splicing [31]. Defects or deregulation in RNA processing are a hallmark of the pathogenesis of motor neuron diseases, since motor neurons may be uniquely sensitive to perturbations in RNA processing pathways [32]. We also note that IL7R (interleukin-7 receptor subunit alpha), polymorphisms of which have been associated with risk in multiple sclerosis [33], is upregulated (FC = 1.7, q SAM [...]

Figure 1 Normalization, filtering, SAFE, LIMMA and SAM analyses of lymphocyte-derived microarray data from ALS patients and healthy controls. Microarray analyses were performed on purified PBLs isolated from patients affected by sporadic amyotrophic lateral sclerosis (sALS) (n = 11) and healthy control subjects (HCs) (n = 11). The dual color mode in the common reference design was used to interrogate the expression of ~40000 transcripts (~30000 unique genes) using Agilent Human Whole Genome 4 × 44k Microarrays. Raw expression data were normalized and filtered using the MIDAS pipeline in the TM4 microarray suite (TIGR Genomics, Rockville, MD) to generate the dataset DS7000. SAFE was used for testing enrichment of functional gene ontology (GO) categories related to biological processes, molecular functions, cellular components, protein families (Pfam) and the KEGG databases. DS7000 was subjected to LIMMA and SAM analyses using TM4/TMeV v4.5.1 to determine differentially expressed (DE) genes. The online tool Data Overlapping and Area-Proportional Venn Diagram (http://bioinforx.com/free/bxarrays/overlap.php) was used to generate the Venn diagram.

SAFE identification of molecular signatures in lymphocytes from ALS patients
SAFE [20] is a resampling-based procedure that is similar to GSEA (Gene Set Enrichment Analysis) [35], but with more flexible choices of test statistics. SAFE was used to obtain information on unifying biological themes from databases specific for (i) gene ontology (GO) pathways/categories (biological process, cellular component and molecular function), (ii) pathways/categories defined by the KEGG (Kyoto Encyclopedia of Genes and Genomes) and (iii) Pfam (protein families).
Such resampling procedures have been shown to provide more accurate control of false positives than simpler enrichment-test methods using only lists of p-values [36]. Following determination of local (t-test) and global (Wilcoxon test) statistics using SAFE default settings, the significance for each gene set category was determined by bootstrap re-sampling and multiple test correction (for the multiple categories examined) by an FDR procedure, with q SAFE < 25% considered significant (similarly to GSEA). This relatively liberal threshold was intended to avoid false negatives, although many of the findings presented here achieve more striking significance. For SAFE gene ontology category analysis, gene sets of 5-100 genes were examined, similar to restrictions used by others (e.g., Barry et al., 2005) [20]. This approach ensures that gene sets were not so small as to call into question a "pathway" interpretation, and not so large as to defy biological interpretation. In addition, the approach helps to manage the multiple testing penalties across numerous categories. To simplify overall interpretation, we only reported "upregulated" categories to highlight pathway activations caused by the disease rather than pathway inhibitions. Significant upregulated categories representing gene sets associated with "biological processes", "cellular components", and "KEGG pathways" are shown in Tables 3, 4 and 5. We used the same gene set size restriction (5-100 genes) for SAFE analysis of KEGG gene ontology groups including the KEGG Human Disease pathways, of which 25 were annotated in Bioconductor 2.5 for the Agilent platform. We identified that the KEGG ALS pathway was significant (q SAFE = 18%). The KEGG ALS pathway [...] [37] by including gene set sizes >100 in a secondary analysis.

Table 2 The list presents the 24 most discriminatory genes distinguishing the definite sALS patients (n = 11) from the healthy control subjects (n = 11). α Agilent Array 4 × 44K probe ID, β gene symbol, γ description, δ LIMMA significance p value, ε LIMMA FDR (q value), and χ fold change (FC) in expression are indicated. Using the TM4-MIDAS/TMeV pipeline, all genes had a local FDR of q = 0 according to SAM performed on DS7000 [7199 probes, 5540 unique genes]. Five genes had a LIMMA q value ≤0.05: DYNLT1, IRS2, TARDBP, C12orf35, and SKIV2L2.
We found that only neurodegenerative disease pathways (4 in total) were significantly upregulated (Additional File 2). In this latter analysis, ALS was less significant than the other three neurodegenerative diseases (Huntington's disease, Parkinson's disease, Alzheimer's disease), suggesting that neurodegeneration affects lymphocytes to a greater extent than ALS-specific biological processes. Nevertheless, such interpretation has to be taken with caution. Indeed, some genes represented on the pathway maps of these other three neurodegenerative diseases are related to the UPS, cytoskeleton or dynein-dynactin complex, and therefore should be represented on the KEGG ALS pathway. However, ALS, Alzheimer's, Huntington's and Parkinson's are all neurodegenerative diseases related to aging and/or associated with mitochondrial dysfunction. For this reason, results are in alignment with the results of Saris et al. (2009) [11]. Prion disease, generally thought to be less related to the other neurodegenerative disorders, was found not significant (Additional File 2).
A total of 54 unique gene IDs (including the pseudogene caspase 12) constitute the KEGG ALS pathway (hsa05014) [37] and correspond to 36 protein entities defining unique proteins or protein complexes (Figure 2). A total of 35 protein entities corresponding to 53 genes represented on the KEGG ALS pathway map (pseudogene CASP12 excluded) include membrane receptors, cytosolic or secreted proteins, kinases, phosphatases, proteases, and protein channels, which are likely to play a direct/indirect role in ALS pathogenesis to a variable degree at different stages that lead to motor neuron degeneration. Some protein entities may correspond to different isoforms represented by unique gene IDs. For example, calcineurin (CaN entity) may be composed of three catalytic isoforms (α, β, γ) encoded by genes on three different chromosomes, and many types of glutamate receptors may represent the GluR entity (Figure 2).
There were 23 unique genes (43%, 81 probes), the aggregate expression pattern of which contributes to the perturbation of the KEGG ALS pathway (Table 6). The number of protein entities defined by these genes and represented on the KEGG ALS pathway map was 19 out of 35 (54%) (Table 6, Figure 2). The dynamics of up- and down-regulations, assuming they are functionally effective, may be interpreted as responses to signals originating from serum or cell-cell interactions. For example, upregulation of ASK1 (alias MAP3K5) may be associated with an ER-stress response that correlates with ALS progression (Figure 2). Also, assuming that transcriptional regulation produces more or less active protein with appropriate subcellular localization and in a timely manner, about half of the genes would have a negative effect on motor neuron survival while the rest would have a positive effect according to current ALS literature. That said, contribution of aggregate expression to a pathway does not necessarily signify differential expression of each participating gene in terms of differences between the mean expression levels in lymphocytes from ALS patients vs. healthy controls. One limitation is that individual protein entities of the UPS, the dynein-dynactin complex, the TARDBP/TDP-43 pathway and other elements of ALS pathogenesis are not represented on the KEGG ALS pathway, so that other potential aggregate effects relevant to ALS pathogenesis cannot be determined using the current gene set. However, pathway perturbations were determined by SAFE for genes belonging to the UPS both in terms of "biological process" and "cellular components" affected (Tables 3 and 4).
Gene ontology categories corresponding to gene sets related to the UPS (AmiGO database) [38] included the following: positive or negative regulation of ubiquitin-protein ligase, proteasomal ubiquitin-dependent proteins [GO:0051443, 6 proteins; GO:0051444, 4 proteins; GO:0043161, 43 proteins] (Table 3); chaperonin-containing T-complex, ubiquitin ligase complex, nuclear [...] (Figure 1, Additional File 1), nine are related to the UPS, including four E3 ubiquitin ligases (RNF149, TRIM22, UBR1, and UBR2) (Table 7). Also, ANAPC4, SHFM1, SUGT1, UBR1 and UBR2 were represented by the UPS GO groups found significant by SAFE.

Table 6 Genes (53 in total, pseudogene CASP12 excluded) belonging to the KEGG ALS pathway (hsa05014) that describe pathogenic effects in motor neurons are defined by their official HUGO symbol. α Entrez Gene accession number, β HUGO symbol and γ description are provided. δ Genes that contribute to the KEGG ALS pathway through SAFE analysis are marked with a (+) sign and, if not, with a (-) sign. The number of significant probes (PS) per total number of probe signal intensity values per gene on the Agilent 4 × 44K array (PT) is shown. λ FC is the average fold change in ALS patients (n = 11) compared to healthy controls (n = 11) considering one or more probe signal intensity values per gene, with upregulated genes indicated by (↑) and downregulated genes by (↓). Fold changes of less than 1.05 are indicated by (<). PMID-referenced positive (+) or negative (-) effect on motor neuron survival if wild type protein activity or function is increased (↑) or normal function or activity is altered (↓). Genes that were not determined (n.d.) to contribute to the KEGG ALS pathway, or with an FC < 1.05, are also referenced for known effects. τ Lymphocyte response (LR) represented by the genes contributing to the KEGG ALS pathway is shown.
Table 7 A total of nine genes among 206 related to the ubiquitin/proteasome system (UPS) were significantly upregulated in lymphocytes from ALS patients compared to controls, as determined by SAM (q < 1%) and LIMMA analyses (p < 0.001). Among them, four genes encode E3 ubiquitin ligases (RNF149, TRIM22, UBR1, and UBR2) and one gene a deubiquitinase (YOD1). α Agilent Array 4 × 44K probe IDs, β gene symbol, γ NCBI GenBank accession number, δ fold change in expression and ε GeneCard functional description (http://www.genecards.org) are provided.

Assessment of alteration of UPS-related gene expression in lymphocytes from ALS patients based on microarray data
Correlation of ANAPC4, SHFM1, SUGT1, UBR1 and UBR2 expression with demographic and disease parameters was determined. Among these five genes, the protein encoded by UBR2 (ubiquitin-protein ligase E3-alpha-2) is known to act in conjunction with UBR1 in a quality control pathway for degradation of unfolded cytosolic proteins [39]. We calculated Spearman correlations between expression data and both the duration of the disease since symptom onset and the ALS Functional Rating Scale-Revised score (ALSFRS-R) at the time of peripheral blood sampling. UBR2 expression correlated significantly with the time from disease onset to lymphocyte sampling (r = -0.8091, p = 0.0039), as well as with ALSFRS-R (r = 0.6333, p = 0.0402) (Figure 3, Table 8). Similar to Saris et al. (2009) [11], we found no correlation between the expression of these genes and gender, age at onset, age at collection, or site of onset. However, unlike Saris et al. (2009) [11] but similar to Zhang et al. (2011) [13], we present correlations of individual genes with disease duration and ALSFRS-R.
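For readers unfamiliar with the statistic, Spearman's coefficient is the Pearson correlation computed on ranks. The sketch below is a plain-Python illustration with tie-averaged ranks; the function names and the example values are hypothetical and are not the study data, and no p-value is computed.

```python
def average_ranks(values):
    """Rank values from 1..n, assigning tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        tied_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = tied_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def spearman(x, y):
    """Spearman's rho: Pearson correlation of the rank-transformed data."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical values: months since onset vs. log-ratio expression of one gene
months_since_onset = [6, 12, 18, 30, 48]
expression = [0.95, 0.80, 0.85, 0.40, 0.20]
print(spearman(months_since_onset, expression))
```

A negative coefficient, as in this toy example, means that expression tends to be lower in samples taken longer after onset, which is the direction reported for UBR2 and disease duration.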

Assessment of alteration of the UPS in PBMCs from ALS patients using proteasome inhibition assays
We employed the MG132 proteasome inhibition assay to test whether the UPS transcriptional alterations described above are accompanied by ubiquitination changes at the protein level. MG132 reversibly blocks the proteolytic activity of the 26S proteasome complex, which inhibits the degradation of ubiquitin-conjugated proteins and has multiple effects including, for instance, reducing muscle atrophy associated with disuse [40] or increasing caspase-mediated generation of TDP-43 C-terminal fragments [41]. We prepared peripheral blood mononuclear cell (PBMC) short-term cultures from ALS patients (n = 6) and healthy control subjects (n = 5). High molecular weight (HMW) poly-ubiquitinated protein forms were detected in protein lysates of these PBMCs by Western blot analysis using a monoclonal anti-ubiquitin antibody, similarly to Jury et al. (2003) [42]. For PBMCs from healthy control subjects, cultured in RPMI [10% FCS] medium, accumulation of HMW poly-ubiquitinated proteins was induced by MG132 treatment, but this accumulation was partially mitigated by the supplementation of the RPMI [10% FCS] medium with matched autologous human serum at a final concentration of 20% (Figure 4). For PBMCs from ALS patients, cultured in RPMI [10% FCS] medium, accumulation of HMW poly-ubiquitinated proteins was induced by MG132 treatment, and this accumulation was further increased by addition of autologous human serum from each ALS patient (Figure 4).

Discussion
We report, for the first time, genome-wide expression profiling of purified lymphocytes from patients with amyotrophic lateral sclerosis. This study, performed with the long oligonucleotide Agilent Human Whole Genome 4 × 44K Array, demonstrates that ALS-relevant differential gene expression and pathway perturbations can be identified by a functional enrichment method such as SAFE [20] in peripheral blood lymphocytes, and not only in the brain or spinal cord that are directly affected by the disease. In the search for blood biomarkers in neurological disorders, determination of molecular signatures or pathway alterations becomes critical in the analysis of microarray data generated from the blood compartment. This is because, at the genome-wide scale of gene expression, relevant biological differences may be modest or even negligible relative to the noise. The expression profiling studies on whole blood from ALS patients by Saris et al. [...] [13] which could be due to subculture conditions and/or the method chosen for normalization of the microarray data [43,44]. In our study, fold changes in expression for the genes found significant by SAM and LIMMA (i.e. DS3500) varied from 1.244 to 3.422 (mean value ± SD = 1.556 ± 0.28). Therefore, one may not expect to confirm differential expression by qRT-PCR for many genes, because of the large sample size required to confirm small changes in expression. This problem is partially circumvented by global pathway analysis methods. Many differentially expressed genes identified by SAM and LIMMA may be subjectively placed in the context of ALS pathogenesis. In addition, there was little overlap with 166 genes that were found associated with ALS to a variable degree in several single-gene and genome-wide association studies (GWAS). For instance, TARDBP, SOD1, KIFAP3 and COX7C were differentially expressed in our study.
TARDBP and SOD1, clearly associated with ALS pathogenesis, have also been [...] showed that SOD1 mRNA levels were increased in spinal cord, brain stem, and lymphocytes of sporadic ALS patients, but did not correlate with gender, age or duration of the disease [12]. For the first time, gene expression data from the blood compartment of sporadic ALS patients could be associated with the KEGG ALS disease pathway and KEGG disease pathways of neurodegenerative disorders such as Alzheimer's, Parkinson's and Huntington's diseases. Considering that subtle genome-wide changes in gene expression were used for this determination, this result is rather unexpected. Protein activity changes that are caused by the presence of the disease are generally not expected to correspond consistently to transcriptional regulation. Our use of purified lymphocytes has likely provided a better dataset for studying an ALS-specific signature in the blood compartment than total blood would have. However, because of the small sample size of our study (n = 22) and because ALS is a heterogeneous disease, it is not possible to capture the breadth of the disease process occurring during onset and progression of the disease. In addition, disease responses in lymphocytes may not mirror many of the disease processes occurring in the brain, which depend on the alteration of the blood-brain barrier and the microenvironment represented by glia and microglia. Furthermore, assuming that transcriptional regulation produces more or less active protein with appropriate subcellular localization and in a timely manner, about half of the genes, based on their expression, would have a negative effect on motor neuron survival while the rest would have a positive effect according to current ALS literature (Table 6). This clearly indicates very limited replication of processes occurring in the brain or spinal cord of ALS patients.
Thus, while similar pathways are affected in motor neurons and lymphocytes due to a possible systemic common cause(s), it is expected that some responses may differ in their details possibly reflecting differential susceptibility.
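The way a functional enrichment method can detect coordinated changes across a gene set, even when individual genes are hard to confirm one by one, can be illustrated with a minimal sketch. The code below is not the SAFE implementation used in this study; it is a simplified two-stage permutation test loosely inspired by the same idea (a per-gene statistic, a gene-set summary, and an empirical p-value from shuffled sample labels). All data are simulated, and the gene names, group sizes and the size of the planted shift are hypothetical choices made for illustration only.

```python
import random
import statistics


def welch_t(cases, controls):
    """Welch t-statistic: the per-gene ("local") statistic in this sketch."""
    se = (statistics.variance(cases) / len(cases)
          + statistics.variance(controls) / len(controls)) ** 0.5
    return (statistics.mean(cases) - statistics.mean(controls)) / se


def gene_set_score(expr, labels, gene_set):
    """Mean rank of the gene-set members among all genes ranked by t-statistic.

    A high score means the set as a whole tends toward higher expression in cases.
    """
    t_stats = {}
    for gene, values in expr.items():
        cases = [v for v, lab in zip(values, labels) if lab == 1]
        controls = [v for v, lab in zip(values, labels) if lab == 0]
        t_stats[gene] = welch_t(cases, controls)
    ordered = sorted(t_stats, key=t_stats.get)  # ascending t-statistic
    rank = {gene: i + 1 for i, gene in enumerate(ordered)}
    return statistics.mean(rank[g] for g in gene_set)


def permutation_p_value(expr, labels, gene_set, n_perm=500, seed=0):
    """One-sided empirical p-value obtained by shuffling the sample labels."""
    rng = random.Random(seed)
    observed = gene_set_score(expr, labels, gene_set)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if gene_set_score(expr, shuffled, gene_set) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


def simulate(n_genes=40, n_set=10, n_per_group=8, shift=2.0, seed=1):
    """Simulated (not real) log-expression data with a planted gene-set shift."""
    rng = random.Random(seed)
    labels = [1] * n_per_group + [0] * n_per_group  # 1 = case, 0 = control
    gene_set = [f"gene{i}" for i in range(n_set)]
    expr = {}
    for i in range(n_genes):
        bump = shift if i < n_set else 0.0
        expr[f"gene{i}"] = [rng.gauss(bump * lab, 1.0) for lab in labels]
    return expr, labels, gene_set


expr, labels, gene_set = simulate()
p_value = permutation_p_value(expr, labels, gene_set)
print(f"empirical p-value for the planted gene set: {p_value:.3f}")
```

Permuting sample labels rather than genes preserves the correlation structure between genes, which is one reason permutation-based gene-set methods are attractive for expression data. The actual SAFE procedure offers additional choices of statistics and error-rate control that are not reproduced here.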
In our pathway analysis of the dataset DS7000 generated with the Agilent Human Whole Genome 4 × 44K Array, SAFE identified alteration of gene expression pertaining to gene ontology (GO) categories relevant to ALS pathogenesis (and/or other neurological diseases), such as DNA metabolism, RNA splicing, mitochondrial function, oxidation, ER and Golgi functions, UPS, neurological function, post-translational modification and viral infection. These results are consistent with the findings of Saris et al. (2009), which were obtained by whole blood RNA profiling [11]. However, following pathway analysis using SAFE, we went further in the analysis of transcriptional alterations of the UPS by identifying a correlation between the expression of individual differentially expressed UPS-related genes and disease duration or the ALSFRS-R. Indeed, whole exome sequencing identified mutations in the gene encoding valosin-containing protein (VCP), a key component of the UPS, as a cause of familial ALS, demonstrating that disturbances of UPS function may be closely linked to ALS pathogenesis [45]. A total of nine differentially expressed genes were related to the UPS, including four ubiquitin ligases representative of UPS GO groups identified by SAFE (ANAPC4, SHFM1, UBR1, and UBR2). Differential expression of the "N-end rule" ubiquitin ligase UBR2 gene [46] in lymphocytes from ALS patients was found to correlate with disease duration and ALSFRS-R at the time of sampling. Although overall UBR2 mRNA expression is upregulated in ALS patients compared to healthy controls, a decrease in expression correlated with more advanced stage or severity. This apparent paradox can be explained by the possibility that an initial disease process, to which healthy controls are never exposed, causes an initial upregulation of UBR2 mRNA expression, which then declines as the disease progresses with increasing impairment of the UPS machinery. 
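The direction of the two correlations described above (inverse with disease duration, direct with ALSFRS-R) can be made concrete with a short sketch. The text does not state which correlation coefficient was used, so the Spearman rank correlation below is an assumption, and all values are invented for illustration; they are not data from this study.

```python
def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical values for six patients (illustration only, not study data).
ubr2_log2_ratio = [0.9, 0.7, 0.8, 0.5, 0.4, 0.2]  # UBR2 expression vs. controls
months_since_onset = [6, 10, 14, 24, 30, 48]      # disease duration
alsfrs_r = [44, 42, 40, 35, 31, 25]               # higher score = milder disease

rho_duration = spearman(ubr2_log2_ratio, months_since_onset)
rho_alsfrs = spearman(ubr2_log2_ratio, alsfrs_r)
print(f"UBR2 vs. disease duration: rho = {rho_duration:.3f}")
print(f"UBR2 vs. ALSFRS-R:         rho = {rho_alsfrs:.3f}")
```

With these invented values the coefficient is negative for disease duration and positive for ALSFRS-R, which is the pattern expected if expression is highest early in the disease and declines with progression.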
One possible mechanism of action of E3 ubiquitin ligases UBR1 and UBR2 could be to facilitate targeting of foldable conformers to the proteasome [39] and to provide protection against toxicity of (unknown) misfolded proteins that accumulate during the disease course in lymphocytes from ALS patients. This mechanism is similar to that of the E3 ubiquitin ligase dorfin (encoded by RNF19A), which prevents mutant SOD1-mediated neurotoxicity and improves symptoms in the transgenic G93A SOD1 mouse model [47,48]. Indeed, the presence of some cellular toxicity in PBMCs was shown by De Marco et al. (2010) [49], who determined that the cytoplasmic fraction of TDP-43 in circulating PBMCs of sporadic and familial ALS patients was increased. In addition, by analogy with mutant SOD1-mediated toxicity, human wild-type TDP-43-mediated neurotoxicity might be partially alleviated by co-expression with ubiquilin 1 (encoded by UBQLN1), which is involved in autophagy and proteasome targeting [50,51]. Moreover, mutations in ubiquilin 2 (encoded by UBQLN2) have been associated with X-linked juvenile ALS and adult sporadic ALS [52]. Ubiquilins bind to both ubiquitin ligases and the proteasome, providing a connector function within the UPS [53].

Figure 4 Total ubiquitination Western blot (WB) analysis of cultured PBMCs from ALS patients and controls in the presence or absence of added-back serum for 16 hours and treated or not with proteasome inhibitor MG132 for 1.5 hr. Comparison of PBMCs from one healthy control and one ALS patient incubated or not in the presence of added-back matched autologous serum is shown in (a). Semiquantitative Western blot analysis was performed to measure the accumulation of high molecular weight (HMW) ubiquitinated protein species in PBMCs that were prepared the same day from one healthy control and one ALS patient (WB1). A signal (S) to noise (N) ratio (S/N) was determined with the ImageJ program by comparing the integrated density of two areas consistently stained throughout the membrane and visually contrasting the accumulation of HMW ubiquitinated protein species (WB1). ALS patient serum exacerbates the effects of MG132 on total ubiquitination and accumulation of HMW ubiquitinated species, while serum from the healthy control mitigates these effects. Comparison of PBMCs from ALS patients (n = 5) and healthy controls (n = 4) incubated in the presence of added-back matched autologous serum is shown in (b). PBMCs obtained at different times from additional ALS patients (n = 5) and healthy controls (n = 4) show similar results (WB2 and WB3).
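The signal-to-noise ratio mentioned in the Figure 4 legend was obtained with ImageJ; the sketch below only illustrates the arithmetic of comparing the integrated density of two areas of one lane. It rests on several assumptions that are not stated in the source: integrated density is taken as the sum of pixel intensities in a rectangular region, the image is assumed to be scaled so that higher values mean more staining, the HMW area is treated as the signal and a lower area of the same size as the noise, and the pixel values are invented.

```python
def integrated_density(image, region):
    """Sum of pixel intensities inside a rectangle given as (top, left, height, width)."""
    top, left, height, width = region
    return sum(sum(row[left:left + width]) for row in image[top:top + height])


def signal_to_noise(image, signal_region, noise_region):
    """Ratio of the integrated densities of two regions of the same image."""
    noise = integrated_density(image, noise_region)
    if noise == 0:
        raise ValueError("noise region has zero integrated density")
    return integrated_density(image, signal_region) / noise


# Hypothetical 4 x 4 pixel "lane" (illustration only, not data from Figure 4).
lane = [
    [90, 95, 92, 88],  # upper rows: area of the HMW ubiquitinated smear
    [85, 91, 89, 90],
    [20, 22, 19, 21],  # lower rows: reference area of equal size
    [18, 20, 21, 19],
]

hmw_area = (0, 0, 2, 4)        # rows 0-1, all four columns
reference_area = (2, 0, 2, 4)  # rows 2-3, all four columns

ratio = signal_to_noise(lane, hmw_area, reference_area)
print(f"S/N = {ratio:.2f}")
```

Using regions of equal area keeps the ratio independent of region size; with unequal areas, mean intensities rather than sums would be the more comparable quantity.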
Our proteasome inhibition assays also indicate that lymphocytes from ALS patients exposed to serum factors and metabolites in vivo have acquired new properties with regard to the UPS and other pathways that are normally perturbed in degenerating motor neurons. In this respect, the study by Watanabe et al. (2010) [54], showing that metabolic alterations of the UPS may take place in the skin of ALS patients, follows the same paradigm. In addition, using short-term PBMC cultures, Zhang et al. (2011) [13] showed that monocytes in ALS patients have acquired unique properties that relate to neuroinflammation and innate immunity.

Conclusions
Our approach demonstrates that subtle changes in gene expression measured by the Agilent Human Whole Genome 4 × 44K Array may be interpreted objectively. Without underestimating the complexity of ALS pathogenesis, our analyses with these arrays identify multiple new directions worth further investigation, including systemic UPS pathway alterations, in the search for biomarkers associated with the cause(s) or the progression of ALS. Overall, it remains to be determined which properties the circulating lymphocytes acquire by long-distance signaling in the peripheral blood system, and which properties they acquire by local signaling or local cell-cell contact due to trafficking of the lymphocytes at the sites of neurodegeneration in the brain or spinal cord.

Additional material
Additional file 1: SAM analysis (q < 1%) and LIMMA analysis conducted independently using DS7000. Probe set IDs, gene symbols, GenBank NCBI accession numbers, fold change (FC) ALS vs. healthy controls, local FDR (q value in %), LIMMA significance p value and q value, log2 ratios of normalized expression data, and gene descriptions are shown. FC > 1 signifies higher expression in the ALS group.
Additional file 2: SAFE results of the KEGG disease pathways. Raw SAFE data are presented for the 25 disease pathways (gene set size 5-100) and 34 disease pathways (including gene sets with >100 genes) analyzed using the DS7000 microarray dataset. These pathways represent cancer, circulatory, genetic, immune, neurological and urological diseases.