LNCaP Atlas: Gene expression associated with in vivo progression to castration-recurrent prostate cancer

Background There is no cure for castration-recurrent prostate cancer (CRPC) and the mechanisms underlying this stage of the disease are unknown. Methods We analyzed the transcriptome of human LNCaP prostate cancer cells as they progress to CRPC in vivo using replicate LongSAGE libraries. We refer to these libraries as the LNCaP atlas and compared these gene expression profiles with currently suggested models of CRPC. Results Three million tags were sequenced using in vivo samples at various stages of hormonal progression to reveal 96 novel genes differentially expressed in CRPC. Thirty-one genes encode proteins that are either secreted or located at the plasma membrane, 21 genes changed levels of expression in response to androgen, and 8 genes have enriched expression in the prostate. Expression of 26, 6, 12, and 15 genes has previously been linked to prostate cancer, Gleason grade, progression, and metastasis, respectively. Expression profiles of genes in CRPC support a role for the transcriptional activity of the androgen receptor (CCNH, CUEDC2, FLNA, PSMA7), steroid synthesis and metabolism (DHCR24, DHRS7, ELOVL5, HSD17B4, OPRK1), neuroendocrine (ENO2, MAOA, OPRK1, S100A10, TRPM8), and proliferation (GAS5, GNB2L1, MT-ND3, NKX3-1, PCGEM1, PTGFR, STEAP1, TMEM30A), but neither supported nor discounted a role for cell survival genes. Conclusions The in vivo gene expression atlas for LNCaP was sequenced and supports a role for the androgen receptor in CRPC.


Background
Systemic androgen-deprivation therapy by orchiectomy or agonists of gonadotropin-releasing hormone is routinely used to treat men with metastatic prostate cancer to reduce tumor burden and pain. This therapy is based on the dependency of prostate cells on androgens to grow and survive. The inability of androgen-deprivation therapy to completely and effectively eliminate all metastatic prostate cancer cell populations is manifested by a predictable and inevitable relapse, referred to as castration-recurrent prostate cancer (CRPC). CRPC is the end stage of the disease and fatal to the patient within 16-18 months of onset.
The mechanisms underlying progression to CRPC are unknown. However, there are several models to explain its development. One such model indicates the involvement of the androgen signaling pathway [1][2][3][4]. Key to this pathway is the androgen receptor (AR), which is a steroid hormone receptor and transcription factor. Mechanisms of progression to CRPC that involve or utilize the androgen signaling pathway include: hypersensitivity due to AR gene amplification [5,6]; changes in AR co-regulators such as nuclear receptor coactivators (NCOA1 and NCOA2) [7,8]; intraprostatic de novo synthesis of androgen [9] or metabolism of AR ligands from residual adrenal androgens [10,11]; AR promiscuity of ligand specificity due to mutations [12]; and ligand-independent activation of AR by growth factors [protein kinase A (PKA), interleukin 6 (IL6), and epidermal growth factor (EGF)] [13][14][15]. Activation of the AR can be determined by assaying for the expression of target genes such as prostate-specific antigen (PSA) [16]. Other models of CRPC include neuroendocrine differentiation [17], the stem cell model [18], and the imbalance between cell growth and cell death [3]. It is conceivable that these models may not be mutually exclusive. For example, altered AR activity may impact cell survival and proliferation.
Here, we describe long serial analysis of gene expression (LongSAGE) libraries [19,20] made from RNA sampled from biological replicates of the in vivo LNCaP Hollow Fiber model of prostate cancer as it progresses to the castration-recurrent stage. Gene expression signatures that were consistent among the replicate libraries were applied to the current models of CRPC.

In vivo LNCaP Hollow Fiber model
The LNCaP Hollow Fiber model of prostate cancer was performed as described previously [21][22][23]. All animal experiments were performed according to a protocol approved by the Committee on Animal Care of the University of British Columbia. Serum PSA levels were determined by enzymatic immunoassay kit (Abbott Laboratories, Abbott Park, IL, USA). Fibers were removed on three separate occasions representing different stages of hormonal progression that were androgen-sensitive (AS), responsive to androgen-deprivation (RAD), and castration-recurrent (CR). Samples were retrieved immediately prior to castration (AS), as well as 10 (RAD) and 72 days (CR) post-surgical castration.

RNA sample generation, processing, and quality control
Total RNA was isolated immediately from cells harvested from the in vivo Hollow Fiber model using TRIZOL Reagent (Invitrogen) following the manufacturer's instructions. Genomic DNA was removed from RNA samples with DNaseI (Invitrogen). RNA quality and quantity were assessed by the Agilent 2100 Bioanalyzer (Agilent Technologies, Mississauga, ON, Canada) and RNA 6000 Nano LabChip kit (Caliper Technologies, Hopkinton, MA, USA).

LongSAGE library production and sequencing
RNA from the hollow fibers of three mice (biological replicates) representing different stages of prostate cancer progression (AS, RAD, and CR) was used to make a total of nine LongSAGE libraries. LongSAGE libraries were constructed and sequenced at the Genome Sciences Centre, British Columbia Cancer Agency. Five micrograms of starting total RNA was used in conjunction with the Invitrogen I-SAGE Long kit and protocol with alterations [24]. Raw LongSAGE data are available at Gene Expression Omnibus [25] as series accession number GSE18402. Individual sample accession numbers are as follows: S1885, GSM458902; S1886, GSM458903; S1887, GSM458904; S1888, GSM458905; S1889, GSM458906; S1890, GSM458907; S1891, GSM458908; S1892, GSM458909; and S1893, GSM458910.

Gene expression analysis
LongSAGE expression data were analyzed with DiscoverySpace 4.01 software [26]. Sequence data were filtered for bad tags (tags with at least one N-base call) and linker-derived tags (artifact tags). Only LongSAGE tags with a sequence quality factor (QF) greater than 95% were included in the analysis. The phylogenetic tree was constructed with a distance metric of 1-r (where "r" equals the Pearson correlation coefficient). Correlations were computed (including tag counts of zero) using the Regress program of the Stat package written by Ron Perlman, and the tree was optimized using the Fitch program [27] in the Phylip package [28]. Graphics were produced from the tree files using the program TreeView [29]. Tag clustering analysis was performed using PoissonC, a Poisson distribution-based K-means clustering algorithm. The K-means algorithm clusters tags based on count into 'K' partitions, with the minimum intracluster variance. PoissonC was developed specifically for the analysis of SAGE data [30]. The Java implementation of the algorithm was kindly provided by Dr. Li Cai (Rutgers University, NJ, USA). An optimal value for K (K = 10) was determined [31].
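The 1-r distance used for the phylogenetic tree can be illustrated with a minimal sketch. It assumes each library is held as a dictionary of tag counts; the library names, tag names, and counts below are hypothetical, and this is not the Regress program cited above. Tags absent from a library are counted as zero, as described in the text.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length count vectors."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two equal-length vectors with at least 2 entries")
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant vector")
    return sxy / math.sqrt(sxx * syy)

def distance_matrix(libraries):
    """Return {(lib_a, lib_b): 1 - r} over the union of all tag types.

    libraries: dict mapping library name -> dict of tag -> count.
    A tag missing from a library contributes a count of zero.
    """
    names = sorted(libraries)
    all_tags = sorted(set().union(*(libraries[n] for n in names)))
    vectors = {n: [libraries[n].get(t, 0) for t in all_tags] for n in names}
    return {(a, b): 1.0 - pearson_r(vectors[a], vectors[b])
            for a in names for b in names}

# Hypothetical toy libraries (not data from this study).
toy = {
    "lib_A": {"TAG1": 10, "TAG2": 0, "TAG3": 5},
    "lib_B": {"TAG1": 20, "TAG3": 10},
    "lib_C": {"TAG2": 8, "TAG3": 1},
}
dist = distance_matrix(toy)
```

In this toy case lib_A and lib_B have proportional counts, so their distance is zero, whereas lib_C is anti-correlated with both and lies at a distance greater than 1.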

Principal component analysis
Principal component analysis was performed using GeneSpring™ software version 7.2 (Silicon Genetics, CA). Affymetrix datasets of clinical prostate cancer and normal tissue were downloaded from Gene Expression Omnibus [25] (accession numbers: GDS1439 and GDS1390) and analyzed in GeneSpring™. Of the 96 novel CR-associated genes, 76 genes had corresponding Affymetrix probe sets. These probe sets were applied as the gene signature in this analysis. Principal component (PC) scores were calculated according to the standard correlation between each condition vector and each principal component vector.

Results
LongSAGE library and tag clustering
RNA isolated from the LNCaP Hollow Fiber model was obtained from three different mice (13N, 15N, and 13R; biological replicates) at three stages of cancer progression that were androgen-sensitive (AS), responsive to androgen-deprivation (RAD), and castration-recurrent (CR). To confirm that the samples represented unique disease states, we determined the levels of KLK3 mRNA, a biomarker that correlates with progression, using quantitative real time-polymerase chain reaction (qRT-PCR). As expected, KLK3 mRNA levels dropped in the stage of cancer progression that was RAD versus AS (58%, 49%, and 37%), and rose in the stage of cancer progression that was CR versus RAD (229%, 349%, and 264%) for mice 13R, 15N, and 13N, respectively (Additional file 1, Figure S1). Therefore, we constructed nine LongSAGE libraries, one for each stage and replicate.
LongSAGE libraries were sequenced to 310,072-339,864 tags each, with a combined total of 2,931,124 tags, and filtered to leave only useful tags for analysis (Table 1). First, bad tags were removed because they contain at least one N-base call in the LongSAGE tag sequence. The sequencing of the LongSAGE libraries was base called using PHRED software. The tag sequence quality factor (QF) and probability were calculated to ascertain which tags contain erroneous base-calls. The second line of filtering removed LongSAGE tags with probabilities less than 0.95 (QF < 95%). Linkers were introduced into SAGE libraries as known sequences utilized to amplify ditags prior to concatenation. At a low frequency, linkers ligate to themselves, creating linker-derived tags (LDTs). These LDTs do not represent transcripts and were removed from the LongSAGE libraries. A total of 2,305,589 useful tags represented by 263,197 tag types remained after filtering. Data analysis was carried out on these filtered data.
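The three filtering steps can be sketched as follows. This is a minimal illustration, not the DiscoverySpace implementation: it assumes the tag quality factor is the product of per-base probabilities of a correct call derived from PHRED scores (P_error = 10^(-Q/10)), and the tag sequences, scores, and the linker-derived sequence are hypothetical.

```python
def tag_quality_factor(phred_scores):
    """Probability that every base in the tag is correct (assumed QF definition)."""
    qf = 1.0
    for q in phred_scores:
        qf *= 1.0 - 10 ** (-q / 10.0)  # PHRED: P(error) = 10^(-Q/10)
    return qf

def filter_tags(tags, linker_derived, min_qf=0.95):
    """Apply the three filters in the order described in the text.

    tags: iterable of (sequence, list of per-base PHRED scores).
    linker_derived: set of known linker-derived tag sequences.
    Returns (kept sequences, counts of removed tags per reason).
    """
    kept = []
    removed = {"bad": 0, "low_quality": 0, "linker_derived": 0}
    for seq, phred in tags:
        seq = seq.upper()
        if "N" in seq:                              # at least one N-base call
            removed["bad"] += 1
        elif tag_quality_factor(phred) < min_qf:    # QF below 95%
            removed["low_quality"] += 1
        elif seq in linker_derived:                 # artifact, not a transcript
            removed["linker_derived"] += 1
        else:
            kept.append(seq)
    return kept, removed

# Hypothetical example tags (17 bases each) and a hypothetical linker-derived tag.
example_tags = [
    ("ACGTACGTACGTACGTA", [40] * 17),  # high quality, kept
    ("ACGTNCGTACGTACGTA", [40] * 17),  # contains N, bad tag
    ("TTTTACGTACGTACGTA", [20] * 17),  # 0.99^17 is about 0.84, low quality
    ("GGGGACGTACGTACGTA", [40] * 17),  # listed as linker-derived below
]
kept, removed = filter_tags(example_tags, {"GGGGACGTACGTACGTA"})
```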
The LongSAGE libraries were hierarchically clustered and displayed as a phylogenetic tree. In most cases, LongSAGE libraries made from the same disease stage (AS, RAD, or CR) clustered together more closely than LongSAGE libraries made from the same biological replicate (mice 13N, 15N, or 13R; Figure 1). This suggests the captured transcriptomes were representative of disease stage with minimal influence from biological variation.
Identification of groups of genes that behave similarly during progression of prostate cancer was conducted through K-means clustering of tags using the PoissonC algorithm [30]. For each biological replicate (mice 13N, 15N, or 13R), all tag types were clustered that had a combined count greater than ten in the three libraries representing disease stages (AS, RAD, and CR) and mapped unambiguously sense to a transcript in reference sequence (RefSeq; February 28th, 2008) [32] using DiscoverySpace4 software [33]. By plotting within-cluster dispersion (i.e., intracluster variance) against a range of K (number of clusters; Additional file 1, Figure S2), we determined that ten clusters best embodied the expression patterns present in each biological replicate. This was decided based on the inflection point in the graph (Additional file 1, Figure S2), showing that after reaching K = 10, increasing K did not substantially reduce the within-cluster dispersion. K-means clustering was performed over 100 iterations, so that tags would be placed in clusters that best represent their expression trend. The most common clusters for each tag are displayed (Figure 2). In only three instances, there were similar clusters in just two of the three biological replicates. Consequently, consistent changes in gene expression during progression were represented in 11 patterns. Differences among expression patterns for each biological replicate may be explained by biological variation, the probability of sampling a given LongSAGE tag, and/or imperfections in K-means clustering (e.g., variance may not be a good measure of cluster scatter).
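The choice of K from the dispersion curve can be illustrated with a simple elbow heuristic. This is only a sketch under assumptions: the dispersion values are hypothetical, and the 5% relative-gain threshold is an illustrative parameter, not a criterion stated in the text.

```python
def pick_k(dispersion_by_k, min_relative_gain=0.05):
    """Return the smallest K after which adding clusters no longer helps much.

    dispersion_by_k: dict mapping K -> within-cluster dispersion.
    A step to the next K "helps" if it reduces dispersion by at least
    min_relative_gain (as a fraction of the current dispersion).
    """
    ks = sorted(dispersion_by_k)
    for k, k_next in zip(ks, ks[1:]):
        current = dispersion_by_k[k]
        if current <= 0:
            return k  # nothing left to reduce
        gain = (current - dispersion_by_k[k_next]) / current
        if gain < min_relative_gain:
            return k
    return ks[-1]

# Hypothetical dispersion curve that flattens after K = 10.
curve = {2: 100.0, 4: 70.0, 6: 50.0, 8: 38.0, 10: 30.0, 12: 29.5, 14: 29.0}
best_k = pick_k(curve)
```

With these made-up values, the step from K = 10 to K = 12 reduces dispersion by less than 2%, so the heuristic settles on K = 10.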

Gene ontology enrichment analysis
We conducted Gene Ontology (GO) [34] enrichment analysis using Expression Analysis Systematic Explorer (EASE) [35] software to determine whether specific GO annotations were over-represented in the K-means clusters. Enrichment was defined by the EASE score (p-value ≤ 0.05) generated during comparison to all the other clusters in the biological replicate. This analysis was done for each biological replicate (3 mice: 13N, 15N, or 13R).
To make the differences between the 11 expression trends easier to visualize, the clusters were amalgamated into five major trends: group 1, up during progression; group 2, down during progression; group 3, peak in the RAD stage; group 4, constant during progression; and group 5, valley in RAD stage (Figure 2). To be consistent, the GO enrichment data were combined into five major trends, which resulted in redundancy in GO terms. To simplify the GO enrichment data, similar terms were pooled into representative categories. Categorical gene ontology enrichments of the five major expression trends are shown in Figure 3. These data indicate that steroid binding, heat shock protein activity, de-phosphorylation activity, and glycolysis all decreased in the stage that was RAD, but increased again in the stage that was CR. Interestingly, steroid hormone receptor activity continued to increase throughout progression. Both of these expression trends were observed for genes with GO terms for transcription factor activity or secretion. The GO categories for genes with kinase activity and signal transduction displayed expression trends with peaks and valleys at the stage that was RAD. The levels of expression of genes involved in cell adhesion rose in the stage that was RAD, but dropped again in the stage that was CR.
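The five major trends can be illustrated by assigning a three-point profile (AS, RAD, CR) to a group. This is a hypothetical rule-based sketch, not the procedure used to amalgamate the clusters; in particular, the tolerance of 10 percentage points is an assumed parameter.

```python
def classify_trend(as_count, rad_count, cr_count, tolerance=10.0):
    """Assign a tag profile to one of the five major trends.

    Counts are converted to percentages of the tag's total count across the
    three stages; changes within `tolerance` percentage points are treated
    as flat (assumed threshold).
    """
    total = as_count + rad_count + cr_count
    if total <= 0:
        raise ValueError("tag has no counts")
    a, r, c = (100.0 * v / total for v in (as_count, rad_count, cr_count))
    step1, step2, net = r - a, c - r, c - a
    if step1 > tolerance and step2 < -tolerance:
        return "group 3: peak in RAD"
    if step1 < -tolerance and step2 > tolerance:
        return "group 5: valley in RAD"
    if net > tolerance:
        return "group 1: up during progression"
    if net < -tolerance:
        return "group 2: down during progression"
    return "group 4: constant during progression"

# Hypothetical profiles (AS, RAD, CR counts).
examples = {
    (10, 20, 70): classify_trend(10, 20, 70),
    (70, 20, 10): classify_trend(70, 20, 10),
    (10, 80, 10): classify_trend(10, 80, 10),
    (33, 34, 33): classify_trend(33, 34, 33),
    (45, 10, 45): classify_trend(45, 10, 45),
}
```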
Altogether, genes with functional categories that were enriched in expression trends may be consistent with the AR signaling pathway playing a role in progression of prostate cancer to castration-recurrence (Figure 3). For example, GO terms steroid binding, steroid hormone receptor activity, heat shock protein activity, chaperone activity, and kinase activity could represent the cytoplasmic events of AR signaling. GO terms transcription factor activity, regulation of transcription, transcription corepression activity, and transcription co-activator activity could represent the nuclear events of AR signaling. AR-mediated gene transcription may result in splicing and protein translation, to regulate general cellular processes such as proliferation (and related nucleotide synthesis, DNA replication, oxidative phosphorylation, oxidoreductase activity, and glycolysis), secretion, and differentiation. It should be noted, however, that both positive and negative regulators were represented in the GO enriched categories (Figure 3). Therefore, a more detailed analysis was required to determine if the pathways represented by the GO-enriched categories were promoted or inhibited during progression to CRPC. Moreover, many of the GO enrichments that were consistent with changes in the AR signaling pathway were generic, and could be applied to the other models of CRPC.
Table 1 Composition of LongSAGE libraries (columns: Library S1885, S1886, S1887, S1888, S1889, S1890, S1891, S1892, S1893)
Library labels (library / mouse-stage): S1892 / 13R-RAD, S1893 / 13R-CR, S1887 / 13N-CR, S1890 / 15N-CR, S1889 / 15N-RAD, S1886 / 13N-RAD, S1891 / 13R-AS, S1885 / 13N-AS, S1888 / 15N-AS

Consistent differential gene expression associated with progression of prostate cancer
Pair-wise comparisons were made between LongSAGE libraries representing the transcriptomes of different stages (AS, RAD, and CR) of prostate cancer progression from the same biological replicate (3 mice: 13N, 15N, or 13R). Among all three biological replicates, the number of consistent, statistically significant differentially expressed tag types was determined using the Audic and Claverie test statistic [36] at p ≤ 0.05, p ≤ 0.01, and p ≤ 0.001 (Table 2). The tags represented in Table 2 were included only if the associated expression trend was common among all three biological replicates. The Audic and Claverie statistical method is well-suited for LongSAGE data, because the method takes into account the sizes of the libraries and tag counts. Tag types were counted multiple times if they were over- or under-represented in more than one comparison. The number of tag types differentially expressed decreased by 57% as the stringency of the p-value increased from p ≤ 0.05 to 0.001.
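To show how library size enters the Audic and Claverie statistic, here is a minimal sketch of the conditional probability p(y | x) = (N2/N1)^y (x+y)! / [x! y! (1 + N2/N1)^(x+y+1)] of seeing y copies of a tag in a library of N2 tags given x copies in a library of N1 tags. The tail-summing convention in `ac_pvalue` is one simple choice assumed here for illustration; it is not necessarily the convention used by the software in this study, and the counts in the example are hypothetical.

```python
import math

def ac_prob(x, y, n1, n2):
    """Audic-Claverie p(y | x) for library sizes n1 (first) and n2 (second)."""
    ratio = n2 / n1
    log_p = (y * math.log(ratio)
             + math.lgamma(x + y + 1) - math.lgamma(x + 1) - math.lgamma(y + 1)
             - (x + y + 1) * math.log(1.0 + ratio))
    return math.exp(log_p)

def ac_pvalue(x, y, n1, n2):
    """Smaller of the two cumulative tails around the observed y (assumed convention)."""
    lower = sum(ac_prob(x, k, n1, n2) for k in range(y + 1))   # P(Y <= y)
    upper = max(0.0, 1.0 - lower) + ac_prob(x, y, n1, n2)      # P(Y >= y)
    return min(1.0, lower, upper)

# Hypothetical counts: 5 tags in one library versus 50 in another of equal size.
p_changed = ac_pvalue(5, 50, 300_000, 300_000)
# Hypothetical counts: 10 tags in both libraries.
p_unchanged = ac_pvalue(10, 10, 300_000, 300_000)
```

Because the library sizes appear only through the ratio N2/N1, the same pair of counts gives a different probability when the two libraries were sequenced to different depths.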
Tag types consistently differentially expressed in pairwise comparisons were mapped to RefSeq (March 4th, 2008). Tags that mapped anti-sense to genes, or mapped ambiguously to more than one gene were not included in the functional analysis. GO, Kyoto Encyclopedia of Genes and Genomes (KEGG; v45.0) [37] pathway, and SwissProt (v13.0) [38] keyword annotation enrichment analyses were conducted using EASE (v1.21; March 11th, 2008) and FatiGO (v3; March 11th, 2008) [39] (Table 3). This functional analysis revealed that the expression of genes involved in signaling increased during progression, but the expression of genes involved in protein synthesis decreased during progression. Cell communication increased in the stage that was RAD but leveled off in the stage that was CR. Carbohydrate, lipid and amino acid synthesis was steady in the RAD stage but increased in the CR stage. Lastly, glycolysis decreased in the RAD stage, but was re-expressed in the CR stage (Table 3).
Figure 2 (caption fragment): 13N, 15N, and 13R) and the results from the iterations were combined into consensus clusters shown here. Plotted on the x-axes are the long serial analysis of gene expression (LongSAGE) libraries representing different stages of prostate progression: AS, androgen-sensitive; RAD, responsive to androgen-deprivation; and CR, castration-recurrent. Plotted on the y-axes are the relative expression levels of each tag type, represented as a percentage of the total tag count (for a particular tag type) in all three LongSAGE libraries. Different colors represent different tag types. Each of the ten clusters for each biological replicate are labeled as such. 'No equivalent' indicates that a similar expression trend was not observed in the indicated biological replicate. Eleven expression patterns are evident in total and are labeled on the left. K-means clusters were amalgamated into five major expression trends: group 1, up during progression; group 2, down during progression; group 3, peak in the RAD stage; group 4, constant during progression; and group 5, valley in RAD stage.
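The relative expression levels plotted for the K-means clusters (each tag type as a percentage of its total count across the AS, RAD, and CR libraries), together with the combined-count threshold of greater than ten, can be sketched as below. The function names and the example counts are hypothetical.

```python
def is_eligible(counts, min_combined=10):
    """True if the tag's combined count across the stages is greater than ten."""
    return sum(counts.values()) > min_combined

def relative_expression(counts):
    """Express each stage's count as a percentage of the tag's total count."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("tag has no counts in any library")
    return {stage: 100.0 * n / total for stage, n in counts.items()}

# Hypothetical tag counts for one biological replicate.
tag_counts = {"AS": 5, "RAD": 5, "CR": 10}
profile = relative_expression(tag_counts) if is_eligible(tag_counts) else None
```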
Tag types differentially expressed between the RAD and CR stages of prostate cancer were of particular interest (Table 4). This is because these tags potentially represent markers for CRPC and/or are involved in the mechanisms of progression to CRPC. These 193 tag types (Table 2) were mapped to databases RefSeq (July 9th, 2007), Mammalian Gene Collection (MGC; July 9th, 2007) [40], or Ensembl Transcript or genome (v45.36d) [41]. Only 135 of the 193 tag types were relevant (Table 4) with 48 tag types that mapped ambiguously to more than one location in the Homo sapiens transcriptome/genome, and another 10 tag types that mapped to Mus musculus transcriptome/genome. Mus musculus mappings may be an indication of minor contamination of the in vivo LNCaP Hollow Fiber model samples with host (mouse) RNA. These 135 tag types represented 114 candidate genes with 7 tag types that did not map to the genome, 5 tag types that mapped to unannotated genomic locations, and 9 genes that were associated with more than one tag type. Table 4 shows the LongSAGE tag sequences and tag counts per million tags in all nine libraries. Tags were sorted into groups based on expression trends. These trends are visually represented in Additional file 1, Figure S3. Mapping information was provided where available. We cross-referenced these 114 candidate genes with 28 papers that report global gene expression analyses on tissue samples from men with 'castration-recurrent', 'androgen independent,' 'hormone refractory,' 'androgen-ablation resistant,' 'relapsed,' or 'recurrent' prostate cancer, or animal models of castration-recurrence.
Table 4 column headings: Tag Sequence; S1885; S1886; S1887; S1888; S1889; S1890; S1891; S1892; S1893; Trend ‡; Gene **; Accession §§
† Tag count per 1 million = (observed tag count/total tags in the library) × 1,000,000.
‡ Trends are visually represented from A to P in Additional file 1, Figure S3. In addition to p-value considerations, significantly different trends were also required to display uniform directions of change in each biological replicate.
§ AS, Androgen-sensitive.
‖ RAD, Responsive to androgen-deprivation.
¶ CR, Castration-recurrent.
** Human Genome Nomenclature Committee (HGNC)-approved gene names were used when possible. Non-HGNC-approved gene names were not italicized.
†† Tag maps antisense to gene.
‡‡ Gene is known to display this expression trend in castration-recurrence.
§§ Accession numbers were displayed following the priority (where available): RefSeq > Mammalian Gene Collection > Ensembl Gene. If the tag mapped to more than one transcript variant of the same gene, the accession number of the lowest numerical transcript variant was displayed.
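The normalization given in the Table 4 footnote, tag count per million = (observed tag count / total tags in the library) × 1,000,000, can be written as a one-line function. The counts in the example are hypothetical and are not taken from Table 4.

```python
def tags_per_million(observed_count, library_total):
    """Normalize a raw tag count to counts per one million tags in the library."""
    if library_total <= 0:
        raise ValueError("library_total must be positive")
    return observed_count * 1_000_000 / library_total

# Hypothetical example: 25 copies of a tag in a library of 250,000 useful tags.
tpm = tags_per_million(25, 250_000)
```

This normalization lets counts from libraries of different sizes be compared on a common scale.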

Novel CR-associated genes identify both clinical samples of CRPC and clinical metastasis of prostate cancer
The expression of novel CR-associated genes was validated in publicly available, independent sample sets representing different stages of prostate cancer progression (Gene Expression Omnibus accession numbers: GDS1390 and GDS1439). Dataset GDS1390 includes expression data of ten AS prostate tissues, and ten CRPC tissues from Affymetrix U133A arrays [47]. Dataset GDS1439 includes expression data of six benign prostate tissues, seven localized prostate cancer tissues, and seven metastatic prostate cancer tissues from Affymetrix U133 2.0 arrays [97].
Unsupervised principal component analysis based on the largest three principal components revealed separate clustering of tumor samples representing AS and CR stages of cancer progression, with the exception of two CR samples and one AS sample (Figure 4a).
Metastatic prostate cancer is expected to have a more progressive phenotype and is associated with hormonal progression. Therefore, the gene expression signature obtained from the study of hormonal progression may be common to that observed in clinical metastases. Unsupervised principal component analysis based on the largest three principal components revealed separate clustering of not only benign and malignant, but also localized and metastatic tissue samples (Figure 4b).

Discussion
Genes that change levels of expression during hormonal progression may be indicative of the mechanisms involved in CRPC. Here we provide the most comprehensive gene expression analysis to date of prostate cancer, with approximately 3 million long tags sequenced using in vivo samples of biological replicates at various stages of hormonal progression, an improvement over previous libraries of approximately 70,000 short tags or fewer. Previous large-scale gene expression analyses have been performed with tissue samples from men with advanced prostate cancer [42][43][44][45][46][47][48][49][50][51][52][53][54][55][56][57][58], and animal or xenograft models of CRPC [59][60][61][62][63][64][65][66][67][68][69]. Most of these previous studies compared differential expression between CRPC samples and the primary samples obtained before androgen ablation. This experimental design cannot distinguish changes in gene expression that are a direct response to androgen ablation from changes in proliferation/survival that have been obtained as the prostate cancer cells progress to a more advanced phenotype. Here we are the first to apply an in vivo model of hormonal progression to compare gene expression between serial samples of prostate cancer before (AS), and after androgen ablation therapy (RAD) as well as when the cells become CR. This model is the LNCaP Hollow Fiber model [21] which has genomic similarity with clinical prostate cancer [23] and mimics the hormonal progression observed clinically in response to host castration as measured by levels
* Human Genome Nomenclature Committee (HGNC)-approved gene names were used when possible. Non-HGNC-approved gene names were not italicized.
† S or PM, gene product is thought to be secreted (S) or localize to the plasma membrane (PM).
‡ Reg. by A, gene expression changes in response to androgen in prostate cells.
§ Spec. to P, gene expression is specific to, or enriched in, prostate tissue compared to other tissues.
‖ CaP, gene is differentially expressed in prostate cancer compared to normal, benign prostatic hyperplasia, or prostatic intraepithelial neoplasia.
¶ GG, gene is differentially expressed in higher Gleason grade tissue versus lower Gleason grade tissue.
** Prog., gene expression correlates with late-stage prostate cancer or is a risk factor that predicts progression.
†† Mets, gene expression is associated with prostate cancer metastasis in human samples or in vivo models.
‡‡ CR, gene is associated with castration-recurrent prostate cancer in human tissue or in vivo models, but exhibits an opposite trend of this report.
§§ Y, yes; ↑, high gene expression; ↓, low gene expression.
samples. Principal component analysis based on the expression of these genes also revealed separate clustering of the different stages of tumor samples and also showed separate clustering of the benign samples from the prostate cancer samples. Therefore, some common changes in gene expression profile may lead to the survival and proliferation of prostate cancer and contribute to both distant metastasis and hormonal progression. We used this LNCaP atlas to identify changes in gene expression that may provide clues of underlying mechanisms resulting in CRPC. Suggested models of CRPC involve: the AR; steroid synthesis and metabolism; neuroendocrine prostate cancer cells; and/or an imbalance of cell growth and cell death.

Androgen receptor (AR)
Transcriptional activity of AR
The AR is suspected to continue to play an important role in the hormonal progression of prostate cancer. The AR is a ligand-activated transcription factor with its activity altered by changes in its level of expression or by interactions with other proteins. Here, we identified changes in expression of some known or suspected modifiers of transcriptional activity of the AR in CRPC versus RAD such as Cyclin H (CCNH) [107], proteasome macropain subunit alpha type 7 (PSMA7) [108], CUE-domain-containing-2 (CUEDC2) [109], filamin A (FLNA) [110], and high mobility group box 2 (HMGB2) [111]. CCNH and PSMA7 displayed increased levels of expression, while CUEDC2, FLNA, and HMGB2 displayed decreased levels of expression in CR. The expression trends of CCNH, CUEDC2, FLNA, and PSMA7 in CRPC may result in increased AR signaling through mechanisms involving protein-protein interactions or altering levels of expression of AR. CCNH protein is a component of the cyclin-dependent activating kinase (CAK). CAK interacts with the AR and increases its transcriptional activity [107]. Over-expression of the proteasome subunit PSMA7 promotes AR transactivation of a PSA-luciferase reporter [108]. A fragment of the protein product of FLNA negatively regulates transcription by AR through a physical interaction with the hinge region [110]. CUEDC2 protein promotes the degradation of progesterone and estrogen receptors [109]. These steroid receptors are highly related to the AR, indicating a possible role for CUEDC2 in AR degradation. Thus, decreased expression of FLNA or CUEDC2 could result in increased activity of the AR. Decreased expression of HMGB2 in CRPC is predicted to decrease expression of at least a subset of androgen-regulated genes that contain palindromic AREs [111]. Here, genes known to be regulated by androgen were enriched in expression trend categories with a peak or valley at the RAD stage of prostate cancer progression. Specifically, 8 of the 13 tags (62%) exhibiting expression trends 'E', 'F', 'J', 'K', or 'L' represented known androgen-regulated genes, in contrast to only 22 of the remaining 122 tags (18%; Tables 4 and 5). Overall, these data support increased AR activity in CRPC, which is consistent with re-expression of androgen-regulated genes as previously reported [68] and similarity of expression of androgen-regulated genes between CRPC and prostate cancer before androgen ablation [23].
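The proportions quoted above (8 of 13 versus 22 of 122) can be checked with a short calculation. The hypergeometric upper-tail probability included below is an illustrative addition under the assumption that tags are exchangeable draws from the 135 tag types; it is not an analysis reported in this study.

```python
from math import comb

def hypergeom_upper_tail(k, n, successes, population):
    """P(X >= k) when drawing n items without replacement from `population`
    items, of which `successes` carry the property of interest."""
    upper = min(n, successes)
    favorable = sum(comb(successes, i) * comb(population - successes, n - i)
                    for i in range(k, upper + 1))
    return favorable / comb(population, n)

# Counts taken from the text: 13 peak/valley tags, 8 of them androgen-regulated;
# 122 remaining tags, 22 of them androgen-regulated.
peak_valley_pct = 100 * 8 / 13          # about 62%
remaining_pct = 100 * 22 / 122          # about 18%

# Illustrative enrichment check: 30 androgen-regulated tags among 135 in total.
p_enrichment = hypergeom_upper_tail(k=8, n=13, successes=8 + 22, population=13 + 122)
```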

Steroid synthesis and metabolism
In addition to changes in expression of AR or interacting proteins altering the transcriptional activity of the AR, recent suggestions of sufficient levels of residual androgen in CRPC provide support for an active ligand-bound receptor [112]. The AR may become reactivated in CRPC due to the presence of androgen that may be synthesized by the prostate de novo [4] or through the conversion of adrenal androgens. Here, 5 genes known to function in steroid synthesis or metabolism were significantly differentially expressed in CRPC versus RAD. They are 24-dehydrocholesterol reductase (DHCR24) [113], dehydrogenase/reductase SDR-family member 7 (DHRS7) [114], elongation of long chain fatty acids family member 5 (ELOVL5) [115,116], hydroxysteroid (17-beta) dehydrogenase 4 (HSD17B4) [117], and opioid receptor kappa 1 (OPRK1) [118]. Increased levels of expression of these genes may be indicative of the influence of adrenal androgens, or the local synthesis of androgen, to reactivate the AR to promote the progression of prostate cancer in the absence of testicular androgens.

Proliferation and Cell survival
The gene expression trends of GAS5 [125], GNB2L1 [126], MT-ND3, NKX3-1 [127], PCGEM1 [128], PTGFR [129], STEAP1 [130], and TMEM30A [131] were in agreement with the presence of proliferating cells in CRPC. Of particular interest is that we observed a transcript anti-sense to NKX3-1, a tumor suppressor, highly expressed in the stages of cancer progression that were AS and CR, but not RAD. Anti-sense transcription may hinder gene expression from the opposing strand and, therefore, represents a novel mechanism by which NKX3-1 expression may be silenced. There were also some inconsistencies, including the expression trends of BTG1 [132], FGFRL1 [133], and PCOTH [134], which may be associated with non-cycling cells. Overall, there was more support at the transcriptome level for proliferation than not, which was consistent with increased proliferation observed in the LNCaP Hollow Fiber model [21]. Gene expression trends of GLO1 [135], S100A10 [136], TRPM8 [137], and PI3KCD [138] suggest cell survival pathways are active following androgen-deprivation and/or in CRPC, while gene expression trends of CAMK2N1 [139], CCT2 [140], MDK [141,142], TMEM66 [143], and YWHAQ [136] may oppose this suggestion. Taken together, these data neither agree nor disagree with the activation of survival pathways in CRPC. In contrast to earlier reports in which MDK gene and protein expression was determined to be higher in late stage cancer [63,142], we observed a drop in the levels of MDK mRNA in CRPC versus RAD. MDK expression is negatively regulated by androgen [65]. Therefore, the decreased levels of MDK mRNA in CRPC may suggest that the AR is reactivated in CRPC.

Other
The significance of the gene expression trends of AMD1, BNIP3, GRB10, MARCKSL1, NGFRAP1, ODC1, PPP2CB, PPP2R1A, SLC25A4, SLC25A6, and WDR45L, which function in cell growth or cell death/survival, was not straightforward. For example, BNIP3 and WDR45L, both relatively highly expressed in CRPC versus RAD, may be associated with autophagy. BNIP3 promotes autophagy in response to hypoxia [144], and the WDR45L-related protein, WIPI-49, co-localizes with the autophagic marker LC3 following amino acid depletion in autophagosomes [145]. It is not known if BNIP3 or putative WDR45L-associated autophagy results in cell survival or death. Levels of expression of NGFRAP1 were increased in CRPC versus RAD. The protein product of NGFRAP1 interacts with p75(NTR). Together they process caspase 2 and caspase 3 to active forms, and promote apoptosis in 293T cells [146]. NGFRAP1 requires p75(NTR) to induce apoptosis. However, LNCaP cells do not express p75(NTR), and so it is not clear if apoptosis would occur in this cell line [147].
Overall, genes involved in cell growth and cell death pathways were altered in CRPC. Increased tumor burden may develop from a small shift in the balance when cell growth outweighs cell death. Unfortunately, the contributing weight of each gene is not known, which makes it difficult to predict from gene expression alone whether proliferation and survival were represented more than cell death in this model of CRPC. It should be noted that LNCaP cells are androgen-sensitive and do not undergo apoptosis in the absence of androgens. Proliferation of these cells tends to decrease in androgen-deprived conditions, but with progression the cells eventually begin to grow again, mimicking clinical CRPC.

Conclusion
Here, we describe the LNCaP atlas, a compilation of LongSAGE libraries that catalogue the transcriptome of human prostate cancer cells as they progress to CRPC in vivo. Using the LNCaP atlas, we identified differential expression of 96 genes that were associated with castration-recurrence in vivo. These changes in gene expression were consistent with the suggested model for a role of the AR, steroid synthesis and metabolism, neuroendocrine cells, and increased proliferation in CRPC.

Additional material
Additional file 1: Supplementary Figures.
Figure S1: qRT-PCR analysis of KLK3 gene expression during hormonal progression of prostate cancer to castration-recurrence. RNA samples were retrieved from the in vivo LNCaP Hollow Fiber model at different stages of cancer progression that were: AS, androgen-sensitive, day zero (just prior to surgical castration and 7 days post-fiber implantation); RAD, responsive to androgen-deprivation, 10 days post-surgical castration; and CR, castration-recurrent, 72 days post-surgical castration. MNE, mean normalized expression, calculated by normalization to glyceraldehyde-3-phosphate dehydrogenase (GAPDH). Error bars represent ± standard deviation of technical triplicates. Each mouse represents one biological replicate.
Figure S2: Ten K-means clusters are optimal to describe the expression trends present during progression to castration-recurrence. K-means clustering was conducted over a range of K (number of clusters) from K = 2 to K = 20 and the within-cluster dispersion was computed for each clustering run and plotted against K. The within-cluster dispersion declined with the addition of clusters and this decline was most pronounced at K = 10. The graph of within-cluster dispersion versus K shown here is for mouse 13N, but the results were similar for mice 15N and 13R.
Figure S3: Trend legend for Table 4. Gene expression trends of LongSAGE tags that consistently and significantly altered expression in CR prostate cancer are represented graphically with trends labeled A-P. * Statistics according to the Audic and Claverie test statistic (p ≤ 0.05).