Next-generation technologies such as targeted re-sequencing platforms are powerful tools for identifying genetic variations in cancer samples. Using prostate cancer as an example, we have assessed the use of different kinds and amounts of tissue samples for identifying genetic variations. In particular, we have investigated three aspects that are frequently raised by oncologists and pathologists:
The first is whether it is possible to use FFPE material in addition to snap frozen material. The use of FFPE material would open up a large collection of tissue samples for molecular studies, since most of the material stored at pathology departments around the world is archived in this way. However, the preparation of FFPE tissue, with formaldehyde fixation and long-term storage at room temperature, may generate DNA mutations and result in the identification of false SNVs or InDels. We previously showed that it is possible to use FFPE material for copy number analysis of whole genome data, although a higher sequencing capacity is required to achieve a comparable coverage. We have now extended our studies to targeted enrichment methods and found a uniform enrichment irrespective of the kind of tissue material used. Among the SNVs detected, 0.98% were false positives in FFPE preparations at a coverage level of 20×, a rate that is strongly reduced at higher coverage (> 80×). Potential false positive SNVs can be explained by processes likely to occur during formalin fixation, such as deamination and depurination. Our data suggest that the damage caused by FFPE preparation is randomly distributed across all DNA fragments and can be compensated for by sequencing depth. Since coverage levels of 80× and higher can easily be reached by targeted re-sequencing approaches, we recommend using such high coverage when analyzing FFPE material. The same holds true for false negative SNVs. Although SNV detection is the main focus of DNA sequence analysis in cancer, the detection of small insertions and deletions is becoming increasingly important. We therefore investigated whether preparation of DNA from FFPE tissue has an adverse effect on InDel detection.
While the relative amount of discordant InDel positions is about 7 times higher than the amount of discordant SNV positions, we observed the same low discrepancy rates at higher coverage levels. Again, no discordance was found at a coverage level of 80×. Taken together, snap frozen tissue remains the preferred source of DNA, but FFPE tissue can be used for SNV and InDel detection instead if the coverage is increased. Furthermore, FFPE tissue can be used for certain clinically relevant questions, such as the detection of germline variants when no adequate matching benign tissue is available for a snap frozen tumor sample. In this case, the false positive rate obtained with FFPE material plays a minor role.
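The dependence of discordance on coverage described above can be illustrated with a minimal sketch. It is not the pipeline used in this study: the call representation, the function name `discordance_rate` and the toy data are assumptions made for illustration, and only positions with a call in both samples are compared.

```python
from typing import Dict, Tuple

# Hypothetical representation of variant calls:
# (chromosome, position) -> (called genotype, coverage at that position)
Calls = Dict[Tuple[str, int], Tuple[str, int]]


def discordance_rate(ffpe: Calls, frozen: Calls, min_cov: int) -> float:
    """Fraction of shared positions, covered at >= min_cov in both samples,
    at which the FFPE and snap frozen calls differ."""
    shared = [
        pos for pos in ffpe.keys() & frozen.keys()
        if ffpe[pos][1] >= min_cov and frozen[pos][1] >= min_cov
    ]
    if not shared:
        # No position passes the coverage filter; report 0.0 by convention.
        return 0.0
    discordant = sum(1 for pos in shared if ffpe[pos][0] != frozen[pos][0])
    return discordant / len(shared)


# Toy data, invented for illustration only (not results from this study).
ffpe_calls: Calls = {
    ("chr1", 100): ("A/G", 25),
    ("chr1", 200): ("C/T", 90),
    ("chr2", 50): ("G/G", 120),
    ("chr2", 75): ("T/T", 22),
}
frozen_calls: Calls = {
    ("chr1", 100): ("A/A", 30),
    ("chr1", 200): ("C/T", 95),
    ("chr2", 50): ("G/G", 110),
    ("chr2", 75): ("T/T", 40),
}

for threshold in (20, 80):
    rate = discordance_rate(ffpe_calls, frozen_calls, threshold)
    print(f"min coverage {threshold}x: discordance {rate:.2f}")
```

In this toy example the single discordant call sits at a low-coverage position, so it drops out once the threshold is raised to 80×. This mirrors the qualitative behaviour reported above but carries no quantitative meaning.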
The second methodological issue relates to the amount of material required. Decreasing the input amount of DNA to 500 ng still yielded good enrichment results, an even coverage and highly reproducible calling of known genetic variants. However, we found more redundant reads (reads with identical first positions) and a slightly higher variance of variant/reference ratios with decreased amounts of starting material. This suggests that, with these enrichment technologies, the minimal amount of input DNA cannot easily be reduced below 500 ng. Notably, the comparison between average and high amounts of DNA (1.5 μg vs 3 μg) performed better than comparisons including the lowest amount of DNA (500 ng).
While the detected InDels show a variant/reference ratio distribution that clearly deviates from the expected bimodal distribution, with visible differences among the three DNA amounts, InDels are still highly reproducible above a coverage level of 45× for all amounts of DNA. We conclude that a decrease to 500 ng of input DNA is possible, but the benefit has to be weighed against the high coverage demands and potential challenges to SNV and InDel categorization.
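The notion of redundant reads used above (reads with identical first positions) can be made concrete with a short sketch. The read representation and the function name `redundant_fraction` are hypothetical, and treating strand as part of the read's identity is an assumption of this sketch rather than something stated in the text.

```python
from collections import Counter
from typing import Iterable, Tuple

# Hypothetical read representation: (chromosome, first aligned position, strand)
Read = Tuple[str, int, str]


def redundant_fraction(reads: Iterable[Read]) -> float:
    """Fraction of reads that duplicate the start position of another read.

    For every distinct (chromosome, start, strand) one read counts as unique;
    every further read at that start counts as redundant.
    """
    counts = Counter(reads)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    redundant = total - len(counts)
    return redundant / total


# Toy reads, invented for illustration only.
reads = [
    ("chr1", 1000, "+"),
    ("chr1", 1000, "+"),
    ("chr1", 1000, "+"),
    ("chr1", 1000, "-"),  # same position, opposite strand: counted separately here
    ("chr1", 1042, "+"),
    ("chr2", 500, "-"),
]

print(f"redundant fraction: {redundant_fraction(reads):.3f}")
```

A higher value of this fraction at lower input amounts would be in line with the observation above, since fewer distinct DNA fragments are available to be sequenced.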
The third challenge addressed in our study is the heterogeneity of tumor tissue. The number and location of biopsies necessary to obtain results representative of the whole tumor are unknown. So far, it has not been settled whether primary prostate cancers have a multifocal origin and are thus composed of multiple genetically distinct cancer cell clones. Currently, an independent clonal origin of multiple foci is being considered, since healthy men below 40 years of age frequently show focal histological aberrations [34–36], many of which give rise only to latent prostate cancer, while clonal evolution of a few foci paves the way to clinically detectable disease [33, 37–40]. On the other hand, prostate cancer metastases from different locations in the same patient show a surprisingly similar pattern of copy number alterations [41–43]. Experiments available to address this question include the determination of DNA ploidy, microsatellite analysis, detection of c-myc amplifications by FISH, and assessment of DNA methylation or TMPRSS2-ERG fusion status in separate tumors within the same prostate. In our hands, using samples derived from different foci within one prostate tumor and re-sequencing prostate cancer relevant genes, we found almost identical distributions of mutations within different foci of the same patient. Notably, SNV profile concordance was 100% for all three patients at coverage levels above 20×. Even tumor parts with different TMPRSS2-ERG gene fusion status are virtually identical with regard to single nucleotide variations. In addition, focusing on somatic mutations, we find no differences between different tumor foci. However, although we focused on prostate cancer candidate genes, the low number of somatic mutations in prostate cancer and the fact that we only analyzed ~10% of the exome preclude a general conclusion.
Recent studies, such as Taylor et al. with 0.31, Kan et al. with 0.33, and Berger et al. with 0.9 non-synonymous mutations per Mb, suggest low somatic mutation rates for prostate cancer [8, 9, 45]. In line with this somatic mutation frequency, we found only one somatic mutation for each of the three patients. The limited sensitivity of current re-sequencing approaches might further explain the missing focal diversity. Irrespective of the low frequency of somatic mutations detected in the tumor samples, we found large aberrations in copy number. We used a whole genome re-sequencing approach to detect somatic copy number variations for each focus and compared the two foci from the same tumor. Interestingly, for one patient with clear differences in the TMPRSS2-ERG fusion pattern, we also find significant copy number differences between the two foci, whereas for the two other patients no significant CNVs can be detected. Along this line, Navin et al. used a modified comparative genomic hybridization (CGH) technology to study the clonal composition of breast tumors and found a large proportion of monogenomic tumors and only a small fraction of tumors with a heterogenomic foci structure. Our results imply that the location of biopsies taken within tumors is of minor relevance for the detection of mutations, but plays a major role for the detection of copy number variations. In this direction, recent publications also suggest that genomic rearrangements are a major genetic factor underlying prostate cancer. Since we did not perform 3D reconstructions of the whole tumors, our approach cannot be used to answer the question of a multifocal origin of heterogeneous prostate tumors. Even our estimate of tumor heterogeneity is most likely an underestimate, because we investigated tissue samples composed of a complex mixture of individual cells.
Thus, the genetic profiles are sums over all cells contained within the section and might mask the true tumor heterogeneity. We are currently extending our analysis to the single-cell level to gain further insight into the evolutionary architecture of prostate tumors. With this we might be able to pin down the true tumor composition and might even identify tumor stem cells at the genetic level. However, since we find differences between different biopsies from the same tumor at the copy number level, we conclude that several biopsies need to be investigated to gain insight into the genomic context of prostate cancers, given the overall tumor heterogeneity.
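As a rough plausibility check on the somatic mutation counts discussed above, the published rates per Mb can be converted into an expected number of mutations for a target region of a given size. The rates are those quoted in the text. The target size of 3 Mb is purely an assumption for illustration (roughly 10% of an exome assumed to span about 30 Mb) and is not a figure reported in this study.

```python
def expected_mutations(rate_per_mb: float, target_mb: float) -> float:
    """Expected number of mutations in a target of target_mb megabases."""
    return rate_per_mb * target_mb


# ASSUMPTION: hypothetical target size, not taken from the study.
ASSUMED_TARGET_MB = 3.0

# Non-synonymous mutation rates per Mb as quoted in the text.
rates_per_mb = {
    "Taylor et al.": 0.31,
    "Kan et al.": 0.33,
    "Berger et al.": 0.9,
}

for study, rate in rates_per_mb.items():
    expected = expected_mutations(rate, ASSUMED_TARGET_MB)
    print(f"{study}: about {expected:.2f} expected mutations in {ASSUMED_TARGET_MB} Mb")
```

Under this assumed target size the expectation is on the order of one to three mutations per tumor, which is compatible with the single somatic mutation found per patient. The estimate scales linearly with the assumed target size.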
Furthermore, with the technologies described we are now in the process of extending our analyses to large sample cohorts from pathology departments, where we can select tissue specimens from specific clinical studies. This enables us to address clinically relevant questions such as progression and therapy resistance of tumors, which is an important step towards the application of targeted re-sequencing approaches as routine diagnostic tools in oncology.