Impact of polymorphisms in microRNA biogenesis genes on colon cancer risk and microRNA expression levels: a population-based, case-control study

Background MicroRNAs (miRNAs) have been implicated in the incidence and progression of cancer. It has been proposed that single nucleotide polymorphisms (SNPs) influence cancer risk due to their position within genes involved in miRNA synthesis and regulation.
Methods Genes directly and indirectly involved in miRNA biogenesis were identified from the literature. We then identified SNPs within these regions. Using genome-wide association study data, we evaluated associations of biogenesis-related SNPs with colon cancer risk and with expression of their corresponding mRNA in normal colonic mucosa, in carcinoma, and as the difference in expression between the two tissues. SNPs that were associated with either altered colon cancer risk or with mRNA expression were evaluated for associations with altered miRNA expression.
Results Eleven SNPs were associated (P < 0.05) with colon cancer risk, and two of these variants remained significant after correction for multiple comparisons (PHolm < 0.05): rs1967327 (PRKRA) (ORdom = 0.78, 95 % CI 0.66–0.92) and rs4548444 (MAPKAPK2) (ORrec = 1.67, 95 % CI 1.12–2.48). Of these two SNPs, rs4548444 (MAPKAPK2) was associated with significantly altered miRNA expression levels in normal colonic mucosa, with nine miRNAs upregulated among individuals homozygous rare (GG) for rs4548444. One SNP associated with cancer prior to adjustment for multiple comparisons, rs11089328 (DGCR8), was associated with altered levels of hsa-miR-645 in differential tissue under the dominant model. Three SNPs, rs2740349 (GEMIN4) in carcinoma tissue, and rs235768 (BMP2) and rs2059691 (PRKRA) in normal mucosa, were significantly associated with altered mRNA expression levels across genotypes after multiple comparison adjustment. Rs2740349 (GEMIN4) and rs235768 (BMP2) were significantly associated with the upregulation of six and nine individual miRNAs in normal colonic mucosa, respectively.
Conclusion Our data suggest that few of the SNPs in biogenesis genes we evaluated alter levels of mRNA transcription or colon cancer risk. As only one SNP both alters colon cancer risk and miRNA expression it is likely that SNPs influencing cancer do not do so through miRNAs. Because the significant SNPs were associated with downregulated mRNAs and upregulated miRNAs, and because each SNP was associated with unique miRNAs, it is possible that other mechanisms influence mature miRNA levels.


Background
Mature microRNAs (miRNAs) are non-coding RNA molecules, ~22 nucleotides (nt) in length [1][2][3][4][5], which act as endogenous, post-transcriptional regulators of messenger RNAs (mRNAs). By binding to complementary mRNA molecules, they are able to impede mRNA translation or cause mRNA degradation [6], depending on the degree of complementarity shared between the miRNA and mRNA [7]. As such, miRNAs alter the translated protein product levels of these genes within the specific biological tissues and diseases [8] in which they are expressed. It has been suggested that single nucleotide polymorphisms (SNPs) within miRNA gene regions that are associated with cancer risk may function by altering miRNA expression [9]. Similarly, SNPs within any of the genes regulating miRNA biogenesis would have the potential to alter the expression of mature miRNAs and subsequently cancer risk [10].
There are many transcriptional and post-transcriptional [11] modification steps necessary for producing mature miRNAs. Genes directly involved in splicing, exporting and RNA editing events [12], as well as accessory proteins [13], have the potential to impact miRNA expression levels. MiRNA biogenesis can be broken down into four main categories of events: transcription, nuclear processing, cytoplasmic processing and RNA-induced silencing complex (RISC) formation and loading [11,12]. We provide a brief overview of miRNA biogenesis.

Transcription
Primary-miRNAs (pri-miRNAs) are transcribed by RNA polymerase II [11,14], and produce transcripts of up to 1 kilobase (kb) in length [11]. They consist of potentially more than one co-transcribed precursor miRNA sequence (pre-miRNA), as illustrated by miRNAs organized in a polycistronic cluster [15]. Transcription factors, such as TP53 and MYC, and epigenetic regulation, such as histone modifications, can alter miRNA transcription [11]. Once transcribed, pri-miRNAs fold into hairpin structures, consisting of a stem sequence (~35 nt), a terminal loop and single-stranded RNA segments at both the 5′ and 3′ ends [11,12]. These hairpin structures are recognized as substrates for the ribonuclease (RNase) III enzymes Drosha and Dicer [13].

Nuclear processing
Pre-miRNAs are generated by the cleavage of the pri-miRNA in the nucleus of the cell by the Microprocessor, which is composed of Drosha and DiGeorge Syndrome Critical Region 8 (DGCR8) [12,16]. The stem and tail regions of the pri-miRNA are important for the first cleavage: DGCR8 binds to the pri-miRNA using these regions and serves to align Drosha with the pri-miRNA to cleave at the appropriate site, which is approximately 11 base pairs (bps), or 1 helical turn [17], away from the junction of the single and double strands within the hairpin [12]. GSK3β has been shown to phosphorylate Drosha in the cytoplasm before it enters the nucleus, which enables it to be localized to the nucleus and contribute to miRNA biogenesis [18]. Drosha-mediated cleavage results in a hairpin of ~65-70 bps [11,13] and leaves behind a ~2 nt overhang on the 3′ end, which is characteristic of RNase III enzymes [19] and makes it possible for the pre-miRNA to be recognized by, and subsequently transported out of the nucleus and into the cytoplasm by, the Exportin-5 (XPO5) [17,19]-Ran-GTP complex [7,12]. Adenosine-Deaminase, RNA-Specific (ADAR) genes act on pre-mRNA transcripts, editing adenosine to inosine, which acts similarly to guanosine during translation, potentially altering protein function [20]. Because ADARs work on nuclear, double-stranded RNA structures, it is likely that they edit pri-miRNA transcripts prior to their export from the nucleus. Also acting in the nucleus, transforming growth factor-β (TGF-β) and bone morphogenetic protein factors (BMPs) have been shown to increase Microprocessor activity through the recruitment of SMAD proteins and DDX5 to the miRNA transcript, enhancing processing by Drosha [12] and potentially increasing miRNA expression.

Cytoplasmic processing
Once in the cytoplasm, the pre-miRNA is cleaved by Dicer, which forms a complex with TRBP (or TARBP2, trans-activating response RNA binding protein) to produce the two strands of the miRNA duplex [12]. Dicer removes the loop of the pre-miRNA hairpin, cleaving about 2 helical turns from the base of the hairpin [17], leaving behind ~22 bps of the miRNA/*miRNA duplex (where * is pronounced "star" [17]) [12]. Serine phosphorylation of TRBP by mitogen-activated protein kinase (MAPK)/extracellular regulated kinase (ERK) has been shown to stabilize TRBP [13]. Dicer activity has been shown to be impaired by decreased levels of TRBP [13]; as such, SNPs within TRBP or MAPK/ERK could potentially lead to altered processing of pre-miRNAs.

MiRISC formation and loading
Of the two strands of the mature miRNA duplex generated by Dicer cleavage, the guide strand is typically the strand with the more unstable 5′ end [12,21]. This strand is selected by the Argonaute (AGO1-4) protein and loaded into the RISC with the help of the RISC-loading complex, which has been proposed to be composed of Dicer and TRBP [11,13]. The other strand, the star strand, is usually degraded [17]. Loading of the guide strand into the complex stabilizes the miRNA [22], protecting it from degrading nucleases within the cytoplasm, and enables it to bind to its target mRNA to induce silencing of the transcript. Helicases, such as the DEAD-Box proteins (GEMIN3 and p68), GEMIN4 and MOV10, are important for miRNA duplex unwinding and miRISC formation and activity [12].
There are additional accessory genes associated with miRNA biogenesis. Trinucleotide repeat-containing proteins (TNRC6A or GW182, TNRC6B, TNRC6C) and fragile X mental retardation protein (FMRP, encoded by FMR1) are correlated with the presence of cytoplasmic P-bodies, which are involved in mRNA degradation by miRNAs, and can impact the effect of miRNAs on their targets [13]. MiRNA stability, and therefore expression, has been linked to mRNA target presence and ability to be bound to the miRISC [22,23]; as such, SNPs within genes that regulate mRNA degradation may impact mature miRNA expression levels. LIN28B regulates Microprocessor activity and LIN28 regulates dicing of pre-let-7 through the regulatory sequence "GGAG" in the pre-miRNA's terminal loop [12,13,24,25]; LIN28 binds to this region and recruits terminal uridine transferase (TUT4 or ZCCHC11), which subsequently uridylates the pre-miRNA and inhibits processing by Dicer [13,24,25].
In this study, we investigate the risk of developing colon cancer associated with SNPs found within miRNA-biogenesis genes. We also analyze each SNP with mRNA expression of its corresponding gene in normal colonic mucosa, carcinoma tissue and difference in expression between normal colonic mucosa and carcinoma to determine if these SNPs alter mRNA transcription. Based on these results, we analyze those SNPs associated with either altered colorectal cancer (CRC) risk or with altered mRNA expression of its corresponding gene with miRNA expression in normal colonic mucosa and differential miRNA expression between carcinoma tissue and normal colonic mucosa across genotypes. As genes involved in the biogenesis of miRNAs are involved in production of virtually all miRNAs, we hypothesize that SNPs in these genes will alter mRNA expression by causing aberrant transcription as well as alter colon cancer risk through altered levels of miRNA expression.

Study population
The study population consisted of individuals previously enrolled in a study of Diet, Lifestyle and Colon cancer at the University of Utah and the Kaiser Permanente Medical Research Program (KPMRP) [26] for whom Genome Wide Association Study (GWAS) and miRNA expression data were available. Study subjects included incident cases of colon cancer between the ages of 30 and 79 who were non-Hispanic white, Hispanic or African American and were able to provide a signed informed consent prior to participation in the study. All stages of tumor were included in the study population and in this analysis. The study was approved by the University of Utah Institutional Review Board for Human Subjects.
miRNA processing
RNA (miRNA) was extracted from formalin-fixed paraffin-embedded tissues and processed as previously described [27]. One hundred nanograms (ng) of total RNA was labeled with Cy3 and hybridized to the Agilent Human miRNA Microarray V19.0, and arrays were scanned on an Agilent SureScan microarray scanner (model G2600D) using Agilent Feature Extraction software v.11.5.1.1. Data were required to pass stringent quality control (QC) parameters established by Agilent that included tests for excessive background fluorescence, excessive variation among probe sequence replicates on the array, and measures of the total gene signal on the array to assess low signal. If a sample failed to meet QC standards, it was repeated. If a sample failed QC assessment a second time, it was deemed to be of poor quality and was excluded from downstream analysis. The Agilent platform was found to be highly reliable (r = 0.98) and had reasonable agreement with NanoString [28] and excellent agreement with quantitative reverse transcription polymerase chain reaction (qRT-PCR) [29]. If data were missing from normal colonic mucosa but the tumor tissue was successfully scanned (N = 60), we imputed values for normal mucosa as previously described [30]; this method of imputation has yielded results with high accuracy. To minimize differences that could be attributed to the array, amount of RNA, location on array or other factors that could erroneously influence expression, total gene signal was normalized by multiplying each sample by a scaling factor, defined as the median of the 75th percentiles of all the samples divided by the 75th percentile of that individual sample [31]. This scaling factor was implemented using SAS 9.4.
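The scaling step described above can be illustrated with a minimal Python sketch. It is not the SAS 9.4 implementation used in the study: the function names, the linear-interpolation percentile definition and the three toy arrays are assumptions made for illustration only.

```python
from statistics import median


def percentile(values, q):
    """Return the q-th percentile (0-100) using linear interpolation.

    Assumption: SAS may use a different percentile definition by default.
    """
    ordered = sorted(values)
    if not ordered:
        raise ValueError("percentile of empty data")
    pos = (len(ordered) - 1) * q / 100.0
    lower = int(pos)
    upper = min(lower + 1, len(ordered) - 1)
    frac = pos - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * frac


def scale_samples(samples, q=75):
    """Multiply each sample by (median of all q-th percentiles) / (its own q-th percentile)."""
    per_sample = {name: percentile(vals, q) for name, vals in samples.items()}
    target = median(per_sample.values())
    scaled = {}
    for name, vals in samples.items():
        factor = target / per_sample[name]  # assumes a non-zero percentile
        scaled[name] = [v * factor for v in vals]
    return scaled


# Hypothetical signal values for three arrays (not study data).
samples = {
    "s1": [1.0, 2.0, 3.0, 4.0, 5.0],
    "s2": [2.0, 4.0, 6.0, 8.0, 10.0],
    "s3": [0.5, 1.0, 1.5, 2.0, 2.5],
}
scaled = scale_samples(samples)
```

After scaling, every array shares the same 75th percentile, so differences in overall signal intensity between arrays no longer dominate comparisons of individual miRNAs.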

RNA-Seq sequencing library preparation
Total RNA was available from 197 carcinoma and normal mucosa pairs. These samples were taken from the study subjects used for miRNA analysis and were extracted, isolated and purified in the same manner as previously described [27]. RNA library construction was done with the Illumina TruSeq Stranded Total RNA Sample Preparation Kit with Ribo-Zero. The samples were then fragmented and primed for complementary DNA (cDNA) synthesis, adapters were ligated onto the cDNA, and the resulting samples were amplified using PCR; the amplified library was then purified using Agencourt AMPure XP beads. A more detailed description of the methods can be found in our previous work [32]. Of the 197 sample pairs, 175 passed QC based on an acceptable number of sequence reads for both carcinoma tissue and normal mucosa. Of these, 71 subjects also had GWAS data available for comparison with mRNA and miRNA expression data in carcinoma tissue, 67 for normal colonic mucosa, and 61 for the difference between normal colonic mucosa and carcinoma tissue.

RNA sequencing and data processing
Sequencing was done using an Illumina TruSeq v3 single read flow cell and a 50 cycle single-read sequence run was performed on an Illumina HiSeq instrument. Reads were then aligned to a sequence database containing the human genome (build GRCh37/hg19, February 2009 from genome.ucsc.edu) and alignment was performed using novoalign v2.08.01. Python and a pysam library were used to calculate counts for each exon and UTR of the genes using a list of gene coordinates obtained from http://genome.ucsc.edu. We dropped features that were not expressed in our data or for which the expression was missing for the majority of samples. A more detailed description of the methods can be found in our previous work [32].
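As a rough illustration of how per-feature counts become the expression values analyzed later (log base 2 of RPKM), and of the feature-dropping rule, consider the following Python sketch. The function names are hypothetical, and the thresholds are assumptions: "missing" is represented by None, "majority" is read as more than half of the samples, and "not expressed" is read as a zero count in every observed sample. The pysam-based counting itself is not reproduced here.

```python
import math


def rpkm(count, length_bp, total_mapped_reads):
    """Reads per kilobase of feature per million mapped reads."""
    return count * 1e9 / (length_bp * total_mapped_reads)


def keep_feature(counts, min_fraction_present=0.5):
    """Decide whether a feature is retained (assumed rule, see lead-in).

    Keep the feature only if it is observed in more than half of the samples
    and has a non-zero count in at least one of them. None marks a missing value.
    """
    present = [c for c in counts if c is not None]
    if len(present) <= min_fraction_present * len(counts):
        return False
    return any(c > 0 for c in present)


# Hypothetical feature: 100 reads on a 1 kb exon in a library of 1 million mapped reads.
value = rpkm(100, 1000, 1_000_000)
log2_value = math.log2(value)
```

Zero counts would need separate handling before the log transform; the source does not state how this was done, so the sketch leaves it out.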

Targeted SNPs from GWAS for biogenesis genes SNP identification and bioinformatics analysis
A literature search was conducted to identify genes involved in miRNA biogenesis, as well as SNPs within biogenesis genes that alter cancer risk, by searching PubMed and standard search engines for peer-reviewed articles containing the keywords "miRNA" and "biogenesis", with the subsequent addition of other words that appeared in preliminary searches or that would yield more specific results, such as "degradation", "processing", "polymorphism", "SNP" and "cancer" [10-13, 16, 17, 20, 22, 23, 33, 34]. This included all genes directly associated with miRNA biogenesis (such as DROSHA, DICER, DGCR8, TARBP2, AGO1-4, GEMIN3/4) as well as accessory genes, or genes indirectly associated with biogenesis (such as ADARs, BMPs, SMADs, FMR1, DDXs, DHXs, LIN28, MOV10, ZCCHC11, MAPKs, GSK3β, p38, p54). NCBI SNP (http://www.ncbi.nlm.nih.gov/snp/) [35] was used to identify SNPs within these genes. We filtered results for 'only Human active', 'snp', with a custom minor allele frequency (MAF) range of 0.01-1.0; some gene-SNP associations were identified from the literature and as such were not screened for these criteria. This list represents the majority of SNPs within major human miRNA biogenesis genes. To better evaluate the impact of these SNPs on the expression of the proteins, we utilized Ensembl's Variant Effect Predictor (VEP) tool (http://grch37.ensembl.org/info/docs/tools/vep/index.html) [36]; we used the archived GRCh37 version, which corresponds to our GWAS and mRNA data.

GWAS genotyping
GWAS data were obtained using Illumina HumanHap 550 and 610K arrays as part of the GECCO study and have been described previously [37]. Imputation to HapMap2 Release 24 was performed using MACH, which was imputed to HapMap Release 22 using BEAGLE. All SNP coordinates were defined using hg19/GRCh37. We excluded from consideration SNPs that failed Illumina quality measures or standard quality control procedures [38]. We further excluded SNPs that showed limited variability in our data. In total, we evaluated 219 SNPs with colon cancer risk; those that were significant were then evaluated with all miRNAs expressed in normal colonic mucosa in at least 5 % of the population. The final list of included SNPs can be seen in Table 1.

Statistical analysis
Our sample consisted of 401 cases who had both miRNA and SNP data and 1115 cases and 1173 controls with SNP data for colon cancer risk assessment. We used a logistic regression model adjusted for age, sex and study center to identify SNPs associated with colon cancer risk; we report odds ratios (OR) and 95 % confidence intervals (CI) from those models. We adjusted for multiple comparisons using the step-down Bonferroni correction [39] based upon the effective number of independent SNPs per chromosome as determined using the SNP spectral decomposition method proposed by Nyholt [40] and modified by Li and Ji [41]. SNPs that were significantly associated with colon cancer risk were then assigned either a dominant or recessive model. SNPs were also compared with mRNA RNA-Seq expression data. Out of the 219 candidate SNPs, 215 had sufficient variant allele expression to evaluate with mRNA expression. We required at least one subject with RNA-Seq expression in both tumor and normal tissue and to have the heterozygous genotype to evaluate a dominant model or the homozygous variant genotype to evaluate both the dominant and recessive models. The means of the mRNA expression data for each tissue type as measured by the log base 2 of the RPKM (Reads per Kilobase per Million) were compared in both the dominant and recessive (where possible) genotypes of each SNP. The p-values are based upon 10,000 permutations of the genotype using the coin package in R, and the results were adjusted for multiple comparisons at the chromosomal level using the false discovery rate (FDR) [42] level of 0.05. After adjustment for multiple comparisons, any significant SNPs were then combined with the SNPs found to be significantly associated with colorectal cancer risk and evaluated for associations with miRNA expression levels in normal colonic mucosa as well as with differential miRNA expression between carcinoma tissue and normal colonic mucosa. 
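The step-down Bonferroni (Holm) adjustment mentioned above can be sketched in a few lines of Python. This is an illustration, not the study's code: the function name and example p-values are hypothetical, and the optional `n_tests` argument, which substitutes an effective number of independent tests for the raw count, is an assumption about how the Li and Ji estimate might be plugged in.

```python
def holm_adjust(pvalues, n_tests=None):
    """Return Holm step-down adjusted p-values in the input order.

    n_tests defaults to len(pvalues). Passing a smaller effective number of
    independent tests is an assumption made for this sketch.
    """
    m = len(pvalues) if n_tests is None else n_tests
    # Indices of the p-values from smallest to largest.
    order = sorted(range(len(pvalues)), key=lambda i: pvalues[i])
    adjusted = [0.0] * len(pvalues)
    running_max = 0.0
    for rank, idx in enumerate(order):
        multiplier = max(m - rank, 1)  # never multiply by less than 1
        value = min(1.0, multiplier * pvalues[idx])
        # Enforce monotonicity: an adjusted p-value cannot fall below
        # the adjusted value of any smaller raw p-value.
        running_max = max(running_max, value)
        adjusted[idx] = running_max
    return adjusted


# Hypothetical raw p-values for four SNPs on one chromosome.
raw = [0.01, 0.04, 0.03, 0.005]
adjusted = holm_adjust(raw)
```

The smallest p-value is multiplied by the full number of tests, the next by one fewer, and so on, which makes the procedure less conservative than a single Bonferroni multiplier while still controlling the family-wise error rate.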
We compared log base 2 transformed expression levels across selected genotype models using the significance analysis of microarrays (SAM) technique in the R package siggenes [43]; p-values were based upon 1000 permutations with an FDR level of 0.10. For siggenes, SNPs were evaluated as either dominant or recessive based on previous findings, since a two-level outcome is more interpretable. Bioinformatics analysis utilized UCSC Table Browser [44] to obtain all SNP coordinates; all coordinates are from the GRCh37 assembly.
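The genotype coding and permutation logic described in this section can be illustrated with a small Python sketch. It is a generic Monte Carlo permutation test on a difference in means, assuming genotypes coded as 0, 1 or 2 copies of the variant allele; it does not reproduce the coin or siggenes implementations, and all names and data are hypothetical.

```python
import random
from statistics import mean


def code_genotype(variant_alleles, model):
    """Collapse 0/1/2 variant-allele counts into a two-level indicator."""
    if model == "dominant":
        return 1 if variant_alleles >= 1 else 0
    if model == "recessive":
        return 1 if variant_alleles == 2 else 0
    raise ValueError("model must be 'dominant' or 'recessive'")


def permutation_pvalue(expression, groups, n_perm=10000, seed=0):
    """Two-sided Monte Carlo permutation p-value for a difference in group means."""
    if 0 not in groups or 1 not in groups:
        raise ValueError("both genotype groups must be represented")

    def mean_diff(labels):
        carriers = [x for x, g in zip(expression, labels) if g == 1]
        others = [x for x, g in zip(expression, labels) if g == 0]
        return mean(carriers) - mean(others)

    observed = abs(mean_diff(groups))
    rng = random.Random(seed)
    shuffled = list(groups)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # break any genotype-expression link
        if abs(mean_diff(shuffled)) >= observed - 1e-12:
            hits += 1
    # Add-one estimator keeps the p-value strictly above zero.
    return (hits + 1) / (n_perm + 1)


# Hypothetical data: variant-allele counts and log2 expression for ten subjects.
genotypes = [0, 0, 0, 0, 0, 1, 1, 2, 2, 1]
log2_expr = [1.0, 1.1, 1.2, 1.3, 1.4, 3.0, 3.1, 3.2, 3.3, 3.4]
dominant = [code_genotype(g, "dominant") for g in genotypes]
p_dominant = permutation_pvalue(log2_expr, dominant, n_perm=2000)
```

Collapsing to a two-level indicator is what makes the requirement for at least one heterozygous subject (dominant model) or one homozygous variant subject (recessive model) necessary: without them, one of the two groups would be empty.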

Results
Eleven SNPs were associated (P < 0.05) with colon cancer risk.

[Table 1, final row fragment: rs2274147, rs835036; C; Uridylation of Pre-miRNA/Dicer Inhibition]

Table 1 footnotes:
a Related SNPs are those in high linkage disequilibrium (r² > 0.8). These are: rs7460 (rs11996715); rs197381 (rs197383); rs13175906 (rs13186629); rs2330696 (rs6450839); rs6883386 (rs2161006, rs17404622); rs11587947 (rs11581746); rs12741800 (rs3811463); rs7291691 (rs9623117)
b Some of these genes reflect actual location (SNP is within gene), others reflect literature associations
c AGO genes are also known by EIF2C (i.e. AGO1 is also known by EIF2C1, etc.)
d N: Nucleus; C: Cytoplasm
e dbSNP lists this SNP's position as LOC101927027; it is associated in the literature with PRKRA (falling within an accepted range of PRKRA)
f These SNPs were either not evaluated for association with CRC due to unavailability of GWAS data

Results from VEP showed that five SNPs within four genes had a predicted 'moderate' effect, meaning that the SNP is a "non-disruptive variant that might change protein effectiveness" [36]. These SNPs were: rs235768 (BMP2), rs2740348 and rs2740349 (GEMIN4) (which are in high linkage disequilibrium (LD) but are both included for their separate results in Tables 3 and 4), rs449643 (SKIV2L) and rs1140409 (DDX5). Two SNPs, rs1106841 and rs2257082 (XPO5), had a predicted 'low' effect, meaning that they are assumed "to be mostly harmless or unlikely to change protein behavior" [36]. All other SNPs were categorized as having a 'modifier' impact, meaning that there is little evidence of impact due to the SNPs being in non-coding regions [36].
Twenty-four unique miRNAs were dysregulated in total with SNPs that were either significantly associated with altered mRNA expression or cancer risk. Twenty-three of these were seen in normal colonic mucosa across genotypes; eight of these were associated with rs4548444 (MAPKAPK2) under the recessive genotype, six with rs2740349 (GEMIN4) under the dominant genotype, and nine with rs235768 (BMP2) under the recessive genotype (Table 4). One miRNA, hsa-miR-645, was seen to be significantly associated with rs11089328 (DGCR8) in differential tissue expression under the dominant genotype.

Discussion
As the vast majority of mature miRNAs are generated by a group of genes either directly or indirectly related to miRNA biogenesis, SNPs within these genes could alter miRNA expression and subsequently colon cancer risk. Of the 219 SNPs within the miRNA biogenesis-related genes we analyzed, 11 SNPs were significantly associated with colon cancer. Three of these were significantly associated with colon cancer after correction for multiple comparisons. Of these, only DGCR8 rs11089328 and MAPKAPK2 rs4548444 were associated with altered miRNA expression when an FDR level of 0.1 was applied.
We previously reported, in a study with a larger sample size, that rs3178250 (BMP2) was associated with increased risk (OR TC/CC 1.20 95 % CI 1.05, 1.38) of developing colon cancer, as well as rectal cancer (OR CC 1.63 95 % CI 1.02, 2.60) [45]. The larger sample of approximately 500 cases provided adequate power to detect a significant association for this SNP. BMP2 is a member of the TGF-β superfamily, which has been associated with colorectal cancer [46]. To our knowledge, the literature currently reports no significant associations between the other significant SNPs we identified and colorectal cancer.
Of the 212 SNPs evaluated with mRNA expression, 48 SNPs within 18 genes were associated with altered mRNA expression in either normal colonic mucosa, carcinoma tissue or difference in expression between normal colonic mucosa and carcinoma tissue; three of these remained significant after adjustment for multiple comparisons. One of these three SNPs, rs2740349 (GEMIN4), was associated with altered mRNA expression in carcinoma tissue, and the two other SNPs, rs2059691 (PRKRA) and rs235768 (BMP2), were associated with altered levels of mRNA expression in normal colonic mucosa across genotypes. Both rs2740349 (GEMIN4), under the dominant model, and rs235768 (BMP2), under the recessive model, had less mRNA expression with the variant allele. Rs2059691 (PRKRA) had [...] were both shown to be associated with altered levels of miRNA expression and in both cases all miRNAs were upregulated in the corresponding variant genotypes. Conversely, rs2059691 (PRKRA) was not predicted to have any effect, and we saw increased levels of mRNA expression in the variant genotype, and no subsequent miRNA expression associations. MAPKAPK2 rs4548444 was associated with an increase in mean miRNA expression in normal colonic mucosa tissue; however, this increase was seen only with the homozygous rare genotype. This SNP is associated with increased risk of colon cancer (OR GG 1.61 95 % CI 1.08, 2.40). MAPK/ERK proteins have been shown to phosphorylate TRBP, stabilizing it and potentially increasing Dicer-mediated processing of pre-miRNAs. A SNP in MAPKAPK2 could cause an altered binding affinity of MAPKAPK2 to TRBP, thereby impacting mature miRNA production. As we did not see any association between rs4548444 and MAPKAPK2 mRNA expression, and we saw an increase in miRNA expression in the GG (homozygous rare) genotype, this suggests that while the SNP is associated with colon cancer risk and miRNA expression, it is not doing so through altered miRNA biogenesis.
MAPKAPK2 is involved in many cellular processes, including stress and inflammatory response, gene regulation and cell proliferation, and nuclear export [47]; as such, it is possible that the increased risk of colon cancer is associated with another biological process that MAPKAPK2 regulates.
Using Ensembl's VEP to predict the effect of each SNP on its corresponding gene showed that very few SNPs were predicted to have any effect. There were five SNPs within four genes that had a 'moderate' effect. Of these, only one was seen to be associated with altered colon cancer risk, rs2740348 (GEMIN4), which is in LD with rs2740349 (GEMIN4). While no significant associations were seen between rs2740348 and miRNA expression or mRNA expression, we did see significant findings after correction for multiple comparisons for mRNA and for miRNA expression with rs2740349 (GEMIN4). However, miRNA expression was upregulated in the same genotype with which mRNA expression was downregulated; this suggests that this SNP increases miRNA expression. Because the helicase GEMIN4 is thought to aid in miRNA duplex unwinding [12], a reduction in GEMIN4 transcription should theoretically result in reduced miRNA biogenesis. Similarly, BMPs have been thought to enhance Drosha cleavage [12], thereby increasing mature miRNA levels. As we see that the opposite is the case, and that miRNA levels increase with the reduction of BMP2 and GEMIN4 transcription, it is possible that other mechanisms influence miRNA biogenesis, or that these genes have additional roles in biogenesis.
Interestingly, all of the miRNAs that were dysregulated were unique, in that no one miRNA was associated with more than one SNP. None of these miRNAs is listed as a known target in miRTarBase, a repository of validated miRNA target associations (http://mirtarbase.mbc.nctu.edu.tw/) [48] at this time, and there are no commonalities of chromosomal location between the SNPs and their respectively associated miRNAs. This suggests that miRNA biogenesis may be specific to the level of the individual miRNA, and that these specific groups of miRNAs are dysregulated across genotypes because these genes affect these specific miRNAs' production. Winter et al. describe findings to support that "specific helicases may regulate miRNAs differentially" [12]; this would support subsets of miRNAs associated with different SNPs. Perhaps other proteins have a similar influence on specific miRNA biogenesis. Biogenesis genes contribute to the processing of virtually every mature miRNA; however, miRNA expression is thought to be tissue specific. One limitation of our study is that our findings could be influenced by our use of tissue from colon cancer patients, evaluating both normal colonic mucosa as well as differentially expressed miRNAs between carcinoma and normal colonic mucosa. Since we only evaluated associations with miRNA expression levels for SNPs associated with colon cancer risk or mRNA expression, other SNPs in biogenesis genes could alter miRNAs in other situations. It should also be kept in mind that many of the accessory genes to miRNA biogenesis are involved in multiple biological pathways. Thus, it is possible that SNPs could alter colon cancer risk through multiple mechanisms.
Additionally, because we only looked at more commonly occurring SNPs (MAF ≥0.01) and we required there to be at least one subject in our dataset with the homozygous variant genotype in order to analyze the dominant and recessive genotypes and at least one subject with the heterozygous genotype to evaluate the dominant model and overall CRC risk, another possible limitation is that rarer, and possibly more deleterious, SNP associations were not evaluated in this study. We hypothesized that SNPs within biogenesis genes could impact the transcription of these mRNAs, and therefore impact miRNA expression. Out of the 212 SNPs evaluated with mRNA expression, only three were significantly associated with altered expression after correction for multiple comparisons. This suggests that, in colon cancer, the majority of these SNPs do not negatively impact biogenesis gene transcription. While other mechanisms could still alter cellular protein levels, and thus possibly miRNA biogenesis, within the cell, we are not able to measure this. We also hypothesized that SNPs in biogenesis genes would be associated with colon cancer risk and that miRNA expression levels would be influenced by the genotype of these SNPs. However, out of 219 SNPs within 48 genes, only one SNP was associated with both altered colon cancer risk and altered miRNA expression levels. Because the SNP that was associated with both colon cancer risk and miRNA expression belongs to the MAPK family, and as such is involved in many processes other than miRNA biogenesis, the impact on cancer and even on miRNA expression may be due to other mechanisms unrelated to miRNA biogenesis. This finding suggests that SNPs in miRNA biogenesis genes have minimal impact on colon cancer risk and those that were associated have minimal associations with miRNAs.

Conclusion
Our data suggest that few of the SNPs in biogenesis genes we evaluated alter levels of mRNA transcription or colon cancer risk. As only one SNP both alters colon cancer risk and miRNA expression it is likely that SNPs influencing cancer do not do so through miRNAs. Because the significant SNPs were associated with downregulated mRNAs and upregulated miRNAs, and because each SNP was associated with unique miRNAs, it is possible that other mechanisms influence mature miRNA levels.

Ethics and consent to participate
All participants signed an informed consent and this study was approved by the Institutional Review Board at the University of Utah; the committee numbers for this paper are IRB_00055877 and IRB_00002335.

Consent to publish
Not applicable.

Availability of data and materials
Utah SNP data are available in NCBI's dbGaP repository (http://www.ncbi.nlm.nih.gov/gap) under the accession number phs000410.v1.p1. Due to restrictions in the signed consent forms, microarray data cannot be released at this time.