Susceptibility loci for pancreatic cancer in the Brazilian population

Background Pancreatic adenocarcinoma (PA) is a very aggressive cancer with one of the poorest prognoses. It is usually diagnosed late and is resistant to conventional treatment. Environmental and genetic factors contribute to its etiology, including tobacco and alcohol consumption, chronic pancreatitis, diabetes and obesity. Somatic mutations in pancreatic cancer cells are known, and SNP profiling based on GWAS findings could reveal novel genetic risk factors for this disease in different population contexts. Here we describe a SNP panel for pancreatic cancer in Brazil, together with clinical and epidemiological data. Methods A total of 78 pancreatic adenocarcinoma patients and 256 subjects without pancreatic cancer had 25 SNPs genotyped by real-time PCR. Unconditional logistic regression was used to assess the main effects on PA risk under allelic, co-dominant and dominant inheritance models. Results Nine SNPs were nominally associated with pancreatic adenocarcinoma risk; the minor allele conferred a protective effect for 5 of them and an increased risk for 4. Among the epidemiological and clinical data, tobacco smoking, diabetes and a history of pancreatitis were significantly related to pancreatic adenocarcinoma risk. Polygenic risk scores computed from the SNPs in the study showed strong associations with PA risk. Conclusion We assessed for the first time SNPs related to PA in the Brazilian population, a result that could be used for genetic screening in at-risk groups such as individuals with familial pancreatic cancer, smokers, alcohol users and diabetes patients.

family history of pancreatic cancer have a higher risk of developing the disease, and genetic susceptibility may be related to germline mutations in known hereditary cancer genes including CDKN2A, BRCA2, PALB2, STK11 and PRSS1 [6]. CDKN2A mutations are important in both sporadic and familial cases: the gene is estimated to be altered in more than 90% of PA, and 0.6-3.3% of cases are reported to carry deleterious germline mutations in it [12,13]. BRCA2 is also a hot spot for rare variants/mutations that act as risk factors for PA [14,15].
Single nucleotide polymorphisms (SNPs) have also been extensively studied for a possible association with the risk of PA. For example, polymorphisms in the cytochrome P450 enzyme (CYP2A6) have been linked to an increased risk of sporadic PA, independent of smoking [16].
More recently, genome-wide association studies (GWAS) have identified common variants associated with the risk of PA, mainly in North American, European and Asian populations [17][18][19][20]. These studies highlight different loci, but their frequencies and their association with PA risk in the Brazilian population are unknown.
Based on this, the present study evaluated 25 SNPs previously associated with PA risk in GWAS to investigate the influence of these loci in the Brazilian population, including 78 patients with pancreatic adenocarcinoma and 256 controls without cancer history. Of the analyzed loci, 9 variants were associated with PA risk in at least one of the models analyzed, highlighting the importance of these regions.

Study population
In this prospective and consecutive study, we included 78 PA patients recruited from 2018 to 2019, with diagnosis confirmed by histopathology and/or surgery, whose samples were provided by the Academic Biobank of Research on Cancer of the University of São Paulo, located in the Centro de Investigação Translacional em Oncologia, Instituto do Câncer do Estado de São Paulo (ICESP), São Paulo, Brazil. The Biobank protocol was approved by the Local Ethics Committee (CEP no. 031/12) and the National Ethics Committee (CONEP no. 023/2014). As controls we used 256 subjects without pancreatic cancer, either healthy blood donors or orthopedic patients, from the Hospital do Trabalhador, Curitiba PR, Brazil, with Local Ethics Committee (CEP CAAE no. 77979417.8.0000.5248 and 77979417.8.3001.5225) and National Ethics Committee (CONEP 77979417.8.0000.5248) approval. All approvals covered the collection of demographic and epidemiological data for both groups, while clinical data were also collected for PA cases. The project was described to all participants and written informed consent was obtained. All participants had 4 mL of peripheral blood collected, and buffy-coat DNA was extracted with the QIAamp DNA Blood Mini Kit (QIAGEN) as indicated by the manufacturer. DNA quantity and purity were measured using a NanoDrop One/OneC Microvolume UV Spectrophotometer ® (Thermo Scientific).

Genotyping
SNP genotyping was conducted in the Genomic Epidemiology laboratory at the German Cancer Research Center (DKFZ), Heidelberg, using TaqMan (ABI, Applied Biosystems, Foster City, CA) and KASP (KBioscience, Hoddesdon, UK) technologies and TaqMan Genotyping Master Mix (Applied Biosystems), according to the manufacturers' instructions. All samples were included in a 384-well plate. For quality control, duplicates of 5% of the samples were included. Polymerase chain reaction plates were read on a ViiA7 real-time instrument (Applied Biosystems). The ViiA7 RUO Software, version 1.2.2 (Applied Biosystems), was used to determine genotypes. The genotyping concordance between duplicate samples exceeded 99%, and samples with a call rate lower than 75% were excluded from the statistical analysis. rs35226131 was monomorphic in our population and was therefore not included in further analyses.
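The two quality-control rules described above (a minimum per-sample call rate of 75% and concordance between duplicate samples) can be illustrated with a minimal sketch. The function names, the data layout (one list of genotype calls per sample, with `None` for a failed call) and the sample identifiers are assumptions made for illustration; they are not taken from the genotyping software used in the study.

```python
from typing import Dict, List, Optional

# A genotype call such as "AG"; None marks a failed call (hypothetical layout).
Genotype = Optional[str]


def call_rate(genotypes: List[Genotype]) -> float:
    """Fraction of SNPs with a successful genotype call for one sample."""
    if not genotypes:
        return 0.0
    called = sum(1 for g in genotypes if g is not None)
    return called / len(genotypes)


def filter_samples(
    samples: Dict[str, List[Genotype]], min_call_rate: float = 0.75
) -> Dict[str, List[Genotype]]:
    """Keep samples whose call rate is at least min_call_rate."""
    return {sid: g for sid, g in samples.items() if call_rate(g) >= min_call_rate}


def concordance(first: List[Genotype], second: List[Genotype]) -> float:
    """Share of SNPs called in both duplicate runs that give the same genotype.

    Assumes genotypes are written in a normalized allele order ("AG", not "GA").
    """
    pairs = [(a, b) for a, b in zip(first, second) if a is not None and b is not None]
    if not pairs:
        return 0.0
    return sum(1 for a, b in pairs if a == b) / len(pairs)


# Hypothetical example: S1 has 3 of 4 SNPs called (kept), S2 has 1 of 4 (dropped).
example = {"S1": ["AA", "AG", None, "GG"], "S2": ["AA", None, None, None]}
kept = filter_samples(example)
```

Samples exactly at the 75% threshold are kept here, since the text excludes only samples with a call rate lower than 75%.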

Statistical analysis
Chi-square tests were used to compare sex, ethnicity, smoking, alcohol use, diabetes and pancreatitis between cases and controls, while Student's t-test was used for age; all were conducted with GraphPad Prism. Hardy-Weinberg equilibrium was assessed in control subjects for each polymorphism. For each SNP, the more common allele in controls was assigned as the reference category. All analyses were adjusted for age and sex. Unconditional logistic regression was used to assess the main effects of the 25 selected genetic polymorphisms on PA risk, using allelic, co-dominant and dominant inheritance models. We used a threshold of p < 0.05 to define statistically significant associations between SNPs and PA risk. Chi-square and Fisher's exact tests were used to compare allele frequencies by ethnic ancestry among PA patients, controls and database-reported values, with statistical significance set at p < 0.05.
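As an illustration of two steps in this section, the sketch below tests Hardy-Weinberg equilibrium from genotype counts and encodes a genotype as regression covariates under the three inheritance models. The text does not state which HWE test was used, so a Pearson chi-square test with 1 degree of freedom is assumed. Treating the "allelic" model as a per-allele (0/1/2) count is also an assumption, and all function names are hypothetical.

```python
import math
from typing import Tuple


def hwe_chi_square(n_aa: int, n_ab: int, n_bb: int) -> Tuple[float, float]:
    """Pearson chi-square test (1 df) for Hardy-Weinberg equilibrium.

    Takes counts of the two homozygotes and the heterozygote;
    returns (chi-square statistic, p-value).
    """
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("at least one genotype is required")
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


def encode_genotype(minor_alleles: int, model: str) -> Tuple[int, ...]:
    """Covariate coding for a genotype carrying 0, 1 or 2 minor alleles."""
    if minor_alleles not in (0, 1, 2):
        raise ValueError("minor_alleles must be 0, 1 or 2")
    if model == "allelic":  # per-allele count (assumed coding)
        return (minor_alleles,)
    if model == "dominant":  # carriers vs. non-carriers
        return (1 if minor_alleles >= 1 else 0,)
    if model == "codominant":  # separate heterozygote and homozygote indicators
        return (1 if minor_alleles == 1 else 0, 1 if minor_alleles == 2 else 0)
    raise ValueError(f"unknown model: {model}")
```

In an analysis like the one described, the encoded values would enter a logistic regression together with age and sex; that fitting step is omitted here.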

Polygenic risk score
We used the SNPs investigated in this study to assemble a polygenic risk score (PRS). We included all SNPs except rs684559 and rs353630, which were originally reported to be associated not with PA risk but with survival. For each SNP, the number of alleles associated with higher PA risk was counted, and the counts were added up for each study subject, resulting in an unweighted PRS. Additionally, we built a weighted PRS using the ORs of the original GWASs. For each SNP in the weighted PRS, a value of 0 was assigned if no risk alleles were present, ln(OR) if 1 risk allele was present, and 2*ln(OR) if 2 risk alleles were present. These values were then summed for each subject. Only a subset of the study subjects (67 cases and 228 controls) had a 100% SNP call rate. Therefore, in order to compute comparable score values for all study subjects, we also considered "scaled" scores, in which the PRS value for each subject was multiplied by the ratio between the total number of SNPs and the number of SNPs effectively genotyped for that subject. For both PRSs (weighted and unweighted) we calculated quintiles based on the distribution of values in the controls.
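The score construction described above can be sketched in a few lines. This is a minimal illustration under stated assumptions: the function name, the dictionary layout and the SNP labels are hypothetical, and the odds ratios are assumed to be expressed for the risk-increasing allele (OR > 1), so that a GWAS OR reported for a protective allele would first have to be inverted.

```python
import math
from typing import Dict, Optional


def polygenic_risk_scores(
    risk_allele_counts: Dict[str, Optional[int]],
    odds_ratios: Dict[str, float],
) -> Dict[str, float]:
    """Unweighted, weighted and scaled PRS for one subject.

    risk_allele_counts maps SNP id to 0, 1 or 2 risk alleles (None if not genotyped).
    odds_ratios maps SNP id to the GWAS OR for the risk allele (assumed > 1).
    """
    total_snps = len(odds_ratios)
    typed = {
        snp: count
        for snp, count in risk_allele_counts.items()
        if count is not None and snp in odds_ratios
    }
    if not typed:
        raise ValueError("subject has no genotyped SNPs")

    unweighted = float(sum(typed.values()))
    # 0, ln(OR) or 2*ln(OR) per SNP, summed over the genotyped SNPs.
    weighted = sum(count * math.log(odds_ratios[snp]) for snp, count in typed.items())

    # Scaling factor: total number of SNPs / number of SNPs genotyped in this subject.
    scale = total_snps / len(typed)
    return {
        "unweighted": unweighted,
        "weighted": weighted,
        "unweighted_scaled": unweighted * scale,
        "weighted_scaled": weighted * scale,
    }


# Hypothetical three-SNP panel; snpC failed genotyping for this subject.
ors = {"snpA": 2.0, "snpB": 1.5, "snpC": 1.2}
scores = polygenic_risk_scores({"snpA": 1, "snpB": 2, "snpC": None}, ors)
```

For a subject with a 100% call rate the scaling factor is 1, so the scaled and unscaled scores coincide.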
The formulas for the unweighted and weighted scores are, respectively, $PRS_{unweighted} = \sum_{i=1}^{n} x_i$ and $PRS_{weighted} = \sum_{i=1}^{n} x_i \ln(OR_i)$, where $n$ is the number of SNPs in the score, $x_i \in \{0, 1, 2\}$ is the number of risk alleles carried at SNP $i$, and $OR_i$ is the odds ratio reported for SNP $i$ in the original GWAS. Additionally, we created PRSs using only the 9 SNPs that showed association with PA in this population. We analyzed the association between the quintiles of the PRSs and PA risk by logistic regression, adjusting for age and sex.

Study population data
Table 1 summarizes the epidemiological data for both groups. PA patients had a mean age of 62.46 years and a median age of 62, while for the control group the values were 56.62 and 57, respectively. Age was not statistically different between cases and controls. The gender distribution was very similar, with slightly more females among both cases and controls, and was also not statistically different. Regarding ethnicity, collected as a self-reported variable, people of European ancestry were more frequent in both groups (66.7% in PA and 81.3% in controls), while people of African ancestry were more frequent among PA patients than among controls (32% and 17.5%, respectively), again not statistically different.
Other epidemiological data, such as tobacco and alcohol use, diabetes and personal history of pancreatitis, are also shown in Table 1. Statistical analysis showed a significant association of tobacco use (p = 0.002), diabetes (p < 0.0001) and pancreatitis history (p < 0.0001) with PA risk, while alcohol use and familial PA history were not significantly associated.
Regarding the clinical data of the pancreatic cancer patients, almost 75% of them had the tumor located in the pancreatic head, while 8% had it in the pancreatic body and 7% in the tail or tail/body. Of the 78 pancreatic cancer patients, 38 (49%) underwent lymph node dissection, and 24 of them (64%) had positive nodes, with different ratios (Fig. 1). Regarding treatment, 33% of the patients were treated with FOLFIRINOX, 12% underwent surgery and 9% were treated with gemcitabine.
In the genotyping analysis, all SNPs were in HWE in controls (p > 0.05). Results of the association analysis between SNPs and PA risk are shown in Table 2. We found 8 SNPs that were nominally associated with pancreatic cancer risk (p < 0.05) in the allelic model and one more SNP in both the co-dominant and dominant models. Of these, the minor allele of 5 SNPs showed a protective effect for PA (OR < 1), while for 4 SNPs the minor allele was associated with an increase in risk (OR > 1). The most significant findings were for SNPs rs3790844, rs9854771, rs2941471, rs401681, rs13303010 and rs9543325. For rs3790844, rs9854771 and rs2941471 the minor allele had a protective effect, whereas for rs401681, rs13303010 and rs9543325 the minor allele was a risk factor for pancreatic cancer.
In Table 3 we report the MAF of the nine significantly associated SNPs as obtained from the dbSNP database (https://www.ncbi.nlm.nih.gov/SNP/) for African, European and Asian populations. The same table reports the MAF for both PA patients and controls in the Brazilian population we studied, divided by ethnic ancestry.
All PRSs were associated with an increase in risk of PA, as expected. When we computed the association between the PRSs and PA risk considering only the 67 cases and 228 controls with a call rate of 100%, we observed an OR = 6.83, 95% CI 2.76-16.89, p = 3.26 × 10⁻⁵ for the highest vs. lowest quintile of the unweighted score and an OR = 16.77, 95% CI 3.80-74.07, p = 1.99 × 10⁻⁴ for the highest vs. lowest quintile of the weighted score. Results were similar when we considered the whole dataset including 78 cases and 256 controls and "scaled" PRSs (Table 4), as well as when we assembled PRSs with only the 9 SNPs showing association with PA risk in this population (data not shown).
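To make the quintile comparison concrete, the following sketch derives quintile cut points from the control scores, assigns a subject to a quintile, and computes a crude odds ratio with a Woolf (log-based) 95% confidence interval for the highest versus the lowest quintile. It is a simplified stand-in: the study used logistic regression adjusted for age and sex, whereas this crude OR is unadjusted. The function names and the counts in the example are hypothetical and are not study data.

```python
import bisect
import math
import statistics
from typing import List, Tuple


def quintile_cutpoints(control_scores: List[float]) -> List[float]:
    """Four cut points splitting the control score distribution into fifths."""
    return statistics.quantiles(control_scores, n=5)


def assign_quintile(score: float, cutpoints: List[float]) -> int:
    """Return 1 (lowest) to 5 (highest); a score equal to a cut point goes up."""
    return bisect.bisect_right(cutpoints, score) + 1


def crude_odds_ratio(
    cases_high: int,
    controls_high: int,
    cases_low: int,
    controls_low: int,
    z: float = 1.96,
) -> Tuple[float, float, float]:
    """Unadjusted OR (highest vs. lowest quintile) with a Woolf 95% CI."""
    cells = (cases_high, controls_high, cases_low, controls_low)
    if min(cells) <= 0:
        raise ValueError("all four cells must be positive")
    odds_ratio = (cases_high * controls_low) / (controls_high * cases_low)
    se_log_or = math.sqrt(sum(1 / c for c in cells))
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical control scores and counts, for illustration only.
cuts = quintile_cutpoints([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
or_value, ci_low, ci_high = crude_odds_ratio(20, 10, 10, 20)
```

Because the cut points come from the controls only, each quintile holds roughly 20% of the controls, while cases may be distributed unevenly across quintiles, which is what the odds ratio measures.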

Discussion
Genetic risk factors for PA at the SNP level have not been studied so far in the Brazilian population. Here we observed 9 SNPs associated with PA risk (p < 0.05), the most significantly associated being rs3790844, rs9854771, rs2941471, rs401681, rs13303010 and rs9543325. An important aspect of our results is that, for all of these SNPs, the OR is in the same direction as in the original GWAS.
The first SNP, rs3790844, is located in the first intron of NR5A2 and has a MAF of 25% in the global population. We observed a similar value of 27% in our control group, while in PA patients it was 18%, yielding an OR that indicates a protective effect of this allele for PA. In a meta-analysis by Chen et al., this SNP had a protective effect in Caucasians, although not in Asian populations [27]. However, another study with 360 pancreatic cancer patients and 400 controls suggested that this SNP is related to pancreatic cancer risk in Japanese subjects [27]. A large study using 3851 pancreatic cancer cases and 3934 controls from the previously conducted GWAS in the Pancreatic Cancer Cohort Consortium and the Pancreatic Cancer Case Control Consortium (PanC4) [17,28] showed this SNP to be the most significant risk locus for pancreatic cancer, with an OR of 0.77, again representing a protective effect of the minor allele [29].
The SNP rs9854771 has a MAF of 37% in the global population. In our control group we observed a similar MAF of 39%, while in PA cases it was 25%. This SNP is located near the TP63 gene, which is a p53 homologue implicated in tumorigenesis and metastasis [30], and previous GWAS have demonstrated significant evidence of association for SNPs in TP63 in lung cancer and bladder cancer [31][32][33][34][35]. Its role in pancreatic cancer was first described by Childs [25], with an OR of 0.89, and a subsequent study [36] returned a similar result, with an OR of 0.76.
A third SNP whose minor allele is associated with a reduction in PA risk is rs2941471, with a MAF of 41% in the global population. Here, the control group showed a MAF of 44%, while in the PA cases it was 35%. This SNP is located in an intronic region of the HNF4G gene, at chromosome 8q21.11, which encodes hepatocyte nuclear factor 4 gamma, a transcription factor of the nuclear receptor superfamily whose expression level was increased in five of six clinical human hepatocellular carcinoma samples [37]. With regard to the pancreas, mice lacking HNF4G have higher numbers of pancreatic β-cells, increased glucose-induced insulin secretion and improved glucose tolerance [38]. A study of GWAS pathways associated with pancreatic cancer susceptibility proposed a link between inherited variation in HNF4G and pancreatic development [29]. A large study with 2737 pancreatic cancer patients and 4752 controls also identified this SNP as a genome-wide significant locus (OR = 0.87) [21].
For SNP rs9543325, the global MAF is 38%, similar to the frequency of 44% found in our Brazilian control population. In PA patients this value increased to 56%, and the SNP was associated with increased risk of pancreatic cancer in all models analyzed. This association was previously shown in Europeans [28,39], including Jewish and non-Jewish subjects [40], and in the Taiwanese population [41]. This intergenic SNP maps to the 13q22.1 locus and has been shown to be strongly associated with pancreatic cancer [1,3,36,[40][41][42]. Other SNPs at the 13q22.1 locus have previously been associated with PA, mainly in European and Chinese populations. Some studies suggest a potential long-range enhancer activity, but the mechanisms are still unknown [43].
The SNP rs13303010 has a global population MAF of 12%. In our control group the MAF was higher, at 22%, and among PA patients it reached 37% and was associated with increased cancer risk. The minor allele was also associated with increased PA risk in European [21] and Japanese populations [44]. In European populations, it was highlighted in PA susceptibility only in the largest pancreatic cancer GWAS to date, including 11,537 patients and 17,107 controls from the Pancreatic Cancer Cohort Consortium (PanScan I + II, III), the Pancreatic Cancer Case-Control Consortium (PanC4) and the PANcreatic Disease ReseArch (PANDoRA) consortium [21]. In the Japanese population, 664 pancreatic cancer cases and 664 controls were analyzed and this SNP was highlighted as a PA risk factor [44]. This SNP maps to 1p36.33, in the first intron of the NOC2L gene, and probably influences the expression of its host gene. The presence of the risk-increasing allele was associated with higher NOC2L expression [21]. This gene encodes the NOC2 like nucleolar associated transcriptional repressor, a protein that represents a novel histone deacetylase-independent inhibitor of histone acetyltransferase [45]. The NOC2-like protein has also been associated with the inhibition of the p53 and p63 tumor suppressors [46,47], which are notably associated with cancer.
The SNP rs401681 is located in an intron of CLPTM1L, 27 kb from the TERT gene, and has been associated with many tumor types [48,49]. The global population MAF is 43% and, in the Brazilian population, we found a similar frequency of 45% in the control group. In the present study, the presence of the minor allele represents a risk factor for pancreatic cancer. This increased risk for PA was also shown in European [17,39,50] and Asian populations [51,52]. It has been suggested that rs401681 confers cancer susceptibility by regulating CLPTM1L and TERT expression [53], both genes being implicated in carcinogenesis. The CLPTM1L gene may be involved in apoptosis and is highly expressed in cisplatin-resistant cell lines [54], while the TERT gene encodes the catalytic subunit of telomerase, which is associated with telomere maintenance and is usually active in cancer cells [55]. An interesting aspect of rs401681 is that the minor allele is usually associated with increased risk of pancreatic cancer and melanoma [56], whereas the C allele was associated with increased risk of other tumor types, such as lung, prostate and bladder [48].
Some other SNPs showed statistically significant associations with PA risk in this work. The minor alleles of rs17688601, in the SUGCT gene, and rs4795218, in the HNF1B gene, were associated with reduced risk in the European population [21,25]. In the Brazilian population we also found them associated with protection, but only in the allelic model. On the other hand, another SNP previously associated with PA in Europeans, rs7310409 in HNF1A, was associated with risk in the dominant and co-dominant models, but not in the allelic analysis (p = 0.065). The other SNPs analyzed were not associated with PA in the Brazilian population in the present study.
Ethnic differences in pancreatic cancer incidence have been reported, especially a higher incidence in individuals of African ancestry than in those of European ancestry [57][58][59]. Some studies, which did not include genetic investigation, suggested that this higher incidence in African ancestry may be partially explained by the greater prevalence of smoking, diabetes and obesity in this group [58,60]. A recent report demonstrated that family history of pancreatic cancer, diabetes, body mass index ≥ 30 kg/m², current smoking and red meat intake were associated with pancreatic cancer. Moreover, after adjustment for these risk factors, Native Hawaiians, Japanese Americans and African Americans, but not Latino Americans, had a higher risk of pancreatic cancer than European Americans, showing the genetic influence on pancreatic cancer incidence [61]. Regarding ethnic ancestry in the present report, Brazil is a heterogeneous country with European, African and Asian descendants. Interestingly, a statistically significant difference in MAF between PA patients and controls in the Brazilian population was observed for only 3 SNPs in African ancestry and 2 in European ancestry. In this context, SNPs rs9854771 and rs9543325 showed a higher MAF in Brazilian controls of African ancestry than in Brazilian PA patients of African ancestry. On the other hand, SNP rs2941471 showed a lower MAF in Brazilian controls than in PA patients for both African and European ancestry, the same trend observed for SNP rs13303010 in European ancestry. However, when the MAF was compared between Brazilian controls and dbSNP data, six SNPs were statistically significantly different, all of them in African ancestry. These data demonstrate that Brazilian controls with self-reported African ancestry present a different SNP profile from that of the African population, probably due to ethnic admixture.
PRSs computed with the SNPs we included in the study show a strong association with PA risk when comparing the 20% of the population with the highest PRS values against the 20% with the lowest. The small sample size results in very wide confidence intervals for our risk estimates, but the results are in line with those of a recent study in a much larger population of European origin [62]. It is expected that smaller groups at the extremes of the PRS distribution (e.g. the 5% or 1% with the highest/lowest PRS values) will show even more marked differences in risk.

Conclusion
The main limitation of this study is its small sample size. However, as our target SNPs were previously reported as susceptibility loci for PA in large GWAS, mainly conducted in European populations, even this small sample could establish for the first time SNPs as genetic risk factors for PA in Brazil. Despite a considerable percentage of Amerindian, African and Asian descent in the Brazilian population, the largest ethnic component is European ancestry, so that genetic risk factors identified in Europeans are at least partially reflected in the Brazilian population. This was partially demonstrated by the comparison of MAFs between Brazilian controls of European ancestry and the dbSNP database, where no difference was observed. In contrast, Brazilian controls of African ancestry showed a statistically significantly different MAF for six out of nine SNPs. Associations of several SNPs reported to affect PA risk in populations of European descent were successfully replicated in our study. Given the limited sample size, it is not possible to assess whether the SNPs that did not replicate in this work are relevant in the Brazilian population. However, it is worth noting that even for the SNPs that did not reach p < 0.05, the direction of the association (i.e. whether the minor allele is associated with an increase or a decrease in risk) was consistent with the GWAS data. Our group is recruiting more PA patients, and with these data we will have more power in future analyses.
These data can be used for stratification of PA risk, especially in groups already known to be at increased risk, such as people with a positive family history of pancreatic cancer and subjects with high tobacco and alcohol use. PRSs can be particularly useful in this context, as shown by our results. More importantly, this is the first genetic susceptibility study for pancreatic adenocarcinoma in the Brazilian population.

Acknowledgements
We would like to thank Diogo Araújo and Maria José Ferreira Alves for processing the PA samples.

Authors' contributions
MNA and FC conceived and designed the study. MNA and AS performed the experiments and data analysis. RC and MU organized and conducted the PA sample collection and participated in data interpretation and analysis. MNA and JCO conducted the control sample collection, together with data interpretation and analysis. FBAM and AMM made manuscript corrections and substantively revised it. MNA and FC drafted the paper. All authors made substantive intellectual contributions, read and approved the final manuscript, and agreed to be personally accountable for their own contributions.

Funding
This work was supported by Fiocruz and the Carlos Chagas Institute, especially the grant "Geração de conhecimento - Novos Talentos", and by intramural funding of DKFZ, which provided financial support for sample collection and processing as well as data generation and analysis. Open access was enabled through a BMC Medical Genomics waiver.