Germline variants in DNA repair genes associated with hereditary breast and ovarian cancer syndrome: analysis of a 21 gene panel in the Brazilian population

Background The Hereditary Breast and Ovarian Cancer Syndrome (HBOC) occurs in families with a history of breast/ovarian cancer and presents an autosomal dominant inheritance pattern. BRCA1 and BRCA2 are high penetrance genes associated with an increased risk of up to 20-fold for breast and ovarian cancer. However, only 20–30% of HBOC cases present pathogenic variants in those genes, and other DNA repair genes have emerged as contributing to HBOC risk. In Brazil, variants in ATM, ATR, CHEK2, MLH1, MSH2, MSH6, POLQ, PTEN, and TP53 genes have been reported in up to 7.35% of the studied cases. Here we screened and characterized variants in 21 DNA repair genes in HBOC patients.
Methods We systematically analyzed 708 amplicons encompassing the coding and flanking regions of 21 genes related to DNA repair pathways (ABRAXAS1, ATM, ATR, BARD1, BRCA1, BRCA2, BRIP1, CDH1, CHEK2, MLH1, MRE11, MSH2, MSH6, NBN, PALB2, PMS2, PTEN, RAD50, RAD51, TP53 and UIMC1). A total of 95 individuals from Southeast Brazil with clinical suspicion of HBOC syndrome were sequenced, and 25 samples were evaluated for insertions/deletions in BRCA1/BRCA2 genes. Identified variants were assessed in terms of population allele frequency, and their functional effects were predicted through in silico algorithms.
Results We identified 80 variants in 19 genes. About 23.4% of the patients presented pathogenic variants in BRCA1, BRCA2 and TP53, a frequency higher than that identified in previous studies in Brazil. We identified a novel variant in ATR, which was predicted as pathogenic by in silico tools. The association analysis revealed 13 missense variants in ABRAXAS1, BARD1, BRCA2, CHEK2, CDH1, MLH1, PALB2, and PMS2 genes as significantly associated with an increased risk of HBOC, and the patients carrying those variants did not present large insertions or deletions in BRCA1/BRCA2 genes.
Conclusions This study represents the third report of a multi-gene analysis in the Brazilian population, and provides the first report of many germline variants associated with HBOC in Brazil. Although further functional analyses are necessary to better characterize the contribution of those variants to the phenotype, these findings may improve the risk estimation and clinical follow-up of patients with clinical suspicion of HBOC.


Background
Hereditary Breast and Ovarian Cancer (HBOC) Syndrome occurs in families with a history of certain cancers, particularly breast and ovarian cancers, and shows an autosomal dominant inheritance pattern. It encompasses about 5-10% of all breast cancer (BC) cases and up to 80% of all ovarian cancers (OC) [1,2], and the affected families present a 50-80% increase in lifetime risk of BC and 30-50% of OC [3]. The National Comprehensive Cancer Network (NCCN) [4] is an alliance that issues the guidelines used for detection and prevention, as well as for the adoption of risk-reduction strategies for HBOC-affected families. According to NCCN, the main criteria for further genetic risk evaluation in HBOC patients are: BC diagnosed before 45 years of age or invasive OC at any age, personal or familial recurrence of BC or OC, bilateral BC, and the presence of male BC. Furthermore, patients at risk of HBOC may also present pancreatic and prostate cancers [4]. Accordingly, to help demystify the association of HBOC with BC and OC risk in women [5], it has recently been proposed to rename HBOC as King Syndrome, in honor of Mary-Claire King, who first described the locus associated with hereditary breast and ovarian cancer risk [6].
During the 1990s, germline variants in the breast cancer susceptibility genes BRCA1 and BRCA2 were first described as conferring an increased risk of HBOC [7,8]. Variants in BRCA1 are associated with earlier-onset BC (30-50 years), whereas BRCA2 variants increase the BC risk mainly for individuals aged 40-60 years [9]. The BC and OC risk rates also vary between BRCA1 and BRCA2, with BRCA1 carriers presenting a risk of up to 57% for BC and 40% for OC, while for BRCA2 carriers the risk is slightly lower, 49% and 18% for BC and OC, respectively [10].
Molecular diagnosis is a very important step in the clinical management of HBOC patients, since it allows for family risk assessment, mortality reduction, and the adoption of prophylactic measures, such as preventive mastectomy and/or oophorectomy, which reduce the cancer risk by up to 95% in BRCA1/BRCA2 carriers [11][12][13]. However, despite the high penetrance and the high frequency of variants found in BRCA1/BRCA2 genes, only about 20% of hereditary BC and OC have been attributed to pathogenic variants in those genes. Moreover, about 5-10% have been associated with other susceptibility genes, such as TP53, STK11, PTEN, ATM, and CHEK2 [14]. Studies have demonstrated molecular diagnosis rates of about 4.6-54% when only BRCA1/BRCA2 are screened, which points to the association of other, less penetrant genes with HBOC pathogenesis [15][16][17][18]. Even though the protocols for clinical management are well established for BRCA1/BRCA2 carriers, patients who test negative for pathogenic BRCA1/BRCA2 variants lack proper clinical follow-up and genetic counselling, even when they present similar clinical characteristics and an increased BC/OC risk [19]. This reinforces the need not only to describe but also to characterize other genes associated with HBOC risk.
With the popularization of next-generation sequencing technologies (NGS), genes encoding proteins that work in the homologous recombination DNA repair pathway (HR), as well as in the mismatch repair (MMR) pathway, have been frequently reported as mutated in hereditary BC and OC cases [14,16,20-26]. Most of these genes are not only frequently mutated but have also been considered by NCCN guidelines in the clinical management of patients at risk, since they are associated with a high to moderate penetrance of BC and OC [4].
However, in the Brazilian population, besides BRCA1 and BRCA2, the characterization of other DNA repair genes related to HBOC susceptibility is still in its infancy. The main available data encompass the screening of hotspot variants and microdeletions in CHEK2, PTEN, POLQ and TP53 genes [2,27-30], and to date, only two studies using NGS technology are available in Brazil. Recently, the screening of the whole exome in Brazilian patients negative for BRCA1/BRCA2 pathogenic variants revealed other genes, such as ATM and BARD1, carrying pathogenic variants [26]. Another study using multi-gene screening showed a prevalence of 9.8% of patients carrying BRCA1/BRCA2 pathogenic variants and 4.5% carrying pathogenic variants in ATR, CDH1, MLH1 and MSH6 genes [24].
In this study, we screened 95 samples of patients with clinical suspicion of HBOC syndrome, using a multi-gene panel to sequence both flanking and coding regions of BRCA1, BRCA2 and another 19 DNA repair genes. In addition, 25 samples were tested for BRCA1/BRCA2 copy number variations (CNVs). The molecular screening was performed to identify causal germline variants and to characterize variants of unknown/uncertain significance (VUS) in order to improve the molecular diagnosis. Our data provide a global analysis of the contribution of 21 DNA repair genes to HBOC etiology, adding to the epidemiology of HBOC in Brazil.

Patient samples and clinical data
The individuals evaluated were referred to the Cancer Genetics Counseling Service of the University Hospital of the Ribeirão Preto Medical School of the University of São Paulo (HCFMRP-USP, Ribeirão Preto, Brazil) for cancer risk assessment from 2008 to 2016. A total of 95 unrelated subjects were eligible for further investigation. These individuals had a clinical suspicion of HBOC Syndrome, met the criteria for genetic risk evaluation according to the NCCN Clinical Practice Guidelines in Oncology v.2.2015 [4], presented a cumulative risk of carrying BRCA1 and BRCA2 variants higher than 10% according to the PennII model (https://pennmodel2.pmacs.upenn.edu/penn2/), and had a personal history of cancer.
The clinical and pathologic data were abstracted from medical records of the HCFMRP-USP and included personal and family cancer histories, cancer histology, stage, and receptor status. The College of American Pathologists (CAP) guidelines were used to define progesterone receptor (PR) and human epidermal growth factor receptor 2 (HER2) positivity, whereas for estrogen receptors we used a 10% threshold for positivity [31].
Samples of 28 elderly people (over 70 years old) with no personal history of cancer were used as a control group and had their whole exome sequenced by the Molecular Genetics Laboratory of UNICAMP (Campinas, SP), headed by Dr. Iscia Lopes Cendes, who kindly provided the results. We believe that older people with no personal cancer history constitute a proper control for hereditary cancer studies, since they have passed the age at which hereditary cancer typically develops and have reached old age free of this disease. Therefore, if any variants are found in both the HBOC and elderly cohorts, we discourage further associations with breast and ovarian cancer risk.
Genomic DNA of both the HBOC and elderly cohorts was extracted from whole blood using the Wizard® Genomic DNA Purification Kit (Promega, Madison, WI). The samples were part of the Center for Medical Genomics Biorepository (HCFMRP-USP) and were used for these analyses only after approval by the Ethics Research Committee of the HCFMRP-USP (n. 2819/2016).
The genetic test results from this analysis were returned to study participants, supporting clinical decisions where appropriate.
In order to prioritize a smaller number of variants for further characterization, we filtered the whole set of variants, retaining those classified as pathogenic according to the ACMG/AMP consensus, as well as all the VUS and benign variants (according to VarSome and ClinVar) located in coding and splicing regions, provided they were predicted as damaging/pathogenic by the in silico prediction tools. We decided to keep the benign variants in this set of prioritized variants in order to avoid disregarding variants with a potential effect on the phenotype, since ClinVar and VarSome classifications are not always supported by strong evidence (segregation and functional data).
Hereafter, we at times refer to those variants as presenting conflicting data on pathogenicity.
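
The prioritization rule described above can be sketched as a simple filter. This is only an illustrative sketch, not the pipeline used in the study: the `Variant` record, its field names, and the idea of a single precomputed `predicted_damaging` flag summarizing the in silico tools are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Variant:
    # Hypothetical record; field names are illustrative only.
    gene: str
    classification: str       # "pathogenic", "VUS" or "benign"
    region: str               # e.g. "coding", "splicing", "intronic"
    predicted_damaging: bool  # assumed summary of the in silico predictions


def is_prioritized(v: Variant) -> bool:
    """Apply the two criteria: (1) pathogenic variants are always kept;
    (2) VUS/benign variants are kept only when located in coding or
    splicing regions and predicted as damaging by the in silico tools."""
    if v.classification == "pathogenic":
        return True
    return (
        v.classification in {"VUS", "benign"}
        and v.region in {"coding", "splicing"}
        and v.predicted_damaging
    )


def prioritize(variants):
    """Return the subset of variants that pass the prioritization rule."""
    return [v for v in variants if is_prioritized(v)]
```

Under this sketch, a benign coding variant predicted as damaging would be retained, which mirrors the stated intention of not discarding variants with a potential effect on the phenotype.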

Sanger Sequencing Validation
All samples that presented pathogenic variants, as well as all those carrying variants significantly associated with relative risk of HBOC, were subjected to Sanger sequencing. Briefly, 100 ng of whole blood DNA from individuals carrying those variants was subjected to PCR amplification performed with Taq DNA polymerase (Promega, Madison, WI). The amplification products were sequenced in both directions using BigDye Terminator v3.1 (Life Technologies, Carlsbad, CA) and specific primers for each region, in the ABI 3500XL Genetic Analyzer (Life Technologies, Carlsbad, CA), according to the manufacturer's instructions. Sequencing data were analyzed with the Geneious R7 software v7.1 using the GRCh37/hg19 sequence as reference. Primer sequences are available upon request.

Analysis of CNVs in BRCA1 and BRCA2 genes
To exclude the presence of large insertions/deletions in BRCA1/BRCA2 genes that might not have been detected by NGS, we performed Multiplex Ligation-dependent Probe Amplification (MLPA) analysis for patients who did not present any variants in BRCA1/BRCA2 (n = 12) after the multi-gene panel screening, as well as for those patients carrying variants that were significantly associated with relative risk of HBOC (n = 15). For this purpose, we used the P087-BRCA1 and P090-BRCA2 kits (MRC-Holland, Amsterdam, NH), according to the manufacturer's recommendations. Briefly, the DNA from HBOC patients and control samples was pre-heated to 98°C, and then the salt solution and probe mix were added to the DNA. After the ligation of annealed nucleotides, the targeted genes were amplified using polymerase chain reaction (PCR). PCR products were separated using the ABI3500XL Genetic Analyzer (Applied Biosystems, Foster City, CA), and the fragments were analysed using the Coffalyser software v.140701.0000 (MRC-Holland, Amsterdam, NH).

Screening for the c.156_157insAlu variant in BRCA2
All 95 HBOC samples were screened for the variant c.156_157insAlu in the BRCA2 gene, which was not detected by the multi-gene panel analysis. We performed two rounds of PCR: a first PCR reaction for BRCA2 exon 3 amplification (forward primer: GTCACTGGTTAAAACTAAGGTGGGA and reverse primer: GAAGCCAGCTGATTATAAGATGGTT), and a second PCR specific for Alu fragment amplification (forward primer: GACACCATCCCGGCTGAAA, reverse primer: CCCCAGTCTACCATATTGCAT). The cycling conditions were 94°C for 3 min, 35 cycles at 94°C for 1 min, 52°C for 1 min, and 72°C for 4 min, and a final extension of 72°C for 10 min. For the sample that presented a fragment larger than that expected for BRCA2 exon 3 amplification (around 200 bp), the specific Alu PCR was performed using the same cycling conditions applied for BRCA2 exon 3 amplification. The PCR product was then sequenced in both directions using BigDye Terminator v3.1 (Life Technologies, Carlsbad, CA) and Alu-specific primers in the ABI 3500XL Genetic Analyzer (Life Technologies, Carlsbad, CA), according to the manufacturer's instructions.

Haplotype analysis for high frequency BRCA1 benign variants
We performed a haplotype analysis in order to assess whether five high frequency BRCA1 variants (c.*421G > T, p.Pro871Leu, p.Glu1038Gly, p.Lys1183Arg, and p.Ser1613Gly) were segregating together and were associated with HBOC risk. Based on previous results of our group, which also found these BRCA1 variants at a high frequency in a small HBOC cohort (n = 25, unpublished data), we joined the two HBOC cohorts (n = 94 sequenced in this study, and n = 25 samples previously screened for those variants, totaling n = 119) and also genotyped additional elderly samples for the five BRCA1 SNVs (n = 28 sequenced in this study, and n = 108 additional elderly samples, totaling n = 136) to perform a more accurate statistical analysis.
Additionally, in order to assess the frequency of those five BRCA1 SNVs in other Brazilian populations, we genotyped 94  We applied a TaqMan Allele Discrimination assay (Applied Biosystems, Foster City, CA), using designed probes and primers specific to each BRCA1 variant: c.*421G > T (assay ID: AHX1AK8), p.Pro871Leu (assay ID: C___2287943_10), p.Glu1038Gly (assay ID: C_ 2287888_10), p.Lys1183Arg (assay ID: C___2287889_20), and p.Ser1613Gly (assay ID: C_2615208_20). For each reaction, we used 2 μL of each sample at 5 ng/μL, 5 μL of TaqMan master mix (Applied Biosystems, Foster City, CA), and 0.25 μL (200 nM) of each probe, reaching a final volume of 10 μL, placed in 96-well PCR plates. The cycling conditions were 95°C for 10 min, 40 cycles at 92°C for 15 s and 60°C for 1 min, and 60°C for 1 min, and a final extension at 72°C for 10 min. The amplification was performed using the 7500 Real-Time PCR System (Applied Biosystems, Foster City, CA) and the results were analysed using the manufacturer's software.
Subsequently, we estimated the haplotype frequencies for all samples using the haplo.stats package version 1.7.9 (https://cran.r-project.org/web/packages/haplo.stats/index.html), in the R environment (RStudio, version 1.2.1335). The haplo.stats analysis also estimates the association between haplotypes and the disease, considering a p-value < 0.05 as statistically significant.

Risk association analysis and statistical tests
For the risk association analysis, we compared the allele frequencies found in our HBOC cohort with the allele frequencies of the same variants available in the AbraOM public database, which includes the exome sequencing data of 609 elderly Brazilians [42]. We decided to use public databases instead of the allele frequencies of the elderly samples due to the low number of individuals sequenced. When the allele frequencies in AbraOM were zero, we used the European non-Finnish, Latin American, and African frequencies available in the 1000 Genomes [43] or ExAC [44] databases. We performed an odds ratio (OR) analysis applying Fisher's exact test. The p-values were assessed using Pearson's χ² test.
For assessing the clinical and molecular associations, we applied Pearson's χ² test.
For these two analyses we used the R commander [45] tools in the R environment (RStudio, version 1.2.1335) and considered results as statistically significant at a p-value of 0.05 or less.
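
As an illustration of the kind of comparison described above, the following minimal sketch computes an odds ratio and a two-sided Fisher's exact p-value from a 2x2 table of allele counts. It is not the R commander workflow used in the study; the function names are hypothetical and the example counts in the usage note are invented for illustration.

```python
from math import comb


def odds_ratio(a, b, c, d):
    """Odds ratio for a 2x2 table of allele counts:
    a = variant alleles in cases,    b = reference alleles in cases,
    c = variant alleles in controls, d = reference alleles in controls.
    Returns infinity when the denominator is zero."""
    if b == 0 or c == 0:
        return float("inf")
    return (a * d) / (b * c)


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test: sum the hypergeometric probabilities
    of all tables (with the same margins) that are no more likely than
    the observed one."""
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2

    def prob(x):
        # Probability of x variant alleles in cases given fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    total = sum(
        prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-7)
    )
    return min(1.0, total)
```

For instance, with invented counts of 3 variant alleles among 188 case alleles and 2 among 1218 database alleles, one would call `odds_ratio(3, 185, 2, 1216)` and `fisher_exact_two_sided(3, 185, 2, 1216)`. When the database count is zero the odds ratio is undefined, which is why the sketch returns infinity in that case.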
For the survival (Kaplan-Meier) analysis, we used the log-rank test for trend and the Mantel-Cox test, as recommended by GraphPad Prism 8.1.2. We also assessed the results with the Gehan-Breslow-Wilcoxon test.
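
For readers unfamiliar with the Kaplan-Meier estimator underlying this analysis, a minimal sketch is given below. It is not the GraphPad Prism implementation and does not include the log-rank comparison between groups; the function name and the input encoding (1 = death observed, 0 = censored) are assumptions made for the example.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier survival estimate.

    times  : follow-up durations (e.g. years since diagnosis)
    events : 1 if death was observed at that time, 0 if censored
    Returns a list of (time, survival probability) pairs, one for each
    distinct time at which at least one death occurred.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all subjects sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        # Both deaths and censored subjects leave the risk set.
        at_risk -= removed
    return curve
```

Censored subjects contribute to the risk set until their last follow-up but never lower the curve, which is the property that distinguishes this estimator from a simple proportion of survivors.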

Patients clinical characterization
Most patients (n = 84) were diagnosed with breast cancer, showing a prevalence of 82.4% (n = 80) of Invasive Ductal Carcinoma (IDC) (Additional file 1: Table S1). The Luminal and Triple-negative (TN) subtypes were the most frequent molecular subtypes, accounting for 33.3% and 28.6% of BC cases, respectively. In general, most of the patients (n = 65) presented tumors of intermediate to high grades (2 and 3), regardless of the age at diagnosis. Only six patients (6.3%) were diagnosed with ovarian cancer, half of which were serous ovarian cancer (Table 1, and Additional file 1: Table S1). One patient presented with diffuse gastric cancer (the only man in our cohort) and another with endometrial adenocarcinoma, and both had a strong history of breast and ovarian cancers in their families. Only one case presented with both asynchronous BC and OC. Most of the cases (85.3%) were diagnosed between 22 and 49 years, and 13.6% (n = 13) died due to distant metastasis (Table 1).

Multi-gene panel screening
We identified 667 single nucleotide variants (SNVs) and small insertions/deletions in 94 out of 95 samples screened for variants in the coding and flanking regions of 21 DNA repair genes. One sample was excluded due to generally low base-calling quality. We then prioritized variants by filtering them according to the following criteria: (1) variants classified as pathogenic according to the ACMG/AMP consensus, and (2) VUS and benign variants present in coding and splicing regions and predicted as damaging/pathogenic by the in silico prediction tools. This filtering aimed to select the possible candidate variants without losing variants of unknown significance (VUS), which were not yet characterized but may exert some effect on the phenotype. We selected 82 variants in 19 genes, with RAD50 and PTEN presenting no possible candidate variants (Table 2). Considering these prioritized variants, about 81% of the patients presented variants in the BRCA1 gene, although genes such as ABRAXAS1, ATM, BRCA2 and UIMC1 also emerged as presenting a high frequency of variants in our cohort. Only 3% of the prioritized variants are described in the breast (TP53 and MLH1 variants) and ovarian cancer (BRCA2 variant) samples of The Cancer Genome Atlas database (TCGA) (https://www.cbioportal.org/), which is expected since the publicly available data on TCGA comprise solely somatic variants. Figure 1 shows the most prevalent variants detected in the studied samples. About 11.2% (n = 9) were frameshift, stop gain, insertion or missense variants previously described as pathogenic in BRCA1, BRCA2 and TP53 genes, with a prevalence of 23.4% (n = 22). The most prevalent pathogenic variant was the frameshift p.Gln1756Profs*74 (c.5266dupC) in the BRCA1 (ENSP00000350283.3) gene, present in half of the cases that exhibited BRCA1 mutations (n = 11), followed by the variant p.Arg337His (c.1010G > A) in TP53 (ENST00000269305.8), found in another 5 patients.
Our results also introduce the first report of two known pathogenic variants in the Brazilian population: the p.Tyr3009Serfs*7 (c.9026_9030delATCAT) in BRCA2, and p.Arg273His.
In regard to BRCA1 and BRCA2 genes, we also identified five benign variants in the BRCA1 gene presenting a high frequency in our HBOC cohort: the 3'UTR c.*421G > T, p.Pro871Leu (c.2612C > T), p.Glu1038Gly (c.3113A > G), p.Lys1183Arg (c.3548A > G), and p.Ser1613Gly (c.4900A > G). Based on previous results of our group, which also found those variants at a high frequency in a small HBOC cohort (unpublished data), we sought to investigate whether those variants were segregating together and whether they were associated with an increased HBOC risk. Haplotype analysis with the haplo.stats package identified 5 haplotypes with frequencies above 1% (Table 3). Haplotype 2, with all five SNVs, was the second most frequent haplotype found (24.8%) in our study. However, this haplotype was significantly more frequent in the elderly cohort (p = 0.020), and was not associated with an increased HBOC risk. To further investigate whether there is any correlation between BRCA1 haplotypes and HBOC risk, we performed the haplotype analysis using HBOC and control samples from another three cancer centers in Brazil: Porto Alegre Clinical Hospital (HPOA), A.C. Camargo Cancer Center (ACC) and Barretos Cancer Hospital (HCB). Haplotype analysis results were similar for all three centers. Haplotype 2 (Table 3) was not significant in the other three centers (haplotype in red, Additional file 2: Table S2), but also showed a higher frequency in the control group, suggesting no correlation with an increased risk of HBOC Syndrome. Since both the variants and the haplotypes were present in the elderly and other control samples, we suggest that, despite segregating together, those variants may merely constitute part of a polymorphic region and are not associated with hereditary cancer risk.
About 12.8% (n = 12) of the patients did not present any variants in the BRCA1/BRCA2 genes (Fig. 1, and Additional file 1: Table S1). Most cases (76.6%) presented missense VUS or benign missense variants according to VarSome and ClinVar that were nevertheless predicted as pathogenic by the in silico prediction tools, which may hinder clinical interpretation and risk estimation during genetic counselling for carriers. The association study with these variants identified 8 genes carrying 13 variants significantly associated with an increased risk of HBOC when compared to the allele frequencies described in public databases. Genes such as BARD1, CHEK2, PALB2 and PMS2 presented more than one variant associated with risk (Fig. 2).
The prevalence of variants associated with HBOC was about 16% (n = 15), and most of them (n = 13) were present in double heterozygosity with variants with conflicting data on pathogenicity in BRCA1/BRCA2. BARD1, CHEK2, PALB2 and PMS2 presented more than one variant associated with risk (Fig. 3), and the variant p.Ala617Thr (c.1849G > A) in the CDH1 gene presented the highest allele frequency (AF = 0.01595745). One patient presented a pathogenic variant in BRCA1 in double heterozygosity with one BARD1 prioritized variant (Fig. 1, and Table 2).
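
The reported allele frequency can be reproduced with a short calculation. The sketch below assumes, as an illustration only, that the three carriers of p.Ala617Thr mentioned in the Discussion are all heterozygous and that the denominator is the 94 successfully sequenced samples; the function name is hypothetical.

```python
def allele_frequency(het_carriers, hom_carriers, n_individuals):
    """Allele frequency in a diploid cohort: each heterozygous carrier
    contributes one variant allele, each homozygous carrier two, out of
    2 * n_individuals alleles in total."""
    return (het_carriers + 2 * hom_carriers) / (2 * n_individuals)


# Assumed scenario: 3 heterozygous carriers among 94 sequenced patients.
af = allele_frequency(3, 0, 94)  # 3 / 188, approximately 0.01595745
```

Under these assumptions the result matches the reported AF of 0.01595745, i.e. 3 variant alleles out of 188.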
All patients carrying variants associated with an increased risk, as well as those who did not present any BRCA1/BRCA2 variants, tested negative for BRCA1/BRCA2 CNVs.
As expected, in the elderly cohort we identified only a small number of coding variants classified as pathogenic or of uncertain significance (VarSome and ClinVar) when looking at the 21 genes screened in our HBOC cohort (Fig. 4). Moreover, none of the variants described in the HBOC patients were found in the elderly samples used as control. Despite the small sample size available for the elderly cohort, our data support the use of this cohort as a proper control in hereditary cancer studies.

Clinical characteristics of germline variants-carriers
The prevalence of pathogenic variants in BRCA1 and BRCA2 was about 18% (n = 17), with only four patients presenting BRCA2 pathogenic variants. We observed that 90% of carriers of BRCA1 pathogenic variants presented with high grade tumors (grade 3), while about 80% of BRCA2 carriers presented with tumors of grades 1 and 2. Additionally, most BRCA1-variant carriers were diagnosed with triple negative BC (Fig. 1). The non-BRCA1/BRCA2 group also presented a high frequency of intermediate to high grade tumors (grades 2 and 3) (Fig. 1, Table 1), which may suggest that other genes are associated with moderately to poorly differentiated tumors, as is known for BRCA1/BRCA2 carriers [50]. The presence of metastasis was strongly correlated with death (p = 7.85e-12), since 13 out of 14 patients who died presented distant metastasis. We did not find any association between tumor clinical staging and the genotypes.
A total of 12 individuals (12.8%) did not present any variants or CNVs in BRCA1/BRCA2 and were grouped as non-BRCA1/BRCA2 patients. This group presented variants in ABRAXAS1, ATM, ATR, BARD1, CDH1, MLH1, MSH6, PMS2, TP53 and UIMC1 genes. All non-BRCA1/BRCA2 patients were BC cases, showing a median age at diagnosis of 36.5 years and a median survival of 8 years (Table 1). However, we did not observe any association between death and the genotype of the patients. Surprisingly, the patients who presented pathogenic variants in BRCA1/BRCA2 showed a trend towards better survival, with most of the patients who died being those who presented VUS, benign or no variants in BRCA1/BRCA2 genes (Fig. 5).

Discussion
Genes such as BRCA1, BRCA2 and TP53 presented pathogenic variants in 23.4% (n = 22) of the investigated cases. The only study with a multi-gene analysis in Brazil has shown genes such as BRCA1, BRCA2, ATM, ATR, MLH1, MSH2 and MSH6 carrying pathogenic variants but with a much lower frequency (9.5%) [24].
The most prevalent variant was the frameshift p.Gln1756Profs*74 (c.5266dupC) in BRCA1, identified in 11.7% of patients. This variant was also described in the study of Timoteo et al. (2018) [24], but with a frequency of only 3%. This variant is commonly found in South American populations and is well described in Brazil, especially in ovarian cancer cases [51,52], although it was found only in breast cancer cases in our HBOC cohort. It is an Ashkenazi Jewish founder variant and is very common among North European populations [53]. This may explain the high frequency found in the Southeast of Brazil, which is marked by a strong European ancestry [54].
Four patients presented the following variants in the BRCA2 gene: p.Ala938Profs*21; p.Tyr3009Serfs*7; p.Arg3128Ter; and the third most common variant within the Brazilian population, c.156_157insAlu. Alu retroelements are fragments of approximately 300 nucleotides that have been reported as inserted in many genes, such as BRCA1 and BRCA2, and are related to an increased cancer risk [55,56]. The Alu insertion in BRCA2 exon 3 was first reported by Teugels et al. (2005) [57] as a Portuguese founder variant in HBOC patients, and due to Portuguese immigration during the Brazilian colonization, this variant is frequently found in Brazilian populations [55]. The pathogenicity of this insertion is attributed to exon 3 skipping, which causes the loss of the PALB2 and RAD51 binding region, essential to homologous recombination repair [48].
Fig. 2 Association analysis of 72 prioritized variants with conflicting data on pathogenicity to HBOC risk. The risk association analyses were performed comparing the allele frequencies identified in our HBOC cohort to frequencies found in public databases (*) AbraOM, ExAC and 1000 Genomes. In ClinVar status ($), B = Benign; LB = Likely Benign; US = Uncertain Significance; P = Pathogenic; Conflicting = when presenting conflicting interpretations of pathogenicity. The association was made using Fisher's exact test, and the p-values were assessed using Pearson's χ² test. The lack of allele frequencies in the databases made us unable to estimate the odds ratios (OR). The variants in red are those significantly associated with HBOC risk. NA = Not available (allele frequencies not reported by any population database, or when it was not possible to calculate the p-value due to the lack of allele frequency in the population databases)
Five patients also presented the pathogenic variant p.Arg337His in the TP53 gene. This is a founder variant of South Brazil, known to segregate in families with sarcomas, adrenocortical and choroid plexus carcinomas, and early-onset breast cancer [30,58]. It is located in the oligomerization domain of p53 and, in addition to the segregation studies, it has been shown that this variant is associated with decreased oligomerization and transcriptional activities of p53 [59,60].
Fig. 3 Schematic representation of BARD1, CHK2, PALB2 and PMS2 proteins and the variants associated with increased risk of HBOC. (a) Linear representation of BARD1 protein depicting the RING, Ankyrin (ANK), and BRCT domain boundaries [46], and the three variants found in that gene; (b) CHK2 depicting the SQ/TQ cluster domain (SCD), forkhead-associated domain (FHA), and the kinase domain (KD) [47], showing the localization of the two variants identified in that gene; (c) PALB2 protein with its main domains depicted: coiled coil, ChAM, MRG15-binding domain I and II (MBD I and II), WD40 repeats domain, and the nuclear export signal (NES) [48], showing the variants found as significantly associated with HBOC risk; and (d) PMS2 with its ATP and MLH1 binding domains, and its endonuclease domain [49], depicting the variants identified in that gene. The graphs were built using the lolliplot function of the GenVisR package, in the R environment (RStudio, version 1.2.1335), and were adapted by the authors
However, about 76.6% of the cases presented VUS and variants with conflicting data on pathogenicity in BRCA1/BRCA2 as well as in other investigated genes, based on data from VarSome, ClinVar or the pathogenicity tools employed herein. In this group we found one patient carrying the previously undescribed variant p.Pro932Thr (c.2794C > A) in the ATR gene, which is predicted as pathogenic/possibly pathogenic by all in silico tools used in this study. This patient also presented variants in other genes such as BRCA1, UIMC1 and MLH1. For those cases that did not present any pathogenic variant, we observed a high frequency of the five BRCA1 benign variants: the 3'UTR c.*421G > T, p.Pro871Leu (c.2612C > T), p.Glu1038Gly (c.3113A > G), p.Lys1183Arg (c.3548A > G) and p.Ser1613Gly (c.4900A > G). As shown in Table 3, these variants were segregating together, and constituted the second most frequent haplotype found in this study. Despite this, the haplotype containing the five SNVs was significantly more frequent in the elderly cohort (29.2%) than in HBOC cases (19.9%) (p = 0.020), which suggests that these variants are not associated with an increased risk of HBOC. Indeed, four of these variants were previously described as presenting a high frequency in a healthy cohort in an ethnicity-dependent manner, with p.Pro871Leu associated with high African and European ancestry, and p.Glu1038Gly, p.Lys1183Arg, and p.Ser1613Gly associated with the Central Asiatic ethnic component [61]. This may explain the high frequency of these variants in the studied population.
The genes ABRAXAS1, UIMC1 and ATM also presented a high frequency of missense variants in our HBOC cohort. About 66% of the patients carry the variant p.Ala348Thr (c.1042G > A) in ABRAXAS1, which is not characterized by ClinVar but is predicted as pathogenic by 3 in silico tools. The allele frequency for this variant was 0.4 in our cohort, and population databases describe p.Ala348Thr with a MAF = 0.34 in Brazil [42] and MAF = 0.42 worldwide [62], which corroborates the ACMG/AMP classification of p.Ala348Thr as a benign variant. The p.Pro435Leu (c.1304C > T) variant in UIMC1 is another VUS not described in ClinVar that presented a high allele frequency (0.10) in our HBOC cases. It also has a high MAF in the population databases (0.12 [42] and 0.24 [62]). Together with Abraxas, RAP80 (the protein encoded by UIMC1) is part of the BRCA1-A complex, which is important for recruiting BRCA1 to double-strand break (DSB) sites [63], and studies have shown that truncating variants in both proteins are associated with increased irradiation sensitivity, deficient BRCA1 recruitment to DSB sites and genomic instability [64][65][66][67]. Three patients who carried only these two variants were evaluated for BRCA1/BRCA2 CNVs and all tested negative. Due to their high allele frequency, these variants are classified as benign by the ACMG/AMP criteria. However, a more accurate characterization is needed to establish their clinical significance, since neither has been characterized yet and we cannot rule out their contribution to risk, for example under a polygenic inheritance pattern.
Another gene that presented high frequency of variants was ATM (Fig. 1). About 16.8% out of the patients that presented variants in ATM carried the variant p.Asp1853Asn (c.5557G > A), characterized as benign by ClinVar and VarSome. Studies with this variant have shown that it is not associated with an increased risk to HBOC [68].

[Figure legend] Conflicting data on pathogenicity refers to VUS and benign variants that were predicted as pathogenic by the in silico tools. BRCA1/BRCA2 pathogenic n = 17, BRCA1/BRCA2 benign and with conflicting data on pathogenicity n = 65, non-BRCA1/BRCA2 n = 12. We did not find any significant difference between the genotypes (Logrank test for trend, p = 0.3439).
We also observed a high frequency of missense variants in MMR genes, especially PMS2 and MSH2, which were mutated in 19% and 10% of the cases, respectively (Fig. 1). Although truncating variants in those genes are the cause of Lynch Syndrome (LS), it is common to find an overlap between HBOC and LS cases, since both syndromes are well known for predisposition to BC and OC [69]. Many studies have reported MMR genes as being associated with an increased risk of HBOC [70][71][72] and, indeed, they have been taken into account by NCCN guidelines for the clinical management of patients at risk of hereditary BC and OC [4,73].
However, most patients (76.6%) carry missense VUS or variants presenting conflicting data on pathogenicity. The association analysis based on Brazilian [42] and worldwide public databases [62] revealed 13 variants in ABRAXAS1, BARD1, BRCA2, CDH1, CHEK2, MLH1, PALB2 and PMS2 genes associated with HBOC, with a prevalence of 15.9% (Fig. 2). The variant p.Ala617Thr (c.1849G > A) in the CDH1 gene was the most frequent among the studied cases. Unlike the other genes, CDH1 encodes the adhesion protein E-cadherin, and variants in this gene are associated with defects in cell adhesion, an increase in invasive activity and, consequently, metastasis [74]. CDH1 truncating variants are associated with risk of diffuse gastric cancer and, in fact, one patient presented a family history of gastric cancer; however, all three cases presented BC or fulfilled NCCN criteria for HBOC risk. This variant has been previously described in the Brazilian population as pathogenic [24,75], but functional assays with cells expressing the mutated protein have shown wild type morphology and normal proliferation and migration activities [76], which suggests this variant may not lead to protein truncation.
BARD1 was the gene that presented the most variants associated with HBOC risk. BARD1 forms heterodimers with BRCA1, playing an important role both as an E3 ubiquitin ligase and as a homologous repair mediator by recruiting RAD51 to DSB sites [77]. Variants in this gene have been associated with a deficiency in HR and increased sensitivity to DNA damage, classifying BARD1 as a gene of moderate penetrance for BC and OC [23,77,78,79]. All three associated variants are described as VUS in ClinVar, but p.Asn255Ser (c.764A > G) and p.Lys423Arg (c.1268A > G) lack studies characterizing their effects on protein function. Indeed, this is the first study reporting both variants in an HBOC cohort from Brazil. The third variant, p.Leu239Gln (c.716 T > A), has been described in the North American population and was also characterized as a VUS [80]. Despite being predicted as likely benign by VarSome, p.Leu239Gln and p.Asn255Ser are predicted as pathogenic by 2 out of 6 in silico tools and are located between the RING and ANK domains of BARD1 (Fig. 3a). RING is the BRCA1 binding region and is important for heterodimer formation [81]. p.Leu239Gln was found in double heterozygosis with the pathogenic variant p.Trp1836Ter in BRCA1, but p.Asn255Ser was identified in a non-BRCA1/BRCA2 BC patient. The p.Lys423Arg variant is located in the ANK domain, which plays an important role in apoptosis activation due to p53 binding [82]. Despite ANK not being related to the DNA repair process, the evaluation of variants located between amino acids 460-560 has shown an HR deficiency, demonstrating that this domain is also important for correct DNA repair [77]. In fact, three in silico tools classified this variant as pathogenic; however, functional or segregation analyses are required to confirm the suggested pathogenic effect of those variants.
The role of BRCA1/BRCA2 genes in HBOC pathogenesis is already well characterized. The VUS p.Met2775Arg (c.8324 T > G) in BRCA2 was identified in one BC patient in double heterozygosis with other associated variants such as p.Arg137Gln in CHEK2 and p.Val717Met in PMS2. p.Met2775Arg has been described in prostate cancer cases and is characterized as possibly pathogenic by 4 in silico prediction tools, although it does not affect a conserved residue [83,84]. It is located in the C-terminal region of the BRCA2 protein, which is important for single strand DNA binding as well as for delivering RAD51 molecules to DSB sites, allowing for correct homologous recombination repair [85]. This indicates that the integrity of this region is essential for correct HR. Taking into account that this patient presented three other variants significantly associated with HBOC, we suggest this genotype may have an additive effect on breast cancer risk in this case.
The CHEK2 gene also presented two variants associated with risk (Fig. 3b). Chk2 plays an important role in signalling DNA damage by phosphorylating effector proteins such as BRCA1 [86]. Both variants, p.Arg137Gln and p.Ile160Met, are located in the FHA domain (Fig. 3b), which, after Chk2 phosphorylation and KD domain activation, binds to the SCD domain of another activated Chk2 protein, forming dimers that convert into active monomers, signalling the DNA damage [87]. p.Arg137Gln and p.Ile160Met are predicted as pathogenic/possibly pathogenic by two and four in silico tools, respectively. However, functional analyses have shown that p.Arg137Gln is not associated with protein instability and HR deficiency [88][89][90], which corroborates its probable benign classification by VarSome and ClinVar. On the other hand, p.Ile160Met is a VUS that has been related to a moderate HR deficiency [91], and in fact, carriers of the p.Ile160Met variant presented a worse clinical condition in this study, with bilateral BC and death after pulmonary, bone and hepatic metastases. Due to its localization and these clinical features, we suggest that p.Ile160Met may play a role in the risk of HBOC.
Besides presenting the most frequent variant found in this HBOC cohort, ABRAXAS1 also presented the p.Arg163Ser (c.489G > T) variant as being significantly associated with HBOC relative risk (Fig. 2). This variant is a VUS according to VarSome; it is not described by ClinVar but is characterized as pathogenic by 5 out of 6 prediction tools. p.Arg163Ser is located in the Pad1 domain in the N-terminal region of ABRAXAS, an important binding domain for RAP80 and other signalling proteins [92]. Both proteins are mandatory for BRCA1 recruitment to DSB sites, and variants affecting that region of ABRAXAS may affect correct DSB signalling [64,93].
The synonymous variant p.Glu102Glu (c.306G > A) in MLH1 is predicted as likely benign by VarSome and is characterized as a VUS by ClinVar, but was associated with HBOC risk (Fig. 2). It affects a splicing region at the end of MLH1 exon 3. Due to this, p.Glu102Glu is predicted as pathogenic by all in silico tools that return pathogenicity scores for synonymous variants (CADD, UMD-Predictor and MutationTaster). This variant is also described in BC samples of TCGA. Although the publicly available data in TCGA comprise solely somatic variants, this may corroborate the association with increased HBOC risk. The patient carrying this variant was a BC case who also presented other benign variants in MLH1 and BRCA1, a VUS in UIMC1, as well as the novel variant p.Pro932Thr in ATR. As previously described, truncating variants in MMR proteins are known for increasing the risk of both BC and OC [70][71][72]. However, there is no further evidence of the deleteriousness of this variant.
Regarding the PALB2 gene, two N-terminal variants were found to be associated with HBOC risk. Although PALB2 biallelic mutations are associated with Fanconi Anemia, heterozygous variants are known to confer a moderate risk of BC [48,94]. According to VarSome, p.Arg18Lys (c.53A > G) is a VUS; it also presents conflicting interpretations of pathogenicity in ClinVar and is predicted as pathogenic by 3 in silico tools. It is located in the PALB2 coiled coil domain (Fig. 3c), the BRCA1 binding region, but studies have shown that this variant does not affect the PALB2-BRCA1 interaction, although it promotes a reduction in HR activity [95]. This variant was found in two BC patients, with one case being a triple-negative subtype (TNBC) (Table 2 and Additional file 1: Table S1). The p.Thr317Pro (c.949A > C) variant is a VUS identified in a TNBC case that presented lymph node metastasis. It is located near the DBD domain, which is important for PALB2 DNA binding [48] (Fig. 3c), but unlike p.Arg18Lys, there is no report of this variant in other studies, and it is characterized as possibly pathogenic by two prediction tools. Recently, a study encompassing the functional characterization of 44 PALB2 missense variants showed that neither variant affects the evaluated PALB2 protein functions [96].
The last risk-associated gene was PMS2, which presented two C-terminal variants located in the MutL domain that, together with the N-terminal region, constitutes the MLH1 binding region (Fig. 3d). This region is important for the formation of MutLα heterodimers, which are necessary for correct excision of the mismatched DNA fragment [97]. The p.Val717Met (c.2149G > A) variant is a VUS that presents conflicting information on pathogenicity in the ClinVar database, and only AlignGVGD does not predict it as pathogenic. Functional assays have demonstrated protein stability and MMR proficiency; however, the samples carrying this variant presented microsatellite instability [98]. The p.Asp792Asn (c.2374G > A) variant was identified in a diffuse gastric cancer patient, the only man in our cohort, who died 3 years after the diagnosis. It has been described as presenting a moderate decrease in mismatch repair activity [99], which corroborates our association analysis. Due to this, we suggest that these variants may be related to increased risk of HBOC, but segregation studies and functional characterization are mandatory to assess the contribution of these variants to HBOC etiology.
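The association analysis discussed for the variants above compares how often a variant allele is seen in cases with how often it is seen in a reference population. A minimal sketch of one common way to do this is given below, using an odds ratio and a two-sided Fisher's exact test on a 2x2 table of allele counts. The choice of Fisher's exact test is an assumption made for illustration and is not taken from this section; the function names and the allele counts are likewise hypothetical and do not reproduce the study data.

```python
from math import comb


def odds_ratio(case_alt, case_ref, ctrl_alt, ctrl_ref):
    """Odds ratio for a 2x2 allele-count table.
    Adds 0.5 to every cell (Haldane-Anscombe correction) if any cell is zero."""
    cells = [case_alt, case_ref, ctrl_alt, ctrl_ref]
    if 0 in cells:
        cells = [c + 0.5 for c in cells]
    a, b, c, d = cells
    return (a * d) / (b * c)


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the table [[a, b], [c, d]].
    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more likely than the observed one."""
    row1, col1, n = a + b, a + c, a + b + c + d
    denom = comb(n, col1)

    def prob(x):
        return comb(row1, x) * comb(n - row1, col1 - x) / denom

    p_obs = prob(a)
    lo = max(0, col1 - (n - row1))
    hi = min(row1, col1)
    total = sum(prob(x) for x in range(lo, hi + 1)
                if prob(x) <= p_obs * (1 + 1e-7))
    return min(1.0, total)


# Hypothetical allele counts (assumption, for illustration only):
# cases: 3 variant alleles out of 190 chromosomes;
# reference database: 2 variant alleles out of 2000 chromosomes.
or_value = odds_ratio(3, 187, 2, 1998)
p_value = fisher_exact_two_sided(3, 187, 2, 1998)
print(f"odds ratio = {or_value:.2f}, p = {p_value:.4g}")
```

An exact test is used in this sketch because rare variants produce small cell counts, where chi-square approximations are unreliable. A statistical association of this kind does not by itself establish pathogenicity, which is why segregation and functional analyses remain necessary.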

Conclusions
Our study is the third multi-gene screening in HBOC patients in the Brazilian population, showing a higher frequency of pathogenic variants than previously reported [24]. In addition, our work expands the landscape of variants linked to HBOC syndrome in the Brazilian population and provides the first report of the novel ATR missense variant p.Pro932Thr (c.2794C > A). This study also presents a descriptive characterization of variants found in HBOC patients, showing that about 16% of patients carry variants significantly associated with HBOC risk, and constitutes the first report of missense variants in ABRAXAS1, BARD1, BRCA2, CHEK2, PALB2 and PMS2 in Brazil. Although segregation analyses and functional characterization are mandatory to confirm the deleteriousness of the variants described here, these results bring insights into the contribution of other genes to HBOC pathogenesis. Our data also add epidemiologic information about the prevalence of germline variants in DNA repair genes in the Brazilian population, which, together with further characterization, will help guide clinical decisions and risk assessment for patients at increased risk of HBOC in the future.
Additional file 1: Table S1. Clinical characterization of the HBOC cohort. NI = Not-informed; CNS = Central Nervous System. The tumor size, lymph node staging and the metastasis status are reported according to MOC Brazil guidelines for tumor staging (https://mocbrasil.com/).

Additional file 2: Table S2. Haplotype estimation for five high frequency BRCA1 SNVs in three different Cancer Centers in Brazil. Here we show only haplotypes with frequencies higher than 1%. In red, the haplotype identified as significantly more frequent in the elderly cohort, in our HBOC cohort analysis. In bold, the haplotype that was significantly more frequent in the control group of all three other Brazilian Cancer Centers. HPOA = Hospital das Clínicas de Porto Alegre, Porto Alegre, RS, Brazil; ACC = A.C. Camargo Cancer Center, São Paulo, SP, Brazil; HCB = Barretos Cancer Hospital, Barretos, SP, Brazil. Hp = estimated haplotypes. Hap. freq. = haplotype frequency. pvalue = haplotype score statistic p-value calculated by Haplo.stats. NA = when the haplotype score statistic p-value could not be calculated.