  • Research article
  • Open access

Clinically relevant combined effect of polygenic background, rare pathogenic germline variants, and family history on colorectal cancer incidence


Background and aims

Summarised in polygenic risk scores (PRS), the effect of common, low-penetrance genetic variants associated with colorectal cancer (CRC) can be used for risk stratification.


To assess the combined impact of the PRS and other main factors on CRC risk, 163,516 individuals from the UK Biobank were stratified by (1) carrier status for germline pathogenic variants (PV) in CRC susceptibility genes (APC, MLH1, MSH2, MSH6, PMS2), (2) low (< 20%), intermediate (20–80%), or high (> 80%) PRS, and (3) family history (FH) of CRC. Multivariable logistic regression and Cox proportional hazards models were applied to compare odds ratios and to compute the lifetime incidence, respectively.


Depending on the PRS, the CRC lifetime incidence for non-carriers ranges between 6 and 22%, compared to 40% and 74% for carriers. A suspicious FH is associated with a further increase of the cumulative incidence reaching 26% for non-carriers and 98% for carriers. In non-carriers without FH, but high PRS, the CRC risk is doubled, whereas a low PRS even in the context of a FH results in a decreased risk. The full model including PRS, carrier status, and FH improved the area under the curve in risk prediction (0.704).


The findings demonstrate that CRC risks are strongly influenced by the PRS for both a sporadic and a monogenic background. FH, PV, and common variants contribute to CRC risk in a complementary manner. The implementation of PRS in routine care will likely improve personalized risk stratification, which will in turn guide tailored preventive surveillance strategies in high, intermediate, and low risk groups.



Colorectal cancer (CRC) is the fourth leading cancer-related cause of death worldwide. Major established exogenous risk factors are summarized as Western lifestyle [1]. However, an inherited disposition contributes significantly to the disease burden since up to 35% of interindividual variability in CRC risk has been attributed to genetic factors [2, 3].

Around 5% of CRC cases occur on the basis of a monogenic, Mendelian condition (hereditary CRC), in particular Lynch syndrome (LS) and various gastrointestinal polyposis syndromes. Here, predisposing rare, high-penetrance pathogenic variants (PV, constitutional/germline variants) result in a considerable cumulative lifetime risk of CRC and a syndrome-specific spectrum of extracolonic tumors. The autosomal dominant inherited LS is by far the most frequent type of hereditary CRC, with an estimated carrier frequency in the general population of 1:300–1:500 [4,5,6]. It is caused by a heterozygous germline PV in one of the mismatch repair (MMR) genes MLH1, MSH2, MSH6, or PMS2 or, in a few cases, by a large germline deletion of the EPCAM gene upstream of MSH2. The most frequent Mendelian polyposis syndrome is the autosomal dominant Familial Adenomatous Polyposis (FAP), caused by heterozygous germline PV in the tumor suppressor gene APC, followed by the autosomal recessive MUTYH-associated polyposis (MAP), which is based on biallelic germline PV of the base excision repair gene MUTYH [7, 8]. However, even in such monogenic conditions, the inter- and intrafamilial variability in penetrance and phenotype is striking, pointing to modifying exogenous or endogenous factors. Heterozygous (monoallelic) MUTYH germline PV may be associated with a slightly increased CRC risk [9, 10]; the carrier frequency in northern European populations is estimated to be 1:50–1:100 [4].

Approximately 20–30% of CRC cases are characterized by a suspicious, but unspecific familial clustering of CRC (familial CRC). Around 25% of CRC cases occur before 50 years of age (early-onset CRC); in around one quarter of those a hereditary type (mainly LS) has been identified [11]. Although further high-penetrance candidate genes have been proposed [12,13,14], the majority of familial and early-onset cases cannot be explained by monogenic subtypes and instead are thought to result from a multifactorial/polygenic etiology including several moderate-/intermediate-penetrance risk variants and shared environmental/lifestyle factors. A positive family history (FH) in first- and second-degree relatives increases the risk of developing CRC by 2- to 9-fold [15, 16], which underpins the hypothesis of shared genetic and non-genetic risk factors.

A variety of models to predict CRC risk have been developed and evaluated, which include clinical data, FH, lifestyle factors, and genetic information [17]. For more than a decade, genome-wide association studies (GWAS) in large unselected CRC cohorts have identified an increasing number of common, low-penetrance risk variants, mainly single nucleotide polymorphisms (SNPs), which are significantly associated with CRC risk [18,19,20,21]. Each SNP risk allele individually contributes only little to CRC risk (OR 1.05 to 1.5). However, summarised in quantitative polygenic risk scores (PRS), the combined effect might explain a substantial fraction of CRC risk variability and can identify individuals at several times lower and greater risk than the general population [22,23,24].

As such, it is expected that the genetic background defined by the common risk variants may not only influence the occurrence of late-onset sporadic cases, but also modulate the risk of familial, early-onset, and hereditary CRC [25]. Recent studies demonstrated that high PRS values are associated with an increased risk of CRC and other common cancers in the general population up to an order of magnitude that is almost similar to hereditary tumor syndromes [26, 27].

Based on these data, it can be hypothesized that the identification of common genetic CRC risk variants not only provides deep insights into the biological mechanisms and pathways of tumorigenesis, but could also improve personalized risk stratification for sporadic, familial/early-onset, and hereditary CRC in the future. Implementing SNP-based PRS screening in routine patient care would in turn guide tailored preventive strategies in high, moderate, and low risk groups.

However, even if previous studies provide promising results for a clinical benefit of a PRS-based personalized risk stratification, the impact of common risk factors and their interplay with high-penetrance variants and other unspecified factors, captured partly by the FH, still has to be improved and validated in additional patient cohorts.

In the present work, we compare the prevalence and the lifetime risk of CRC among 163,516 individuals from a population-based European repository (UK Biobank, UKBB). Individuals were stratified according to three major risk factors: (1) their carrier status for rare, high-penetrance PV in CRC susceptibility genes, (2) their PRS, and (3) their FH of CRC.


Data source

UK Biobank (UKBB) genetic and phenotypic data were used in this study. UKBB is a long-term prospective population-based cohort study that has recruited volunteers mostly from England, Scotland, and Wales, with over 500,000 participants aged 40 to 69 years at the time of recruitment. For each participant, extensive phenotypic and health-related data is available; genotyping data is accessible for 487,410 samples, and exome sequencing data is available for 200,643 people. All participants gave written consent, and the dataset is available for research. UKBB provided follow-up information by linking health and medical records [28].

Study participants

CRC cases were defined based on a self-reported code of 1022 or 1023 (in data field 20001), an ICD-10 code of C18.X, C20.X, D01.[0,1,2], or D37.[4,5], or an ICD-9 code of 153.X or 154.[0,1] (in hospitalization records). Control samples were those that had no previous diagnosis of any cancer. The study includes people of all ethnicities. Outliers for heterozygosity or genotype missing rates, putative sex chromosome aneuploidy, and discordant reported sex versus genotypic sex were excluded. Only individuals (n = 200,643) who had both genotyping and whole-exome sequencing (WES) data were considered. If the genetic relationship between individuals was closer than the second degree, defined as a kinship coefficient > 0.0884 as computed by the UK Biobank, we removed one individual from each related pair, retaining the case where one was present.
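The case definition above can be expressed as a small pattern match. The following is only an illustrative sketch, not the authors' pipeline: the function name `is_crc_case` and the dotted code format (for example `C18.7`) are assumptions, and codes stored without dots would need normalising first.

```python
import re

# Code patterns transcribed from the case definition in the text.
# Assumption: codes are written with a dot, e.g. "C18.7" or "154.1".
ICD10_CRC = re.compile(r"^(C18|C20)(\.\w+)?$|^D01\.[012]$|^D37\.[45]$")
ICD9_CRC = re.compile(r"^153(\.\w+)?$|^154\.[01]$")
SELF_REPORT_CRC = {1022, 1023}  # self-reported cancer codes


def is_crc_case(icd10=(), icd9=(), self_reported=()):
    """Return True if any record matches the CRC case definition (hypothetical helper)."""
    return (
        any(ICD10_CRC.match(code) for code in icd10)
        or any(ICD9_CRC.match(code) for code in icd9)
        or any(code in SELF_REPORT_CRC for code in self_reported)
    )
```

A participant with only `C19` or `154.2` on record would not count as a case under this reading, since neither code appears in the definition.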

Variant selection

We used ANNOVAR [29] to annotate the VCF files from the 200,643 WES samples. The Genome Aggregation Database (gnomAD) [30] was used to retrieve variant frequencies from the general population. We focused on rare PV for hereditary CRC (Lynch syndrome, polyposis) and followed the variant filtering approach of a recent study aimed at selecting rare PV [31]. The following inclusion criteria were used: (1) only APC, MUTYH, MLH1, MSH2, MSH6, and PMS2 variants in protein-coding regions were included, since PV in other genes associated with hereditary CRC are too rare or even absent in the study population; (2) allele frequency (AF) < 0.005 in at least one ethnic subpopulation of gnomAD; (3) not annotated as “synonymous,” “non-frameshift deletion,” or “non-frameshift insertion”; (4) annotated as “pathogenic” or “likely pathogenic” based on ClinVar [32]. We did not include MUTYH in the pooled analysis since no biallelic (i.e., high-penetrance) case was identified in the cohort; however, we included the heterozygous (monoallelic) carriers in the single-gene analysis to compare the effect size with the other genes.
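The four inclusion criteria can be read as a single boolean filter. The sketch below is a hedged illustration: the `Variant` record, its field names, and the annotation strings are hypothetical and do not reflect the actual ANNOVAR output columns. Treating a variant that is absent from gnomAD as rare is also an assumption, since the text does not say how such variants were handled.

```python
from dataclasses import dataclass

GENES = {"APC", "MUTYH", "MLH1", "MSH2", "MSH6", "PMS2"}
EXCLUDED_EFFECTS = {"synonymous", "non-frameshift deletion", "non-frameshift insertion"}
PATHOGENIC = {"pathogenic", "likely pathogenic"}


@dataclass
class Variant:
    """Hypothetical annotated variant; field names are illustrative only."""
    gene: str
    coding: bool
    effect: str
    clinvar: str
    gnomad_af: dict  # subpopulation label -> allele frequency


def passes_filter(v, af_cutoff=0.005):
    """Apply criteria (1) to (4) from the text to one variant."""
    # Assumption: a variant missing from gnomAD is treated as rare.
    rare = not v.gnomad_af or any(af < af_cutoff for af in v.gnomad_af.values())
    return (
        v.gene in GENES and v.coding                                     # (1)
        and rare                                                         # (2)
        and v.effect.lower() not in EXCLUDED_EFFECTS                     # (3)
        and v.clinvar.lower().replace("_", " ") in PATHOGENIC            # (4)
    )
```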

Polygenic risk scores (PRS)

We applied a previously validated PRS for CRC comprising 95 variants [18]. The PRS was computed from UKBB genotype data using the scoring function of PLINK 2.0 [33]. To reduce the variance of the PRS distributions across genetic ancestries, we followed a previously described approach [34]: we used the first four ancestry principal components (PCs) to fit a linear regression model predicting the PRS across the full dataset (pPRS ~ PC1 + PC2 + PC3 + PC4). Adjusted PRS (aPRS) were calculated by subtracting the predicted PRS (pPRS) from the raw PRS and were used for the subsequent analysis.
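The ancestry adjustment described above amounts to taking the residuals of an ordinary least squares fit of the raw PRS on the PCs. The following dependency-free sketch illustrates the idea on simulated data; the function names and the simulated values are assumptions for illustration, and the study itself used R.

```python
import random


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def adjust_prs(raw_prs, pcs):
    """Return aPRS = raw PRS minus the PRS predicted from the PCs (OLS residuals)."""
    X = [[1.0] + list(row) for row in pcs]  # intercept followed by the PCs
    k = len(X[0])
    xtx = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(X, raw_prs)) for i in range(k)]
    beta = solve(xtx, xty)
    predicted = [sum(b * v for b, v in zip(beta, r)) for r in X]
    return [y - p for y, p in zip(raw_prs, predicted)]


# Simulated example (values are made up): 50 individuals, 4 PCs.
rng = random.Random(0)
pcs = [[rng.gauss(0, 1) for _ in range(4)] for _ in range(50)]
raw = [1.0 + 0.5 * p[0] - 0.2 * p[2] + rng.gauss(0, 0.1) for p in pcs]
adjusted = adjust_prs(raw, pcs)
```

By construction the adjusted scores have mean zero and are uncorrelated with each of the four PCs, which is the property the adjustment relies on.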

In addition, we calculated the PRS using 140 SNPs [18] and another PRS based on 50 SNPs that were replicated in the meta-analyzed GWAS after excluding UKBB samples [35]. Thus, in total three PRS models were computed: (1) 95 SNPs (95 PRS); (2) 140 SNPs (140 PRS); (3) 50 SNPs (50 PRS).

Statistical analysis

Individuals were divided into groups depending on (1) carrier status of PV, (2) PRS, and (3) FH. For FH, we considered participants’ reports of CRC in their parents and siblings (data fields 20110, 20107, and 20111). For PRS, individuals were assigned to three groups: low (< 20% PRS), intermediate (20–80% PRS), and high (> 80% PRS), where the definition of a high PRS (above the 80th percentile) corresponds to an OR ≥ 2.
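As an illustration of the PRS grouping, the sketch below assigns each individual to a stratum by percentile rank within the cohort. The function name and the mid-rank percentile convention are assumptions; the text does not specify how ties or boundary values were handled.

```python
def prs_strata(scores, low=0.20, high=0.80):
    """Label each score 'low', 'intermediate', or 'high' by cohort percentile."""
    n = len(scores)
    order = sorted(range(n), key=lambda i: scores[i])  # stable: ties keep input order
    labels = [None] * n
    for rank, i in enumerate(order):
        pct = (rank + 0.5) / n  # mid-rank percentile (an assumed convention)
        if pct < low:
            labels[i] = "low"
        elif pct > high:
            labels[i] = "high"
        else:
            labels[i] = "intermediate"
    return labels


# Ten made-up scores: the bottom two are 'low', the top two are 'high'.
example_labels = prs_strata([0.3, -1.2, 0.8, 2.1, -0.4, 0.0, 1.5, -2.0, 0.6, 1.1])
```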

We conducted both an analysis specific to single genes and a combined analysis (i.e., carriers of PV in APC, MLH1, MSH2, MSH6 and PMS2). First, we estimated the OR for each carrier group based on a logistic regression adjusting for age at recruitment, sex, CRC screening status, and the first four ancestry PCs. Afterwards, we additionally incorporated interactions between PV carriers and FH with PRS by introducing an interaction term within the logistic regression model.

We calculated the lifetime risk by age 75 from carrier status of rare PV and the PRS and hazard ratios (HRs) based on a Cox proportional hazards model. Individual’s age served as the time scale, representing the time to event, for observed cases (age at diagnosis), and censored controls (age at last visit); age 0 was used as index time. Carrier status, PRS category, FH, age, sex, CRC screening status, and the first four ancestry PCs were incorporated in the model, and adjusted survival curves were produced. The information about FH and CRC screening is based on interviews at the time of study recruitment. However, information about the timing or result of CRC screening was not available. We therefore included this information as a binary covariate to account for both effects.
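Under the proportional hazards assumption, a hazard ratio scales the cumulative hazard, so the survival function of a group is the baseline survival raised to the power of the HR. The sketch below shows only this relationship; it is not the adjusted-curve procedure used in the study, it ignores competing risks, and the numbers in the example are made up.

```python
def cumulative_incidence(baseline_incidence, hazard_ratio):
    """Cumulative incidence by a fixed age for a group with the given HR.

    Uses S(t | x) = S0(t) ** HR, which holds under proportional hazards.
    """
    baseline_survival = 1.0 - baseline_incidence
    return 1.0 - baseline_survival ** hazard_ratio


# Hypothetical example: 10% baseline incidence and a hazard ratio of 2.
doubled_hazard = cumulative_incidence(0.10, 2.0)
```

Because survival cannot drop below zero, doubling the hazard less than doubles the cumulative incidence, and the gap widens as the baseline incidence grows.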

Model performance was assessed via the area under the receiver operating characteristic curve (AUC), Nagelkerke's Pseudo-R2, and the C-index for time-to-event data. R 3.6.3 with the corresponding add-on packages survival and survminer was used for all statistical analyses.
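The AUC can be read as the probability that a randomly chosen case receives a higher predicted risk than a randomly chosen control. A minimal sketch of that pairwise definition follows; it is quadratic in the sample size and meant for intuition only, and the function name is an assumption.

```python
def auc(case_scores, control_scores):
    """Fraction of case-control pairs in which the case scores higher (ties count 0.5)."""
    wins = 0.0
    for c in case_scores:
        for k in control_scores:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


# Made-up predicted risks for two cases and two controls.
toy_auc = auc([0.6, 0.4], [0.5, 0.3])
```

An AUC of 0.5 corresponds to a model that ranks cases and controls no better than chance.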


Stratification of UKBB individuals for CRC prevalence, FH, and PV carrier status

We identified 1,902 CRC cases (894 prevalent cases and 1,008 incident cases) among the 163,516 UKBB individuals that remained after applying the exclusion criteria, with a mean age at diagnosis of 60.9 years. The remaining 161,614 individuals with no previous diagnosis of any cancer were considered as controls, with a mean age of 56.9 years at last visit (Table 1). The European population represents 92% of the analyzed cohort.

Table 1 Characteristics of the 163,516 UK Biobank participants by colorectal cancer (CRC) status

The fraction of individuals with a positive FH of CRC is significantly higher in cases (19%) compared to controls (11%) (OR = 1.95 [1.73–2.19], P < 0.01) and ranges between 9 and 23% in the subgroups (Table 2). There is a significantly higher proportion of individuals with a FH of CRC not only among carriers of PV in the selected cancer susceptibility genes (OR = 1.96 [1.72–2.20], P < 0.01), but also among non-carriers with high PRS (OR = 1.60 [1.31–1.94], P < 0.01).

Table 2 Characteristics of the UK Biobank participants by carrier status and polygenic risk score (PRS) strata

In the analyzed CRC susceptibility genes APC, MLH1, MSH2, MSH6, PMS2, we identified 399 heterozygous carriers of 111 PV. They were present in 30 (1.57%) cases and 369 (0.23%) controls, which is in line with published data. A list of the considered variants and annotations is shown in Additional file 1: Table S1, a summary of the number of PV carriers per gene is provided in Additional file 2: Table S2. No individual with a homozygous PV was identified. In other known genes associated with hereditary CRC (BMPR1A, POLE, POLD1, RNF43, SMAD4, STK11), the number of (L)P variant carriers was extremely low or no variant carrier was present at all, so that these genes were not considered in the analysis.
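For orientation, a crude (unadjusted) odds ratio can be computed directly from the carrier counts given above. This is a hedged back-of-the-envelope check and not one of the covariate-adjusted estimates reported in the study; the function name and the use of a Woolf-type confidence interval are assumptions.

```python
import math


def odds_ratio(exposed_cases, total_cases, exposed_controls, total_controls):
    """Crude odds ratio with an approximate 95% CI (Woolf method on the log scale)."""
    a, b = exposed_cases, total_cases - exposed_cases
    c, d = exposed_controls, total_controls - exposed_controls
    estimate = (a / b) / (c / d)
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(estimate) - 1.96 * se)
    upper = math.exp(math.log(estimate) + 1.96 * se)
    return estimate, lower, upper


# Counts from the text: 30 carriers among 1,902 cases, 369 among 161,614 controls.
crude_or, ci_low, ci_high = odds_ratio(30, 1902, 369, 161614)
```

With these counts the crude odds ratio for carrying a PV comes out at about 7, before any adjustment for age, sex, screening status, or ancestry.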

PRS distribution within the UKBB cohort

CRC PRS follow a normal distribution, both for raw and PC-adjusted PRS (Additional file 2: Fig. S1). Regardless of which PRS model is used (95 PRS, 140 PRS, or 50 PRS), the PRS is significantly higher in cases compared to controls (Additional file 2: Fig. S2). The OR for the 50 PRS (1.74 [1.57–1.92]) is slightly lower than that of the 95 PRS (1.98 [1.79–2.19]) or the 140 PRS (1.92 [1.74–2.12]); this difference might be due to overfitting.

Since we included only individuals with both genotyping and WES data, we investigated the distribution of the PRS and age in the whole cohort and compared it to the subcohort with WES data. Density plots show that the distribution of PRS and age was similar between both groups (Additional file 2: Fig. S3).

The prevalence of CRC according to PRS percentiles demonstrates that values in the extreme right tail of the PRS distribution are associated with a non-linear increase of CRC risk, whereas in the left tail a less evident non-linear decrease can be observed (Additional file 2: Fig. S4). This supports the hypothesis of using PRS to stratify individuals into risk classes (i.e., low, intermediate, and high risk) according to a liability threshold model.

Interplay between PV and PRS

There was no overlap between the selected rare high penetrance PV and the common SNPs used for PRS calculation, and thus, the PRS represents an additional genetic signal. Notably, the PRS distributions showed that the mean of PRS is significantly higher in affected carriers compared to unaffected carriers (P < 0.01) (Additional file 2: Fig. S5).

We assessed how CRC risk is influenced by PRS and carrier status for PV in high-penetrance CRC susceptibility genes (APC, MLH1, MSH2, MSH6, PMS2) by calculating the ORs for CRC across groups, with non-carriers with intermediate PRS as the reference group. Non-carriers with a low or high PRS are estimated to have a 0.5-fold or 2.1-fold change in the odds for CRC, respectively. The PRS also alters the penetrance of PV in susceptibility genes considerably: PV carriers with high PRS had a more than fourfold higher OR than carriers with low PRS (OR = 17.5 and 3.9, respectively; Fig. 1A; corresponding HR in Additional file 2: Table S3). We did not observe a significant interaction between PV carrier status and PRS (p = 0.87). In addition, we performed a sensitivity analysis including only the incident cases (n = 1,008). We observed the same trend, namely that the PRS provides an OR risk gradient both in the general population and among carriers of pathogenic variants in CRC susceptibility genes (Additional file 2: Table S4).
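The absence of a significant interaction means that, on the odds scale, carrier status and PRS act approximately multiplicatively. An informal way to see this is to compare the carrier to non-carrier OR ratio within each PRS stratum, using the rounded ORs quoted above. This is only a plausibility check on published point estimates, not the formal interaction test; the dictionary and function names are illustrative.

```python
# Rounded ORs quoted in the text (reference: non-carriers with intermediate PRS).
reported_or = {
    ("non-carrier", "low"): 0.5,
    ("non-carrier", "high"): 2.1,
    ("carrier", "low"): 3.9,
    ("carrier", "high"): 17.5,
}


def carrier_ratio(stratum):
    """OR of carriers relative to non-carriers within one PRS stratum."""
    return reported_or[("carrier", stratum)] / reported_or[("non-carrier", stratum)]


# If the two factors combine multiplicatively, this ratio of ratios is close to 1.
ratio_of_ratios = carrier_ratio("high") / carrier_ratio("low")
```

The carrier effect is roughly eightfold in both the low and the high PRS stratum, which is what a model without an interaction term would predict.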

Fig. 1

Colorectal cancer odds ratio and cumulative incidence stratified by carrier and family history status. Individuals were stratified by PV carrier status (A + B) and family history (first-degree relative with CRC) (C + D) into three strata based on their polygenic risk score (PRS): low (< 20th percentile), intermediate (20th–80th percentile), or high (> 80th percentile) PRS. The odds ratio (OR) was calculated from a logistic regression model with age, sex, CRC screening status, and the first four principal components of ancestry as covariates. The reference group was non-carriers with intermediate PRS (A), and individuals with no family history and intermediate PRS (C). The adjusted OR is indicated by the colored boxes. The numbers next to the ORs indicate the sample size of the corresponding group. The 95% confidence intervals are indicated by the vertical lines around the boxes. Cumulative incidence was estimated from a Cox proportional hazards model using age, sex, family history, CRC screening status, and the first four ancestry principal components as covariates.

The high PRS, which is by definition present in 20% of the non-carriers, is associated with an almost doubled CRC risk (Fig. 1A, Table 2). Since the vast majority (97.9%) of non-carriers are controls (= healthy), almost the same percentage results if only healthy non-carriers are considered. We performed the same analysis using the 140 PRS and 50 PRS. All the three PRS models had comparable performance in the UKBB cohort (Additional file 2: Fig. S6).

Similarly, the lifetime cancer risk analysis shows a combined impact of PV and PRS: Among carriers, the estimated cumulative incidence by age 75 increased from 40% in case of a low PRS to 74% in case of a high PRS compared to 6% to 22% for non-carriers (Fig. 1B, Additional file 2: Table S3).

Inclusion of family history on cancer risk stratification

Taking individuals with no FH and intermediate PRS as a reference, both FH and PRS are associated with a higher CRC risk (Fig. 1C, Additional file 2: Table S5). The CRC risk for individuals with low PRS and no FH (OR 0.6) is five times lower than for individuals with both positive FH and high PRS (OR 3.1). We did not observe a significant interaction between FH status and PRS (p = 0.12). Notably, individuals without FH and high PRS and individuals with FH and intermediate PRS have similar CRC risks, with an OR of around 2, whereas the CRC risk of individuals with low PRS is decreased compared to the reference group even in the context of a FH.

Among individuals with FH, the cumulative CRC incidence by age 75 increases threefold from 8% in case of a low PRS to 26% in case of a high PRS (Fig. 1D). Notably, the cumulative CRC incidence of individuals with a positive FH and an intermediate PRS (16%) is lower than that of individuals with a negative FH in the high PRS category (21%).

The full model integrating PRS, FH, and PV status shows that the CRC risk is strongly influenced by PRS in all groups (Fig. 2, Additional file 2: Table S6). Taking non-carriers with no FH and intermediate PRS as the reference group, the CRC OR is 0.6 for non-carriers with no FH and low PRS, whereas it is estimated to be more than 60 times higher (OR 40) for carriers with FH and high PRS (Fig. 2A). The corresponding cumulative CRC incidences are 6% and 98%, respectively (Fig. 2B). Although all PV carriers showed a significantly increased CRC risk, both the PRS and FH modify these risks considerably: depending on the FH and PRS, the OR in PV carriers varies between 4 and 40 and the cumulative incidence between 35 and 98%. Although CRC screening status is a key predictor for CRC risk (Additional file 2: Fig. S7), the main findings of the analysis were maintained irrespective of the screening status.

Fig. 2

Interplay of pathogenic variant carrier status, family history, and polygenic risk score. A Colorectal cancer (CRC) odds ratios (ORs) were estimated from logistic models adjusted for age, sex, CRC screening status, and the first four ancestry principal components. Non-carriers with intermediate PRS and no family history served as the reference group. B Cumulative incidence was estimated from a Cox proportional hazards model using age, sex, family history, CRC screening status, and the first four ancestry principal components as covariates.

PRS improved model discrimination over carrier status and FH of CRC in first-degree relatives. The AUC derived from PRS (0.688) was higher than those derived using FH (0.654) and carrier status (0.646). The full model including PRS, carrier status, and FH reached an AUC of 0.704, an absolute improvement of 1.6, 5.0, and 5.8 percentage points over the models based on PRS, FH, and carrier status alone, respectively, and was also better than any combination of two factors (Table 3, Additional file 2: Fig. S8a). We also performed an analysis in which age and sex were excluded. The AUCs demonstrate that the PRS still has a high discriminative power for CRC risk prediction (Additional file 2: Fig. S8b).

Table 3 Model discrimination assessed for combinations of polygenic risk score, family history of CRC and carrier

The impact of polygenic risk in single gene mutation carriers

The gene-specific analysis revealed a strong variability in the risk conferred by rare heterozygous PV in the different genes. The largest effect sizes are attributable to MLH1 and APC, those for MSH2 and MSH6 are somewhat smaller, and the effect size for PMS2 is considerably lower (Fig. 3). When heterozygous MUTYH variants are included in this analysis, the risks are very similar to the PMS2-related risks. Both the PMS2 and heterozygous MUTYH risks show a broad overlap with the non-carrier risks, while there is no overlap between the risks of non-carriers and those with PVs in MLH1, MSH2, MSH6, and APC.

Fig. 3

Interplay of pathogenic variant carrier status, family history, and polygenic risk score in single genes. Odds ratios (ORs) for colorectal cancer (CRC) were estimated from logistic models adjusted for age, sex, CRC screening status, and first four ancestry principal components. Non-carriers with intermediate PRS and no family history served as the reference group

We estimated how PRS and FH influence CRC prevalence among PV carriers in each of the five susceptibility genes (Additional file 2: Table S7). Despite the different effect sizes, the PRS and FH modify the relative risk across all genes; however, the effect of PRS and FH is inversely related to the penetrance of the gene, with the smallest effects in MLH1 PV carriers.

As in the overall analysis, a positive FH, a PV in a cancer risk gene, and a high PRS are each associated with an increased CRC risk in the gene-specific analysis. As such, an individual with a low-penetrance PMS2 PV, but high PRS and/or positive FH ends up with an estimated CRC risk similar to an MSH6 PV carrier without FH and/or low PRS (Additional file 2: Fig. S9, Table S7).


Recent studies demonstrated that the polygenic background, defined as PRS based on disease-associated SNPs, considerably modifies the risks for several cancers in the general population, including CRC, both in terms of age at onset and cumulative lifetime risks [12, 23, 27, 36,37,38]. In line with this, the risk alleles of those SNPs are also found to accumulate in unexplained familial and early-onset CRC cases [25, 39]. Whereas a low polygenic burden decreases the CRC risk to about one quarter on average, a high PRS (> 80%) doubles the risk and a very high PRS (99%) almost quadruples it, so that these individuals reach a CRC risk of an order of magnitude almost comparable to that of carriers of hereditary CRC with a low PRS [31]. In a previous study, Jia et al. found that the risk of CRC is significantly associated with the PRS: compared with individuals in the lowest PRS quintile, those in the highest quintile had a greater than threefold risk (during a 5.8-year follow-up period). Hazard ratios estimated with the middle quintile as the reference ranged between 0.56 and 1.71, with a threefold risk for those in the top 1% of the PRS and a 70% reduced CRC risk for individuals in the bottom 1% of the PRS [38].

To extend these studies on how CRC prevalence is influenced by genetic susceptibility, we used the substantially larger, more robust dataset of the most recent UKBB cohort, incorporated family history (FH) as an additional factor for risk stratification, and included a single-gene analysis. We considered both the genetic component driven by rare high-penetrance PV associated with hereditary CRC and common low-penetrance variants captured by the PRS.

Firstly, our results confirm that the polygenic background strongly modulates CRC risk in the general population. Compared to the average polygenic burden, individuals with a low (< 20%) or high (> 80%) PRS are estimated to have a 0.5-fold or 2.1-fold change in the odds for CRC, respectively. The additional time-to-event analysis revealed a corresponding cumulative lifetime risk of 6% and 22% by age 75. Hence, when the PRS is included in risk calculation, around 20% of healthy individuals of the general population with no FH of CRC have a doubled CRC risk, which is similar to those with a first degree relative affected by CRC [40]. These so far unknown and otherwise unrecognisable at-risk individuals might need surveillance 10–15 years earlier than usually recommended [41]. On the other hand, the around 20% of individuals with low PRS and no FH might need less surveillance than the general population due to a considerably lowered risk, while even those with low PRS and positive FH might not need a more intense surveillance than the general population.

A concern in evaluating CRC PRS using 95 or 140 SNPs [18] in UKBB studies is that the calculation is based on summary statistics derived from a GWAS meta-analysis that included findings from the UKBB. Previous studies have also used 95 or 140 SNPs, but whether this results in overfitting of the models is uncertain. A recent study [35] addressed this issue using stringent inclusion criteria, including only 50 SNPs that reached genome-wide significance (p < 5 × 10⁻⁸) in the meta-analysis after excluding UKBB samples. The effect sizes from the meta-analysis of these 50 SNPs were then used to construct the 50 PRS. The slightly lower OR of the 50 PRS in the present study compared to the 95 and 140 PRS might be due to overfitting; however, by comparing the PRS calculations, we could show that all three PRS models had a comparable performance in the UKBB cohort (Additional file 2: Figs. S2 and S6).

It is well known that among patients with hereditary CRC syndromes, the age of onset and cumulative CRC incidence is very heterogeneous, even within PV carriers of the same family. The estimated gene-specific, individual CRC lifetime risks of LS patients with MLH1 or MSH2 PV can be lower than 10% but as high as 90–100% in a considerable fraction. In the past, the analysis of modifying effects based on common CRC-associated variants in LS and other high-risk groups has been restricted to selected cohorts and small subsets of SNPs [42, 43]. A recent study demonstrated that the polygenic background also substantially influences the CRC risk in LS using UKBB data, even though the ORs for CRC risks could only be predicted due to the small sample sizes [31]. In the present work, ORs could be calculated directly from the model since over three times more UKBB individuals have been included with six times more CRC cases, and five times more PV carriers.

Secondly, we were able to show that the PRS considerably modifies the CRC risk not only in the general population, but also in carriers of an MMR gene PV identified in the general population. For the first time, we demonstrated that this is also true for APC PV. Depending on the PRS, the cumulative CRC lifetime incidence in PV carriers ranged between 40 and 74%, and thus, the PRS is able to explain parts of the interindividual variation in CRC risk among PV carriers.

However, the single-gene analysis revealed heterogeneous effects across genes, and therefore the modifying role of the polygenic background should be framed within the absolute risk attributable to individual genes. As expected, the effect of the PRS seems to be particularly relevant in less penetrant CRC risk genes such as PMS2, where the OR ranges between 0.94 and 5.43 (Additional file 2: Table S6). This is in line with findings in moderate breast cancer risk genes such as CHEK2, PALB2, and ATM [44,45,46] and suggests that including the PRS in risk stratification may be particularly relevant for preventing excessive surveillance measures in PV carriers of those genes.

In addition, our results provide evidence that the inclusion of FH can further and independently improve the risk stratification in both carriers and non-carriers. Including PRS and FH in risk assessment, the cumulative CRC lifetime incidence ranged between 8 and 26%, and in PV carriers between 30 and 98%, and thus, outperformed the consideration of a single risk factor. This suggests that familial clustering points to additional risk factors besides those captured by common low-risk SNPs (PRS) and rare PV [47, 48]. These might be common and rare structural genetic alterations including copy number variants, rare non-coding variants, or other intermediate and low-impact risk variants not included routinely in PRS models, and non-genetic contributors such as environmental/lifestyle factors.

Only few PRS studies considered the FH. In line with our results, Jenkins et al. found no correlation between SNP-based and FH-based risks and an improved risk stratification when both PRS and FH are considered [47]. In the analyses by Jia et al., the AUC derived from PRS (0.609) was substantially higher compared to the one derived using FH (0.523). Adding PRS and FH of cancer in first-degree relatives improved the model’s discriminatory performance (AUC 0.613) [17, 49]. Our AUC calculations point in the same direction with a higher AUC (0.704) when all three risk factors (PRS, FH, carrier status) are considered.

Interestingly and in apparent contrast to our results and those of others, a study using 826 European-descent carriers of PV in the DNA MMR genes MLH1, MSH2, MSH6, PMS2, and EPCAM (i.e. LS carriers) from the Colon Cancer Family Registry (CCFR) did not find evidence of an association between the PRS and CRC risk, irrespective of sex or mutated gene, although an almost identical set of SNPs was used for PRS calculations [50]. A reason which might partly explain different risk estimates between studies using individuals from a population-based repository such as the UKBB and those using curated clinical data registries, where patients/families with suspected hereditary disease are included (e.g. the CCFR), is a potentially different risk composition across cohorts recruited in different ways (recruitment bias). That way, a familial clustering of CRC might reflect the existence of several genetic and non-genetic risk factors as outlined above, which are not captured by the PRS and which may superimpose the polygenic impact.

In particular, the composition of cases and controls differs between the Jenkins et al. study on the one hand and the Fahed et al. and present study on the other hand. In the Jenkins et al. study, both cases (i.e., PV carriers with CRC) and controls (healthy PV carriers) were evidently derived from the same LS families, while the UKBB controls are PV carriers not apparently related to the PV cases. This is also reflected by the different ratio between cases and controls (7.5% CRC cases among PV carriers in the present study, but 61% in the Jenkins et al. study). Hence, the controls in the Jenkins et al. study are relatives of the cases, and it is therefore likely that they share, to a certain extent, parts of the polygenic background and other risk factors of their affected relatives (cases), which may explain the observed absence of a PRS effect. The comparison between population-based and registry-based predictions indicates that the study design and recruitment strategy may strongly influence the results and conclusions. Consequently, the application of PRS in clinical practice should consider the familial background and ascertainment of the patient.

Our data analyses provide evidence that the PRS acts as a relevant risk modifier for CRC among both the general population and population-based PV carriers in genes causing hereditary CRC. Our findings and those of others qualify the PRS as an important component of risk stratification and of the resulting risk-adapted surveillance strategies in terms of age of onset and frequency. Given the risk distribution across PRS groups, the PRS can define a considerable proportion of the general population at a CRC risk level which is considered sufficient for a more or a less intensive surveillance. Importantly, the non-carriers with high PRS are a much larger target group compared to PV carriers and thus might generate an even higher preventive effect from a healthcare perspective. A small group of non-carriers with positive FH and high PRS even has CRC risks almost in the same order of magnitude as LS carriers without additional risk factors and thus may need similarly intensive surveillance measures.

According to these findings, both the general population and at-risk individuals carrying PV should potentially benefit from the inclusion of PRS in healthcare prevention policies, as risk-stratified surveillance improves early disease detection and prevention. A recent study demonstrated that individuals with a higher genetic risk benefited more substantially from preventive measures than those with a lower risk: CRC screening was associated with a significantly reduced CRC incidence and more than 30% reduced mortality among individuals with a high PRS [51, 52]. Preliminary calculations indicate that polygenic-risk-stratified CRC screening could become cost-effective under certain conditions, including an AUC value above 0.65, which was reached in our analyses [53].

Based on the strikingly different penetrance between individual hereditary CRC genes, very recent guidelines have started to recommend a more gene-specific surveillance intensity in LS and polyposis [54, 55]. Given the strong modifying effect, the inclusion of additional risk factors will result in a more appropriate, clinically relevant risk stratification. Our results demonstrate that a combined risk assessment including FH and PRS will likely improve precise risk estimations and tailored preventive measures not only in the general population, but also in patients with hereditary disease.

Our study has some limitations. Firstly, there is evidence of a “healthy volunteers” selection bias of the UKBB population (UKBB participants tend to be healthier than the general population), and thus the results might not be completely generalizable in terms of effect sizes [56]. Secondly, we cannot exclude that a few carriers of APC PV who were classified as controls are affected by a polyposis but have not been recognized as such, or did not develop CRC due to intensive surveillance and/or prophylactic surgery, so that the calculated CRC risk of APC PV might be slightly underestimated. As in other similar studies, the presence of colorectal polyps could not be considered due to the lack of appropriate data. Thirdly, to increase the power of the analysis, our risk assessment was based solely on genetic variants and FH and did not include other risk factors. Previous studies on UKBB cohorts showed that modifiable lifestyle risk factors play a pivotal role in cancer prevalence, and a shared lifestyle within families could influence the FH of the disease [49, 57]. That might explain the partly independent association of the FH and the genetic risk. Finally, although we performed the analysis on the whole UKBB cohort, we could not test the generalizability of the risk stratification across different populations due to the limited sample size. The PRS could be biased towards the European population, as it was constructed based on European reference GWAS. Thus, this PRS might be a worse predictor in non-European or admixed individuals, as previously discussed in different studies [58].


In conclusion, we show the important role of PRS and FH on CRC risk in both the general population and population-based carriers of a monogenic predisposition for CRC. The combined effect of common variants can strongly alter the age-related penetrance and lifetime risk of CRC. Thus, the PRS represents an additional, independent stratification level for cancer risk besides the FH and lifestyle factors and likely increases the accuracy of risk estimation. Consequently, the PRS can define a relevant proportion of the general population as a risk group which should be considered for more intense surveillance measures, and in addition points to a striking risk variability even among carriers of hereditary CRC, which requires more personalized, risk-adapted surveillance strategies. As expected, the modifying effect of the PRS seems to be relevant in particular for moderately penetrant CRC risk genes. When important modifiers such as polygenic background, FH, and non-genetic factors are included in risk assessment, the dichotomous risk division between sporadic and hereditary CRC will be partly replaced by a more continuous risk distribution.

Availability of data and materials

Genome-wide genotyping data, exome-sequencing data, and phenotypic data from the UK Biobank are available upon successful project application. Restrictions apply to the availability of these data, which were used under license for the current study (Project ID: 52446). Summary statistics are available from the Polygenic Score Catalog.


  1. Carr PR, Weigl K, Edelmann D, Jansen L, Chang-Claude J, Brenner H, et al. Estimation of absolute risk of colorectal cancer based on healthy lifestyle, genetic risk, and colonoscopy status in a population-based study. Gastroenterology. 2020;159:129–38.

  2. Lichtenstein P, Holm NV, Verkasalo PK, Iliadou A, Kaprio J, Koskenvuo M, et al. Environmental and heritable factors in the causation of cancer–analyses of cohorts of twins from Sweden, Denmark, and Finland. N Engl J Med. 2000;343:78–85.

  3. Czene K, Lichtenstein P, Hemminki K. Environmental and heritable causes of cancer among 9.6 million individuals in the Swedish Family-Cancer Database. Int J Cancer. 2002;99:260–6.

  4. Win AK, Jenkins MA, Dowty JG, Antoniou AC, Lee A, Giles GG, et al. Prevalence and penetrance of major genes and polygenes for colorectal cancer. Cancer Epidemiol Biomarkers Prev. 2017;26:404–12.

  5. Biller LH, Syngal S, Yurgelun MB. Recent advances in Lynch syndrome. Fam Cancer. 2019;18:211–9.

  6. Grzymski JJ, Elhanan G, Morales Rosado JA, Smith E, Schlauch KA, Read R, et al. Population genetic screening efficiently identifies carriers of autosomal dominant diseases. Nat Med. 2020;26:1235–9.

  7. Talseth-Palmer BA. The genetic basis of colonic adenomatous polyposis syndromes. Hered Cancer Clin Pract. 2017;15:5.

  8. Kanth P, Grimmett J, Champine M, Burt R, Samadder NJ. Hereditary colorectal polyposis and cancer syndromes: a primer on diagnosis and management. Am J Gastroenterol. 2017;112:1509–25.

  9. Vogt S, Jones N, Christian D, Engel C, Nielsen M, Kaufmann A, et al. Expanded extracolonic tumor spectrum in MUTYH-associated polyposis. Gastroenterology. 2009;137:1976–85.

  10. Win AK, Dowty JG, Cleary SP, Kim H, Buchanan DD, Young JP, et al. Risk of colorectal cancer for carriers of mutations in MUTYH, with and without a family history of cancer. Gastroenterology. 2014;146:1208–11.

  11. Stoffel EM, Koeppe E, Everett J, Ulintz P, Kiel M, Osborne J, et al. Germline genetic features of young individuals with colorectal cancer. Gastroenterology. 2018;154:897–905.

  12. Chubb D, Broderick P, Dobbins SE, Frampton M, Kinnersley B, Penegar S, et al. Rare disruptive mutations and their contribution to the heritable risk of colorectal cancer. Nat Commun. 2016;7:11883.

  13. Schubert SA, Morreau H, de Miranda NFCC, van Wezel T. The missing heritability of familial colorectal cancer. Mutagenesis. 2020;35:221–31.

  14. Yurgelun MB, Kulke MH, Fuchs CS, Allen BA, Uno H, Hornick JL, et al. Cancer susceptibility gene mutations in individuals with colorectal cancer. J Clin Oncol. 2017;35:1086–95.

  15. Brenner H, Hoffmeister M, Haug U. Family history and age at initiation of colorectal cancer screening. Am J Gastroenterol. 2008;103:2326–31.

  16. Butterworth AS, Higgins JP, Pharoah P. Relative and absolute risk of colorectal cancer for individuals with a family history: a meta-analysis. Eur J Cancer. 2006;42:216–27.

  17. McGeoch L, Saunders CL, Griffin SJ, Emery JD, Walter FM, Thompson DJ, et al. Risk prediction models for colorectal cancer incorporating common genetic variants: a systematic review. Cancer Epidemiol Biomarkers Prev. 2019;28:1580–93.

  18. Huyghe JR, Bien SA, Harrison TA, Kang HM, Chen S, Schmit SL, et al. Discovery of common and rare genetic risk variants for colorectal cancer. Nat Genet. 2019;51:76–87.

  19. Schmit SL, Edlund CK, Schumacher FR, Gong J, Harrison TA, Huyghe JR, et al. Novel common genetic susceptibility loci for colorectal cancer. J Natl Cancer Inst. 2019;111:146–57.

  20. Lu Y, Kweon SS, Tanikawa C, Jia WH, Xiang YB, Cai Q, et al. Large-scale genome-wide association study of East Asians identifies loci associated with risk for colorectal cancer. Gastroenterology. 2019;156:1455–66.

  21. Law PJ, Timofeeva M, Fernandez-Rozadilla C, Broderick P, Studd J, Fernandez-Tajes J, et al. Association analyses identify 31 new risk loci for colorectal cancer susceptibility. Nat Commun. 2019;10:2154.

  22. Thomas M, Sakoda LC, Hoffmeister M, Rosenthal EA, Lee JK, van Duijnhoven FJB, et al. Genome-wide modeling of polygenic risk score in colorectal cancer risk. Am J Hum Genet. 2020;107:432–44.

  23. Hsu L, Jeon J, Brenner H, Gruber SB, Schoen RE, Berndt SI, et al. A model to determine colorectal cancer risk using common genetic susceptibility loci. Gastroenterology. 2015;148:1330–9.

  24. Frampton MJ, Law P, Litchfield K, Morris EJ, Kerr D, Turnbull C, et al. Implications of polygenic risk for personalised colorectal cancer screening. Ann Oncol. 2016;27:429–34.

  25. Mur P, Bonifaci N, Díez-Villanueva A, Munté E, Alonso MH, Obón-Santacana M, et al. Non-lynch familial and early-onset colorectal cancer explained by accumulation of low-risk genetic variants. Cancers. 2021;13:3857.

  26. Frampton M, Houlston RS. Modeling the prevention of colorectal cancer from the combined impact of host and behavioral risk factors. Genet Med. 2017;19:314–21.

  27. Khera AV, Chaffin M, Aragam KG, Haas ME, Roselli C, Choi SH, et al. Genome-wide polygenic scores for common diseases identify individuals with risk equivalent to monogenic mutations. Nat Genet. 2018;50:1219–24.

  28. Bycroft C, Freeman C, Petkova D, Band G, Elliott LT, Sharp K, et al. The UK Biobank resource with deep phenotyping and genomic data. Nature. 2018;562:203–9.

  29. Yang H, Wang K. Genomic variant annotation and prioritization with ANNOVAR and wANNOVAR. Nat Protoc. 2015;10:1556–66.

  30. Karczewski KJ, Francioli LC, Tiao G, Cummings BB, Alföldi J, Wang Q, et al. The mutational constraint spectrum quantified from variation in 141,456 humans. Nature. 2020;581:434–43.

  31. Fahed AC, Wang M, Homburger JR, Patel AP, Bick AG, Neben CL, et al. Polygenic background modifies penetrance of monogenic variants for tier 1 genomic conditions. Nat Commun. 2020;11:3635.

  32. Landrum MJ, Lee JM, Riley GR, Jang W, Rubinstein WS, Church DM, et al. ClinVar: public archive of relationships among sequence variation and human phenotype. Nucleic Acids Res. 2014;42:D980-985.

  33. Chang CC, Chow CC, Tellier LC, Vattikuti S, Purcell SM, Lee JJ. Second-generation PLINK: rising to the challenge of larger and richer datasets. Gigascience. 2015;4:7.

  34. Khera AV, Chaffin M, Zekavat SM, Collins RL, Roselli C, Natarajan P, et al. Whole-genome sequencing to characterize monogenic and polygenic contributions in patients hospitalized with early-onset myocardial infarction. Circulation. 2019;139:1593–602.

  35. Briggs SEW, Law P, East JE, Wordsworth S, Dunlop M, Houlston R, et al. Integrating genome-wide polygenic risk scores and non-genetic risk to predict colorectal cancer diagnosis using UK Biobank data: population based cohort study. BMJ. 2022;379:e071707.

  36. Torkamani A, Wineinger NE, Topol EJ. The personal and clinical utility of polygenic risk scores. Nat Rev Genet. 2018;19:581–90.

  37. Jenkins MA, Makalic E, Dowty JG, Schmidt DF, Dite GS, MacInnis RJ, et al. Quantifying the utility of single nucleotide polymorphisms to guide colorectal cancer screening. Future Oncol. 2016;12:503–13.

  38. Jia G, Lu Y, Wen W, Long J, Liu Y, Tao R, et al. Evaluating the utility of polygenic risk scores in identifying high-risk individuals for eight common cancers. JNCI Cancer Spectr. 2020;4:21.

  39. Archambault AN, Su YR, Jeon J, Thomas M, Lin Y, Conti DV, et al. Cumulative burden of colorectal cancer-associated genetic variants is more strongly associated with early-onset vs late-onset cancer. Gastroenterology. 2020;158:1274–86.

  40. Fuchs CS, Giovannucci EL, Colditz GA, Hunter DJ, Speizer FE, Willett WC. A prospective study of family history and the risk of colorectal cancer. N Engl J Med. 1994;331:1669–74.

  41. Rex DK, Boland CR, Dominitz JA, Giardiello FM, Johnson DA, Kaltenbach T, et al. Colorectal cancer screening: recommendations for physicians and patients from the U.S. Multi-Society Task Force on Colorectal Cancer. Am J Gastroenterol. 2017;112:1016–30.

  42. Wijnen JT, Brohet RM, van Eijk R, Jagmohan-Changur S, Middeldorp A, Tops CM, et al. Chromosome 8q23.3 and 11q23.1 variants modify colorectal cancer risk in Lynch syndrome. Gastroenterology. 2009;136:131–7.

  43. Talseth-Palmer BA, Wijnen JT, Brenne IS, Jagmohan-Changur S, Barker D, Ashton KA, et al. Combined analysis of three Lynch syndrome cohorts confirms the modifying effects of 8q23.3 and 11q23.1 in MLH1 mutation carriers. Int J Cancer. 2013;132:1556–64.

  44. Hassanin E, May P, Aldisi R, Spier I, Forstner AJ, Nöthen MM, et al. Breast and prostate cancer risk: the interplay of polygenic risk, rare pathogenic germline variants, and family history. Genet Med. 2022;24:576–85.

  45. Mars N, Widén E, Kerminen S, Meretoja T, Pirinen M, Della Briotta Parolo P, et al. The role of polygenic risk and susceptibility genes in breast cancer over the course of life. Nat Commun. 2020;11:6383.

  46. Gao C, Polley EC, Hart SN, Huang H, Hu C, Gnanaolivu R, et al. Risk of breast cancer among carriers of pathogenic variants in breast cancer predisposition genes varies by polygenic risk score. J Clin Oncol. 2021;39:2564–73.

  47. Jenkins MA, Win AK, Dowty JG, MacInnis RJ, Makalic E, Schmidt DF, et al. Ability of known susceptibility SNPs to predict colorectal cancer risk for persons with and without a family history. Fam Cancer. 2019;18:389–97.

  48. Biller LH, Horiguchi M, Uno H, Ukaegbu C, Syngal S, Yurgelun MB. Familial burden and other clinical factors associated with various types of cancer in individuals with lynch syndrome. Gastroenterology. 2021;161:143–50.

  49. Kachuri L, Graff RE, Smith-Byrne K, Meyers TJ, Rashkin SR, Ziv E, et al. Pan-cancer analysis demonstrates that integrating polygenic risk scores with modifiable risk factors improves risk prediction. Nat Commun. 2020;11:6084.

  50. Jenkins MA, Buchanan DD, Lai J, Makalic E, Dite GS, Win AK, et al. Assessment of a polygenic risk score for colorectal cancer to predict risk of lynch syndrome colorectal cancer. JNCI Cancer Spectr. 2021;5:22.

  51. Choi J, Jia G, Wen W, Long J, Shu XO, Zheng W. Effects of screenings in reducing colorectal cancer incidence and mortality differ by polygenic risk scores. Clin Transl Gastroenterol. 2021;12:e00344.

  52. Stanesby O, Jenkins M. Comparison of the efficiency of colorectal cancer screening programs based on age and genetic risk for reduction of colorectal cancer mortality. Eur J Hum Genet. 2017;25:832–8.

  53. Naber SK, Kundu S, Kuntz KM, Dotson WD, Williams MS, Zauber AG, et al. Cost-effectiveness of risk-stratified colorectal cancer screening based on polygenic risk: current status and future potential. JNCI Cancer Spectr. 2020;4:86.

  54. Monahan KJ, Bradshaw N, Dolwani S, Desouza B, Dunlop MG, East JE, et al. Guidelines for the management of hereditary colorectal cancer from the British Society of Gastroenterology (BSG)/Association of Coloproctology of Great Britain and Ireland (ACPGBI)/United Kingdom Cancer Genetics Group (UKCGG). Gut. 2020;69:411–44.

  55. Seppälä TT, Dominguez-Valentin M, Sampson JR, Møller P. Prospective observational data informs understanding and future management of Lynch syndrome: insights from the Prospective Lynch Syndrome Database (PLSD). Fam Cancer. 2021;20:35–9.

  56. Tyrrell J, Zheng J, Beaumont R, Hinton K, Richardson TG, Wood AR, et al. Genetic predictors of participation in optional components of UK Biobank. Nat Commun. 2021;12:886.

  57. Saunders CL, Kilian B, Thompson DJ, McGeoch LJ, Griffin SJ, Antoniou AC, et al. External validation of risk prediction models incorporating common genetic variants for incident colorectal cancer using UK biobank. Cancer Prev Res. 2020;13:509–20.

  58. Martin AR, Kanai M, Kamatani Y, Okada Y, Neale BM, Daly MJ. Clinical use of current polygenic risk scores may exacerbate health disparities. Nat Genet. 2019;51:584–91.


UK Biobank analyses were conducted via application 52446 using a protocol approved by the Partners HealthCare Institutional Review Board. CM and EH are supported by the BONFOR-program of the Medical Faculty, University of Bonn (O-147.0002). This study was supported (not financially) by the European Reference Network on Genetic Tumour Risk Syndromes (ERN GENTURIS)—Project ID No 739547. ERN GENTURIS is partly co-funded by the European Union within the framework of the Third Health Programme “ERN-2016—Framework Partnership Agreement 2017–2021”. DRB and PM are supported by the FNR INTER INTER/DFG/21/16394868. This research was also supported by the Instituto de Salud Carlos III and co-funded by European Social Fund—ESF investing in your future—(grants CM19/00099 and PID2019-111254RB-I00) and from the European Union’s Horizon 2020 research and innovation program under the EJP RD COFUND-EJP Nº 825575.


Open Access funding enabled and organized by Projekt DEAL. No funding was obtained for this study.

Author information

EH performed the statistical analysis and the bioinformatics. EH, IS, DRB, CM and SA conceived and designed the study. EH, IS, CM, and SA drafted the initial manuscript. RA, HK, FD, ND, RH, CP, JB, GC, MMN, AJF, AM, PK, and PM performed the critical expert revision. PK, PM, CM, and SA supervised the study. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Stefan Aretz.

Ethics declarations

Ethics approval and consent to participate

Not applicable.

Consent for publication

Not applicable.

Competing Interests

No potential conflicts (financial, professional, or personal) relevant to the manuscript.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Additional file 1: Table S1.

A list of the considered variants and annotations.

Additional file 2.

Supplemental Figures 1–9 and Supplemental Tables 1–7.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. The Creative Commons Public Domain Dedication waiver applies to the data made available in this article, unless otherwise stated in a credit line to the data.

About this article

Cite this article

Hassanin, E., Spier, I., Bobbili, D.R. et al. Clinically relevant combined effect of polygenic background, rare pathogenic germline variants, and family history on colorectal cancer incidence. BMC Med Genomics 16, 42 (2023).
