Modifiable lifestyle factors and severe COVID-19 risk: a Mendelian randomisation study

Background: Lifestyle factors including obesity and smoking have been suggested to be correlated with an increased risk of COVID-19 severe illness or related death. However, whether these relationships are causal is not well known, nor is it known for the relationships between COVID-19 severe illness and other common lifestyle factors, such as physical activity and alcohol consumption.
Methods: Genome-wide significant genetic variants associated with body mass index (BMI), lifetime smoking, physical activity and alcohol consumption, identified by large-scale genome-wide association studies (GWAS) of up to 941,280 individuals, were selected as instrumental variables. Summary statistics of the genetic variants on severe illness of COVID-19 were obtained from GWAS analyses of up to 6492 cases and 1,012,809 controls. Two-sample Mendelian randomisation analyses were conducted.
Results: Per-standard deviation (SD) increases in genetically predicted BMI and lifetime smoking were each associated with about two-fold increased risks of severe respiratory COVID-19 and COVID-19 hospitalization (all P < 0.05). Per-SD increase in genetically predicted physical activity was associated with a decreased risk of severe respiratory COVID-19 (odds ratio [OR] = 0.19; 95% confidence interval [CI], 0.05, 0.74; P = 0.02), but not with COVID-19 hospitalization (OR = 0.44; 95% CI 0.18, 1.07; P = 0.07). No evidence of association was found for genetically predicted alcohol consumption. Similar results were found across robust Mendelian randomisation methods.
Conclusions: Evidence is found that BMI and smoking causally increase, and physical activity might causally decrease, the risk of COVID-19 severe illness. This study highlights the importance of maintaining a healthy lifestyle in protecting against COVID-19 severe illness and its public health value in fighting the COVID-19 pandemic.


Background
Obesity and smoking are well-known health-related lifestyle factors. Studies have reported correlations between obesity and severe illness or related death from COVID-19 [1][2][3]. The relationship between smoking and the risk of severe COVID-19 is controversial: some studies suggest that current smokers might have a lower risk [2][4][5][6]. However, the Centers for Disease Control and Prevention suggests that both people with obesity and people who smoke are at increased risk of COVID-19 severe illness [7]. These relationships are mainly suggested by observational studies, which are subject to bias, provide a low level of evidence and have limited ability to support causality. For other lifestyle factors, such as physical activity and alcohol consumption, little is known about whether they are associated with severe illness of COVID-19. One study found that accelerometer-measured physical activity was associated with a decreased likelihood of being a COVID-19 patient in the UK Biobank; no association was found for accelerometer-measured physical activity with the likelihood of death related to COVID-19, or for self-reported physical activity with the likelihood of being a COVID-19 patient or of death related to COVID-19 [8].
Mendelian randomisation (MR) uses exposure-associated genetic variants as instrumental variables to assess the causality between exposures and outcomes [9]. As genetic variants are randomly allocated at conception, MR resembles a randomised controlled trial and is less subject to confounding than observational studies. The publicly available genome-wide association studies (GWAS) summary statistics provide valuable resources for assessing the causality between lifestyle factors and the risk of COVID-19 severe illness.
MR studies have been conducted to investigate the causality of body mass index (BMI), smoking and physical activity with COVID-19 risk. Inconsistent results have been found for BMI [8,10,11], a causal effect with limited precision has been found for smoking [10], and no evidence of causality was found for physical activity [8]. However, the sample sizes of these studies could have limited their ability to detect causal evidence and to obtain precise causal effect estimates.
This study aimed to investigate the causality between four lifestyle factors, namely BMI, smoking, physical activity and alcohol consumption, and severe illness of COVID-19 using a large sample size and the two-sample MR approach [12].

COVID-19 data source
Summary-level data were obtained from two GWAS analyses conducted by the COVID-19 Host Genetic Initiative [13] (Release 4 in September 2020): (1) 2972 very severe respiratory confirmed COVID-19 cases, which were defined as hospitalized laboratory confirmed SARS-CoV-2 infection (RNA and/or serology based) with death or respiratory support, and hospitalization with COVID-19 as primary reason for admission, compared with 284,472 population controls; and (2) 6492 hospitalized confirmed COVID-19 cases, which were defined as hospitalized laboratory confirmed SARS-CoV-2 infection (RNA and/or serology based) and hospitalization due to corona-related symptoms, compared with 1,012,809 controls. The majority (≥ 90%) of the participants included in the GWAS analyses were of European ancestry. Details of the GWAS analyses can be found at https://www.covid19hg.org/.

Genetic instrumental variables to lifestyle factors
Genome-wide significant genetic variants identified from GWAS were selected as instrumental variables for the investigated lifestyle factors. This study only used independent sentinel genetic variants found from primary analysis of relevant GWAS; genetic variants identified from conditional analysis, i.e., secondary signals, were not used.
• BMI: 656 variants with P < 10⁻⁸ (Additional file 1: Supplementary Table 4) variants of BMI, physical activity and alcohol consumption, respectively, that were unavailable in the COVID-19 data sources. One alcohol consumption variant had no proxy available, so it was not included in the analysis.

Statistical analyses
The statistical power was calculated using the proportion of variation in the lifestyle factor explained by the genetic instrumental variables, the sample sizes of the COVID-19 GWAS, and the method proposed by Burgess [18]. From formula (12) of Burgess, the causal effect size β, i.e., the log-OR per standard deviation (SD) increase in the genetically predicted lifestyle factor, that can be detected with 80% statistical power at a significance level of 0.05 is

$$\beta = \frac{\Phi^{-1}(0.8) + \Phi^{-1}(1 - 0.05/2)}{\sqrt{\rho_{GX}^{2}\, N\, \mathbb{P}(Y = 1)\, \mathbb{P}(Y = 0)}}$$

where $\Phi^{-1}$ is the inverse of the cumulative distribution function of the standard normal distribution, $\rho_{GX}^{2}$ is the proportion of variation in the lifestyle risk factor explained by the genetic instrumental variables, N is the sample size of the COVID-19 GWAS, and ℙ(Y = 1) and ℙ(Y = 0) are the proportions of cases and controls, respectively, in the COVID-19 GWAS. The values of the parameters used in the calculation for each lifestyle factor can be found in Additional file 2: Supplementary Table 5.
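The detectable effect size described above can be sketched in a few lines of Python. This is a minimal illustration, assuming the detectable log-OR equals (Φ⁻¹(0.8) + Φ⁻¹(0.975)) divided by the square root of ρ²·N·ℙ(Y = 1)·ℙ(Y = 0); the function name `min_detectable_log_or` and the value `r2 = 0.05` are hypothetical and are not taken from Supplementary Table 5. Only the case and control counts come from the text.

```python
from math import exp, sqrt
from statistics import NormalDist


def min_detectable_log_or(r2, n_cases, n_controls, power=0.80, alpha=0.05):
    """Smallest log-OR per SD detectable at the given power (two-sided test)."""
    n = n_cases + n_controls
    p_case = n_cases / n
    p_control = n_controls / n
    z = NormalDist()  # standard normal; inv_cdf plays the role of the inverse CDF
    numerator = z.inv_cdf(power) + z.inv_cdf(1 - alpha / 2)
    return numerator / sqrt(r2 * n * p_case * p_control)


# r2 = 0.05 is a hypothetical variance explained, used for illustration only.
beta = min_detectable_log_or(r2=0.05, n_cases=6492, n_controls=1_012_809)
print(f"log-OR = {beta:.3f}, OR = {exp(beta):.2f}")
```

Because the detectable effect scales with the inverse square root of the variance explained, instruments that explain very little variation need a much larger true effect to be detected.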
The main analyses were performed using the inverse-variance weighted (IVW) method under a random-effects model [19], which assumes either that all genetic variants are valid instrumental variables or that any horizontal pleiotropy (i.e., genetic variants being associated with COVID-19 risk through pathways other than the investigated lifestyle factor) is balanced. The reported odds ratios (ORs) for COVID-19 risk are per SD increase in the genetically predicted value of BMI, the lifetime smoking measure, accelerometer-measured physical activity and alcohol consumption (log-transformed standard drinks per week).
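As an illustration of the IVW idea, the sketch below combines per-variant Wald ratios with inverse-variance weights and inflates the standard error when the variants disagree (a multiplicative random-effects model). It is an assumption that this mirrors the analysis software's behaviour closely; the function name `ivw_random_effects` and all numbers are made up for illustration.

```python
from math import exp, sqrt


def ivw_random_effects(beta_x, beta_y, se_y):
    """Return (log-OR estimate, standard error) from summary statistics."""
    weights = [bx ** 2 / s ** 2 for bx, s in zip(beta_x, se_y)]
    ratios = [by / bx for bx, by in zip(beta_x, beta_y)]  # per-variant Wald ratios
    total = sum(weights)
    estimate = sum(w * r for w, r in zip(weights, ratios)) / total
    # Cochran's Q measures heterogeneity between the per-variant estimates.
    q = sum(w * (r - estimate) ** 2 for w, r in zip(weights, ratios))
    # Multiplicative random effects: inflate the variance, never deflate it.
    phi = max(1.0, q / (len(ratios) - 1))
    return estimate, sqrt(phi / total)


# Made-up summary statistics for three variants (illustration only).
beta_x = [0.10, 0.20, 0.30]  # variant-exposure associations (SD units)
beta_y = [0.06, 0.09, 0.16]  # variant-outcome associations (log-OR)
se_y = [0.02, 0.02, 0.03]    # standard errors of beta_y

est, se = ivw_random_effects(beta_x, beta_y, se_y)
lo, hi = exp(est - 1.96 * se), exp(est + 1.96 * se)
print(f"OR per SD = {exp(est):.2f} (95% CI {lo:.2f}, {hi:.2f})")
```

The exponentiated estimate is the OR per SD increase in the genetically predicted exposure, which is the scale used for the reported results.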
Leave-one-out analyses, i.e., applying IVW after removing each genetic variant in turn, were performed to assess if the results were driven by any single variant. If there were any, relevant genetic variants were removed and ORs were estimated again.
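The leave-one-out procedure can be sketched as a loop that recomputes the IVW point estimate with each variant removed in turn. The helper names `ivw_estimate` and `leave_one_out`, the variant IDs and the numbers are all hypothetical; `snp_d` is deliberately constructed as an outlier.

```python
def ivw_estimate(beta_x, beta_y, se_y):
    """Inverse-variance weighted point estimate (log-OR per SD)."""
    num = sum(bx * by / s ** 2 for bx, by, s in zip(beta_x, beta_y, se_y))
    den = sum(bx ** 2 / s ** 2 for bx, s in zip(beta_x, se_y))
    return num / den


def leave_one_out(variants):
    """Map each variant ID to the IVW estimate obtained without it.

    `variants` maps an ID to a (beta_x, beta_y, se_y) tuple.
    """
    results = {}
    for left_out in variants:
        kept = [v for name, v in variants.items() if name != left_out]
        bx, by, se = zip(*kept)
        results[left_out] = ivw_estimate(bx, by, se)
    return results


# Hypothetical variants; "snp_d" is constructed as an outlier.
variants = {
    "snp_a": (0.10, 0.05, 0.02),
    "snp_b": (0.20, 0.10, 0.02),
    "snp_c": (0.15, 0.075, 0.02),
    "snp_d": (0.30, -0.30, 0.02),
}
full = ivw_estimate(*zip(*variants.values()))
for name, est in leave_one_out(variants).items():
    print(f"without {name}: {est:+.3f} (all variants: {full:+.3f})")
```

A variant whose removal shifts the estimate far more than any other is a candidate outlier, which is the pattern the analysis looks for.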
Sensitivity analyses were performed using MR-Egger regression [20], the weighted median method [21] and the weighted mode method [22], which relax the MR assumptions and allow some genetic instrumental variables to be invalid, but are less powerful than the IVW method. MR-Egger regression can provide consistent causal effect estimates even if all genetic instrumental variables are invalid, while the weighted median and weighted mode methods can provide consistent causal effect estimates when up to 50% of the genetic instrumental variables are invalid. The greater the consistency across the point estimates of the methods, the stronger the evidence supporting a causal effect of the investigated lifestyle factor on COVID-19 severe illness. Overall directional pleiotropy of the genetic instrumental variables, i.e., unbalanced horizontal pleiotropy, can be detected from the intercept of the MR-Egger regression, which is expected to differ from zero if directional pleiotropy exists. The analyses were conducted using the TwoSampleMR R package [23]. All statistical tests were two-sided. Results with a nominal P value < 0.05 were considered statistically significant.
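To make the role of the MR-Egger intercept concrete, here is a minimal pure-Python sketch of a weighted regression of variant-outcome on variant-exposure associations with an intercept term. It does not call the TwoSampleMR package; the function name `mr_egger`, the dispersion floor and the example numbers are assumptions for illustration only.

```python
from math import sqrt


def mr_egger(beta_x, beta_y, se_y):
    """Weighted regression of beta_y on beta_x with an intercept.

    Returns (slope, slope_se, intercept, intercept_se).
    """
    # Orient every variant so that its exposure association is positive.
    signs = [1.0 if bx >= 0 else -1.0 for bx in beta_x]
    x = [s * bx for s, bx in zip(signs, beta_x)]
    y = [s * by for s, by in zip(signs, beta_y)]
    w = [1.0 / se ** 2 for se in se_y]

    sw = sum(w)
    x_bar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    y_bar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - x_bar) * (yi - y_bar) for wi, xi, yi in zip(w, x, y))

    slope = sxy / sxx                  # causal effect estimate (log-OR per SD)
    intercept = y_bar - slope * x_bar  # average directional pleiotropic effect

    # Residual dispersion, floored at 1 so the SEs are never deflated.
    rss = sum(wi * (yi - intercept - slope * xi) ** 2
              for wi, xi, yi in zip(w, x, y))
    phi = max(1.0, rss / (len(x) - 2))
    slope_se = sqrt(phi / sxx)
    intercept_se = sqrt(phi * (1.0 / sw + x_bar ** 2 / sxx))
    return slope, slope_se, intercept, intercept_se


# Made-up summary statistics for five variants (illustration only).
beta_x = [0.10, -0.20, 0.30, 0.40, 0.25]
beta_y = [0.07, -0.10, 0.18, 0.19, 0.15]
se_y = [0.02, 0.02, 0.03, 0.03, 0.02]
slope, slope_se, icpt, icpt_se = mr_egger(beta_x, beta_y, se_y)
print(f"slope = {slope:.3f} (SE {slope_se:.3f}); "
      f"intercept = {icpt:.3f} (SE {icpt_se:.3f})")
```

An intercept that is small relative to its standard error gives no indication of directional pleiotropy, whereas the IVW method implicitly fixes this intercept at zero.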

Table 1 Odds ratios (OR) and 95% confidence intervals (CI) of the genetically predicted lifestyle factors with COVID-19 severe illness
OR and 95% CI are expressed per standard deviation increase in the genetically predicted levels of body mass index, lifetime smoking, accelerometer-measured physical activity and alcohol consumption (log-transformed standard drinks per week).
Per-SD increase in genetically predicted physical activity was associated with a decreased risk of severe respiratory COVID-19 (per-SD OR = 0.19; 95% CI 0.05, 0.74; P = 0.02), but not with COVID-19 hospitalization (per-SD OR = 0.44; 95% CI 0.18, 1.07; P = 0.07), though most of the latter 95% CI lay below one. No evidence of association was found for genetically predicted alcohol consumption with severe respiratory COVID-19 or COVID-19 hospitalization (both P > 0.1). There was evidence of heterogeneity between the genetic variants of BMI and between those of alcohol consumption, but not between those of lifetime smoking or physical activity (Table 2).
From the leave-one-out analyses, similar OR estimates were found for BMI (Additional files 4, 5: Supplementary Figs. 1 and 2), lifetime smoking (Additional files 6, 7: Supplementary Figs. 3 and 4) and physical activity (Additional files 8, 9: Supplementary Figs. 5 and 6), suggesting that the observed associations for these three genetically predicted lifestyle factors were unlikely to be driven by any single outlying genetic variant. Genetic variants rs1229984 and rs2532276 appeared to substantially influence the OR estimates in the analyses for alcohol consumption with both severe respiratory COVID-19 (Additional file 10: Supplementary Fig. 7) and COVID-19 hospitalization (Additional file 11: Supplementary Fig. 8). After removing the two variants, genetically predicted alcohol consumption had a per-SD OR = 0.64 (95% CI 0.25, 1.64; P = 0.35) with severe respiratory COVID-19, and a per-SD OR = 1.11 (95% CI 0.57, 2.17; P = 0.76) with COVID-19 hospitalization (Table 1). Although weak evidence of heterogeneity between the remaining genetic variants was still found (Table 2), the leave-one-out analyses did not suggest that any significant outliers were left (Additional files 12, 13: Supplementary Figs. 9 and 10). The two variants were removed from subsequent analyses.
Although there was evidence of heterogeneity between the genetic variants of BMI (Table 2), the tests of MR-Egger regression intercepts suggested no evidence of overall directional pleiotropy of the genetic variants on COVID-19 hospitalization (P = 0.21), and weak evidence in the analysis for severe respiratory COVID-19 (Intercept = 0.009; 95% CI 0.001, 0.018; P = 0.02; Table 3). These were also supported by the MR funnel plots, which were generally symmetrical for COVID-19 hospitalization and slightly asymmetrical for severe respiratory COVID-19 Figs. 17 and 18). From the sensitivity analyses, causal effect estimates with consistent directions were found across the MR-Egger regression, weighted median method and weighted mode method for all the investigated lifestyle factors, though some estimates had wider 95% CIs (Fig. 1).

Discussion
Using a two-sample MR approach, we found evidence that BMI has a causal effect on increased risk of COVID-19 severe illness, consistent with the findings of Ponsford et al. [10] and Leong et al. [11], but different from those of Zhang et al. [8]. Our causal effect estimates were more precise, in terms of the width of the confidence interval on the log-OR scale, than those of Ponsford et al.; e.g., the per-SD OR for COVID-19 hospitalization estimated by Ponsford et al. was 1.47 (95% CI 1.18, 1.83), with a confidence interval width of 0.44, whereas we estimated it to be 1.75 (95% CI 1.52, 2.01), with a confidence interval width of 0.28. The difference is likely due to our use of summary statistics from COVID-19 GWAS analyses with larger sample sizes; Ponsford et al. used the summary statistics from a GWAS of severe COVID-19 with respiratory failure including 1610 cases and 2205 controls [24], and the Release 3 data from the COVID-19 Host Genetic Initiative including 3199 cases and 897,488 controls in the GWAS of COVID-19 hospitalization. Leong et al. [11] also used the Release 3 data from the COVID-19 Host Genetic Initiative but reported the OR per unit increase in BMI, e.g., 1.12 (95% CI 1.03, 1.13) for COVID-19 hospitalization; therefore, their results cannot be directly compared with ours. The MR study by Zhang et al. [8] reported null associations between genetically predicted BMI and COVID-19 outcomes including being a COVID-19 patient and death related to COVID-19, which differ from the outcomes investigated in our study. Zhang et al. used the UK Biobank data, the sample size of which might not provide sufficient statistical power to detect an association.
Fig. 1 Odds ratios (OR) and 95% confidence intervals (CI) of the genetically predicted lifestyle factors with COVID-19 severe illness across Mendelian randomisation methods. OR and 95% CI were expressed as per standard deviation increase in genetically predicted levels in body mass index (BMI), lifetime smoking measure, accelerometer-measured physical activity and alcohol consumption (log-transformed standard drinks per week). The plots were right-truncated to better present the confidence intervals.
The finding for BMI is consistent with the correlation between obesity and severe illness observed in COVID-19 patients [1][2][3]. It is plausible that obesity contributes to COVID-19 severe illness. Characterised by increased macrophage infiltration that is associated with abnormal production of pro-inflammatory cytokines and abnormal insulin secretion, obesity contributes to systemic immune dysregulation [25][26][27], which could contribute to an abnormal immune response to the coronavirus SARS-CoV-2 that results in severe illness [28].
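The confidence-interval widths quoted in the BMI comparison with Ponsford et al. can be checked directly: on the log-OR scale the width is the difference between the logarithms of the upper and lower limits. The short sketch below uses only the ORs and CIs stated in the text; the function name `ci_width_log_scale` is hypothetical, and the implied standard error assumes a normal-based 95% CI.

```python
from math import log


def ci_width_log_scale(lower, upper):
    """Width of an odds-ratio confidence interval on the log-OR scale."""
    return log(upper) - log(lower)


# 95% CIs for COVID-19 hospitalization as quoted in the text.
ponsford = ci_width_log_scale(1.18, 1.83)
this_study = ci_width_log_scale(1.52, 2.01)
print(f"Ponsford et al.: {ponsford:.2f}")    # 0.44
print(f"This study:      {this_study:.2f}")  # 0.28

# Implied standard error of the log-OR, assuming a normal-based 95% CI.
print(f"Implied SE (this study): {this_study / (2 * 1.96):.3f}")
```

Comparing widths on the log scale is appropriate because the intervals are symmetric around the log-OR, not around the OR itself.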
We found evidence that lifetime smoking has a causal effect on increased risk of COVID-19 severe illness, same as the findings by Ponsford 3.03) with a confidence interval width of 0.69. In COVID-19 patients, smoking has not been found to be associated with the risk of death [2], and a lower than expected prevalence of smoking has even been observed [4][5][6]. However, these studies considered smoking only as a binary (current, non-current smokers) or categorical (current, former, never smokers) variable, without any consideration of smoking heaviness or duration. In addition, there might be misclassification in the smoking behaviours collected for COVID-19 patients, especially in emergency contexts. The lifetime smoking measure used in our study more accurately reflects the cumulative exposure to smoking, and it has been validated through MR analyses to be associated with lung cancer and coronary heart disease [15].
For the first time, we provided evidence that physical activity causally decreases the risk of severe respiratory COVID-19. Physical activity in this study was measured by accelerometer [16]; therefore, it is more accurate than self-reported physical activity. The MR study by Zhang et al. [8] using UK Biobank data reported a null association for both accelerometer-measured and self-reported physical activity; however, as with their analyses for BMI, they studied different COVID-19 outcomes and the sample size might not have been sufficient. Nevertheless, only five genetic instrumental variables were used in our study, and they explained only ~ 0.2% of the variation in accelerometer-measured physical activity; the causal effect estimates were therefore not very precise. Furthermore, although the MR-Egger regression intercept was not significantly different from zero, it had a considerable magnitude, suggesting that there might be horizontal pleiotropy of the physical activity genetic instrumental variables. The association could be driven by BMI, as there was a negative genetic correlation between physical activity and BMI [16].
For alcohol consumption, we found evidence that two outlier genetic variants influenced the association estimates obtained using all genetic variants; null associations were found both before and after removing the two variants. The null associations suggest that alcohol consumption might not change the risk of COVID-19 severe illness. However, the null associations might also be due to limited statistical power; our study did not have sufficient power to detect effect sizes as small as the observed ORs of 0.64 and 1.11 with severe respiratory COVID-19 and COVID-19 hospitalization, respectively.
BMI, smoking and physical activity are modifiable, so they could be targeted to reduce severe illness of COVID-19. This study highlights the importance of maintaining a healthy lifestyle in protecting against COVID-19 severe illness. The findings also have public health value: a healthy lifestyle could be helpful in fighting the COVID-19 pandemic.
One limitation of this study is that there might be bias in the causal effect estimates, as there was some sample overlap between the lifestyle factor GWAS and the COVID-19 GWAS, e.g., UK Biobank participants were included in the GWAS of COVID-19 hospitalization. However, the bias might not be substantial, because UK Biobank contributed only 413 cases (6.4% of the total cases) to the GWAS of COVID-19 hospitalization, in which 99.9% of UK Biobank participants were controls (no UK Biobank participants were included in the GWAS of severe respiratory COVID-19; see Additional file 3: Supplementary Table 6). This implies that the vast majority of the overlapping samples were likely to be controls in the COVID-19 GWAS; an unbiased causal effect estimate is expected if the associations between the risk factor and the genetic instrumental variables are obtained from controls [29]. Another limitation is that the findings might not be applicable to populations of non-European ancestry, as the exposure and outcome summary statistics for the genetic instrumental variables were from GWAS of participants almost entirely of European ancestry. Our main analysis included eight statistical tests, but we did not perform multiple testing adjustment; however, the five associations for BMI, lifetime smoking and physical activity with nominal P < 0.05 remained significant even when assessed using the false discovery rate (FDR; all FDR < 0.04).
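For readers unfamiliar with FDR adjustment, the following sketch applies the Benjamini-Hochberg procedure to a set of eight P values. It is an assumption that this is the FDR procedure the study used, since the text does not name one. The P values below are placeholders chosen for illustration, not the study's results, and the function name `benjamini_hochberg` is hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Eight placeholder P values (NOT the study's results), one per test.
p_values = [0.001, 0.002, 0.004, 0.008, 0.02, 0.07, 0.3, 0.8]
for p, q in zip(p_values, benjamini_hochberg(p_values)):
    print(f"P = {p:<6} FDR = {q:.4f}")
```

Each adjusted value is the smallest FDR threshold at which the corresponding test would still be declared significant.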

Conclusions
This two-sample MR study finds evidence that BMI and smoking causally increase and physical activity might causally decrease the risk of COVID-19 severe illness,