Estimating causal effects of atherogenic lipid-related traits on COVID-19 susceptibility and severity using a two-sample Mendelian randomization approach

Background As the number of COVID-19 deaths continues to rise worldwide, the identification of risk factors for the disease is an urgent issue, and it remains controversial whether atherogenic lipid-related traits including serum apolipoprotein B, low-density lipoprotein (LDL)-cholesterol, and triglyceride levels, are risk factors. The aim of this study was to estimate causal effects of lipid-related traits on COVID-19 risk in the European population using a two-sample Mendelian randomization (MR) approach. Methods We used summary statistics from a genome-wide association study (GWAS) that included 441,016 participants from the UK Biobank as the exposure dataset of lipid-related traits and from COVID-19 Host Genetics Initiative GWAS meta-analyses of European ancestry as the outcome dataset for COVID-19 susceptibility (32,494 cases and 1,316,207 controls), hospitalization (8316 cases and 1,549,095 controls), and severity (4792 cases and 1,054,664 controls). We performed two-sample MR analyses using the inverse variance weighted (IVW) method. As sensitivity analyses, the MR-Egger regression, weighted median, and weighted mode methods were conducted as were leave-one-out sensitivity analysis, the MR-PRESSO global test, PhenoScanner searches, and IVW multivariable MR analyses. A P value below 0.0055 with Bonferroni correction was considered statistically significant. Results This MR study suggested that serum apolipoprotein B or LDL-cholesterol levels were not significantly associated with COVID-19 risk. On the other hand, we inferred that higher serum triglyceride levels were suggestively associated with higher risks of COVID-19 susceptibility (odds ratio [OR] per standard deviation increase in lifelong triglyceride levels, 1.065; 95% confidence interval [CI], 1.001–1.13; P = 0.045) and hospitalization (OR, 1.174; 95% CI, 1.04–1.33; P = 0.012), and were significantly associated with COVID-19 severity (OR, 1.274; 95% CI, 1.08–1.50; P = 0.004). 
Sensitivity and bidirectional MR analyses suggested that horizontal pleiotropy and reverse causation were unlikely. Conclusions Our MR study indicates a causal effect of higher serum triglyceride levels on a greater risk of COVID-19 severity in the European population using the latest and largest GWAS datasets to date. However, as the underlying mechanisms remain unclear and our study might be still biased due to possible horizontal pleiotropy, further studies are warranted to validate our findings and investigate underlying mechanisms. Supplementary Information The online version contains supplementary material available at 10.1186/s12920-021-01127-2.


Background
The World Health Organization has reported that the number of deaths from coronavirus disease 2019 (COVID-19) caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) continues to rise worldwide (over 4.6 million as of September 2021) [1]. Therefore, in addition to the establishment of effective therapies, the identification of risk factors for COVID-19 is an urgent issue. Observational studies have reported that the severity of COVID-19 depends on risk factors such as obesity, coronary artery disease (CAD), and diabetes [2,3]. Although dyslipidemia is associated with these risk factors [4-6], observational studies have reported conflicting results regarding an association between dyslipidemia and COVID-19 risk [7-11]. Moreover, observational studies tend to suffer from bias due to possible confounders and reverse causation [12].
The Mendelian randomization (MR) method mimics a study with a randomized controlled design using single-nucleotide variants (SNVs) (also called single-nucleotide polymorphisms [SNPs]) as instrumental variables (IVs) and can estimate causal effects of risk factors on diseases of interest. According to Mendel's law, genetic variants are randomly assigned at meiosis. Therefore, MR studies are less likely to suffer from possible confounders or reverse causation, which as stated above are limitations of observational studies [12]. A recent genome-wide association study (GWAS) using UK Biobank (UKBB) data and MR analysis [13] reported that among atherogenic lipid-related traits (apolipoprotein B [Apo-B], low-density lipoprotein cholesterol [LDL-C], and triglyceride [TG]), Apo-B accounted for the causal effect on CAD risk, independently of LDL-C and TGs. Nevertheless, the estimated effects of atherogenic lipid-related traits on COVID-19 risk are inconsistent even among MR studies [14-17]. The aim of the present study was to estimate causal effects of serum Apo-B, LDL-C, and TG levels on risk of COVID-19 susceptibility, hospitalization, and severity in the European population using a two-sample MR approach.

Study design
We performed two-sample univariable MR analyses using summary-level GWAS datasets to estimate causal effects of circulating atherogenic lipid-related traits on COVID-19 using genetically predicted serum Apo-B, LDL-C, and TG levels as exposures and risk of COVID-19 susceptibility, hospitalization, and severity as outcomes. To examine reverse causation, we also performed bidirectional two-sample univariable MR analyses using genetically predicted risk of COVID-19 susceptibility, hospitalization, and severity as exposures and serum Apo-B, LDL-C, and TG levels as outcomes.
All analyses were conducted using the TwoSampleMR package (version 0.5.6) in R software (version 4.0.3) [18].
A P value below 0.0055 (0.05/3/3 by Bonferroni correction for three exposures and three outcomes) was considered statistically significant and a P value between 0.0055 and 0.05 was considered suggestively significant in the MR analyses.
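The threshold arithmetic can be sketched as follows. This is a minimal illustration, assuming a family-wise alpha of 0.05 split over three exposures and three outcomes; the function names are hypothetical and not part of the TwoSampleMR package. Dividing 0.05 by 9 gives about 0.0056, which the text reports as 0.0055.

```python
# Illustrative only: names below are hypothetical, not from any MR package.
ALPHA = 0.05
N_EXPOSURES = 3  # Apo-B, LDL-C, TG
N_OUTCOMES = 3   # susceptibility, hospitalization, severity


def bonferroni_threshold(alpha: float, n_exposures: int, n_outcomes: int) -> float:
    """Divide the family-wise alpha by the number of exposure-outcome pairs."""
    return alpha / (n_exposures * n_outcomes)


def classify_p(p: float, threshold: float, alpha: float = ALPHA) -> str:
    """Label a P value using the two cut-offs described in the text."""
    if p < threshold:
        return "significant"
    if p < alpha:
        return "suggestive"
    return "not significant"


THRESHOLD = bonferroni_threshold(ALPHA, N_EXPOSURES, N_OUTCOMES)

# P values reported for TG in the abstract.
for outcome, p in [("susceptibility", 0.045), ("hospitalization", 0.012), ("severity", 0.004)]:
    print(outcome, classify_p(p, THRESHOLD))
```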

Data sources
For the exposure dataset of genetically predicted serum Apo-B, LDL-C, and TG levels, summary statistics were available from a GWAS [13] that included up to 441,016 participants from UKBB. To our knowledge, this is to date the largest GWAS in sample size of atherogenic lipid-related traits. The data were standardized and normalized such that the mean was 0 and the standard deviation (SD) was 1. However, the mean and SD of TG were not provided in the GWAS; instead, the median TG was given as 1.50 (IQR = 1.11) mmol/L. The mean (SD) TG of 470,434 participants in UKBB was 1.75 (1.02) mmol/L [19]. The GWAS datasets are publicly available from the IEU open GWAS database [20] as GWAS-IDs of "ieu-b-108" for serum Apo-B level, "ieu-b-110" for serum LDL-C level, and "ieu-b-111" for serum TG level.
For the outcome dataset of genetically predicted risk of COVID-19 susceptibility, hospitalization, and severity, summary statistics were available from the GWAS meta-analyses by COVID-19 Host Genetics Initiative (COVID-19-HGI) [21] (Round 5) of European ancestry, which excluded UKBB data. Therefore, in our two-sample MR study, possible study overlap between the exposure and outcome datasets was unlikely. The outcome dataset included 1,348,701 participants (32,494 laboratory-confirmed cases of SARS-CoV-2 infection and 1,316,207 population controls) for COVID-19 susceptibility, 1,557,411 participants (8316 hospitalized COVID-19 patients and 1,549,095 population controls) for COVID-19 hospitalization, and 1,059,456 participants (4792 very severe respiratory confirmed COVID-19 cases and 1,054,664 controls) for COVID-19 severity. COVID-19-HGI defined very severe respiratory confirmed COVID-19 cases as patients hospitalized for laboratory-confirmed SARS-CoV-2 infection who died or were given respiratory support. The dataset (Round 5) was the latest version with the largest sample size involving European ancestry released on January 18, 2021 [22].
For the dataset of genetically predicted body mass index (BMI) trait, summary statistics are publicly available from a GWAS of European ancestry (a meta-analysis of GIANT [Genetic Investigation of ANthropometric Traits] consortium studies and UKBB with a total of 681,275 participants) [23], and from the IEU open GWAS database [20] as GWAS-ID of "ieu-b-40".

Selection of instrumental variables
MR analyses use SNVs as IVs that must satisfy the following three assumptions [24]: the IVs are associated with the exposure; the IVs affect the outcome only via exposure; and the IVs are not associated with confounders.
To estimate causal effects, we selected SNVs from the exposure GWAS dataset as IVs by clumping together all SNVs associated with the trait with $P < 5.0 \times 10^{-8}$ (a genome-wide significance level) and not in linkage disequilibrium with other SNVs ($r^2 < 0.001$, and distance > 10,000 kb) using the clump_data function (population = "EUR"). We extracted the summary statistics for each SNV from both the exposure and outcome GWAS datasets and then harmonized them. We did not include proxy SNVs in the analysis [25,26]. We excluded palindromic SNVs with an intermediate minor allele frequency (MAF) > 0.42 [24,27]. To evaluate the strength of the IVs, the F-statistic for each SNV was calculated using the following formula: $F = R^2 (N - 2) / (1 - R^2)$, where $R^2$ is the proportion of variance in phenotype explained by each SNV in exposure, and $N$ is the sample size. We calculated $R^2$ using the following formula: $R^2 = 2 \times \mathrm{MAF} \times (1 - \mathrm{MAF}) \times \mathrm{Beta}^2$, where Beta is the per allele effect size of the association between each SNV and phenotype [28]. IVs with an F-statistic below 10 (if any existed) were considered weak instruments [29].
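The instrument filters described above can be sketched in a few lines. This is a minimal illustration, not the authors' code: it assumes the commonly used single-SNV forms F = R²(N − 2)/(1 − R²) and R² = 2 × MAF × (1 − MAF) × Beta² for a standardized exposure, and all function names and the example SNV values are hypothetical.

```python
def is_palindromic(effect_allele: str, other_allele: str) -> bool:
    """A/T and C/G SNVs read the same on both strands."""
    pair = {effect_allele.upper(), other_allele.upper()}
    return pair == {"A", "T"} or pair == {"C", "G"}


def is_ambiguous_palindrome(effect_allele: str, other_allele: str,
                            eaf: float, maf_threshold: float = 0.42) -> bool:
    """Palindromic SNV whose MAF is too close to 0.5 to infer the strand."""
    maf = min(eaf, 1.0 - eaf)
    return is_palindromic(effect_allele, other_allele) and maf > maf_threshold


def variance_explained(beta: float, maf: float) -> float:
    """Assumed form: R^2 = 2 * MAF * (1 - MAF) * Beta^2 (standardized trait)."""
    return 2.0 * maf * (1.0 - maf) * beta ** 2


def f_statistic(r2: float, n: int) -> float:
    """Assumed form for a single instrument: F = R^2 (N - 2) / (1 - R^2)."""
    return r2 * (n - 2) / (1.0 - r2)


# Hypothetical SNV: Beta = 0.02 SD per allele, MAF = 0.30, N = 441,016.
r2 = variance_explained(beta=0.02, maf=0.30)
f_value = f_statistic(r2, n=441_016)
print(f"R2 = {r2:.6f}, F = {f_value:.1f}, weak = {f_value < 10}")
```

Under these assumptions, even a small per-allele effect yields an F-statistic well above 10 at the UKBB sample size, which is consistent with the absence of weak instruments in a GWAS of this scale.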

Two-sample Mendelian randomization and sensitivity analyses
The Wald ratio, which estimates the causal effect of the exposure on the outcome for each IV, was calculated as the Beta for the corresponding SNV in the outcome dataset divided by the Beta for the same SNV in the exposure dataset [24]. We meta-analyzed the Wald ratios using the inverse variance weighted (IVW) method as the main analysis to estimate the overall causal effect of genetically predicted values of the exposure on the outcome. For the IVW method, we used a multiplicative random-effects model when Cochran's Q statistic (as described below) was significant (P < 0.05) [30]; otherwise, a fixed-effects model was used. Based on IVW results, we inferred the causal effect of the lifelong change in exposure on the outcome [31].
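The Wald ratio and IVW steps can be sketched as below. This is a simplified illustration, not the TwoSampleMR implementation: it assumes first-order weights (the outcome standard error divided by the exposure Beta), a hand-written chi-square tail function to avoid external dependencies, and hypothetical summary statistics for four SNVs.

```python
import math


def chi2_sf(x: float, df: int) -> float:
    """Upper-tail chi-square probability via the incomplete-gamma series."""
    if x <= 0:
        return 1.0
    a, z = df / 2.0, x / 2.0
    term = total = 1.0 / a
    n = a
    for _ in range(10_000):
        n += 1.0
        term *= z / n
        total += term
        if abs(term) < abs(total) * 1e-15:
            break
    lower = total * math.exp(-z + a * math.log(z) - math.lgamma(a))
    return max(0.0, 1.0 - lower)


def ivw(beta_exp, beta_out, se_out, q_alpha=0.05):
    """IVW estimate; random effects are used only if Cochran's Q is significant."""
    ratios = [by / bx for bx, by in zip(beta_exp, beta_out)]      # Wald ratios
    weights = [(bx / sy) ** 2 for bx, sy in zip(beta_exp, se_out)]  # 1 / SE(ratio)^2
    estimate = sum(w * r for w, r in zip(weights, ratios)) / sum(weights)
    se_fixed = math.sqrt(1.0 / sum(weights))
    q = sum(w * (r - estimate) ** 2 for w, r in zip(weights, ratios))
    df = len(ratios) - 1
    q_p = chi2_sf(q, df)
    if q_p < q_alpha:
        model, se = "multiplicative random effects", se_fixed * math.sqrt(max(1.0, q / df))
    else:
        model, se = "fixed effects", se_fixed
    return {"estimate": estimate, "se": se, "q": q, "q_p": q_p, "model": model,
            "odds_ratio": math.exp(estimate)}


# Hypothetical per-SNV summary statistics (exposure Beta, outcome log-odds Beta, outcome SE).
BETA_EXP = [0.10, 0.08, 0.12, 0.05]
BETA_OUT = [0.020, 0.018, 0.030, 0.008]
SE_OUT = [0.010, 0.012, 0.011, 0.009]

result = ivw(BETA_EXP, BETA_OUT, SE_OUT)
print(result["model"], round(result["odds_ratio"], 3))
```

Because the outcome Beta is on the log-odds scale, exponentiating the IVW estimate gives the OR per 1-SD increase in the exposure.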
In addition, we conducted sensitivity analyses using the MR-Egger regression method, the weighted median method, the weighted mode method, and leave-one-out sensitivity analysis. The MR-Egger regression method can detect horizontal pleiotropy. The MR-Egger intercept is nonzero with statistical significance (P < 0.05) if possible horizontal pleiotropy of IVs exists [32]. The weighted median method can generate a valid causal estimate if at least 50% of the instrument SNVs satisfy the IV assumptions [32]. The weighted mode method forms clusters of individual SNVs and estimates the causal effect from the largest cluster [32]. Leave-one-out sensitivity analysis removed each SNV in turn from the IVW method and re-estimated the causal effect to assess the reliability of the analysis [33]. We also measured heterogeneity among causal estimates across all SNVs in the IVW method by calculating Cochran's Q statistic and the corresponding P value. Low heterogeneity (P > 0.05) indicates more reliable causal estimates [34]. For the MR analysis estimating the causal effect of TGs on COVID-19 severity, we added the following two analyses. First, we searched for SNVs associated at P < $5.0 \times 10^{-8}$ with pleiotropic effects on BMI and any type of leukocyte or inflammatory marker using the web tool PhenoScanner (version 2) [35,36] and then excluded them from the IVW method. Second, we conducted the MR-PRESSO (Pleiotropy RESidual Sum and Outlier) global test with the run_mr_presso function to detect possible horizontal pleiotropy [37]. Moreover, we conducted IVW multivariable MR analyses with the mv_ivw function to estimate the direct effects of genetically predicted TG levels on risk of COVID-19 independent of the effects of other exposure traits using genetically predicted Apo-B, LDL-C, or BMI traits as a covariate [13,38-41].
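Three of the sensitivity analyses above can be sketched with point estimates only. This is an illustrative simplification, not the TwoSampleMR code: standard errors and P values (including the test of the MR-Egger intercept) are omitted, the weighted median uses the interpolation scheme commonly described for that estimator, and the function names and data are hypothetical.

```python
def orient(beta_exp, beta_out):
    """Flip alleles so that every SNV-exposure association is positive."""
    signs = [1.0 if bx >= 0 else -1.0 for bx in beta_exp]
    return ([s * bx for s, bx in zip(signs, beta_exp)],
            [s * by for s, by in zip(signs, beta_out)])


def ivw_estimate(beta_exp, beta_out, se_out):
    """IVW point estimate (weighted regression through the origin)."""
    num = sum(bx * by / sy ** 2 for bx, by, sy in zip(beta_exp, beta_out, se_out))
    den = sum(bx ** 2 / sy ** 2 for bx, sy in zip(beta_exp, se_out))
    return num / den


def mr_egger(beta_exp, beta_out, se_out):
    """Weighted regression with an intercept; returns (intercept, slope)."""
    x, y = orient(beta_exp, beta_out)
    w = [1.0 / sy ** 2 for sy in se_out]
    sw = sum(w)
    xbar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    ybar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - xbar) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - xbar) * (yi - ybar) for wi, xi, yi in zip(w, x, y))
    slope = sxy / sxx
    return ybar - slope * xbar, slope


def weighted_median(beta_exp, beta_out, se_out):
    """Interpolated weighted median of the Wald ratios (needs >= 2 SNVs)."""
    ratios = [by / bx for bx, by in zip(beta_exp, beta_out)]
    weights = [(bx / sy) ** 2 for bx, sy in zip(beta_exp, se_out)]
    order = sorted(range(len(ratios)), key=lambda j: ratios[j])
    total, cum, pos = sum(weights), 0.0, []
    for j in order:
        cum += weights[j]
        pos.append((cum - 0.5 * weights[j]) / total)
    below = max(k for k, s in enumerate(pos) if s < 0.5)
    lo, hi = ratios[order[below]], ratios[order[below + 1]]
    return lo + (hi - lo) * (0.5 - pos[below]) / (pos[below + 1] - pos[below])


def leave_one_out(beta_exp, beta_out, se_out):
    """IVW estimate recomputed with each SNV removed in turn."""
    idx = range(len(beta_exp))
    return [ivw_estimate([beta_exp[j] for j in idx if j != k],
                         [beta_out[j] for j in idx if j != k],
                         [se_out[j] for j in idx if j != k]) for k in idx]


# Hypothetical summary statistics for four SNVs.
BX = [0.10, 0.08, 0.12, 0.05]
BY = [0.020, 0.018, 0.030, 0.008]
SY = [0.010, 0.012, 0.011, 0.009]

print(mr_egger(BX, BY, SY), weighted_median(BX, BY, SY), leave_one_out(BX, BY, SY))
```

A nonzero MR-Egger intercept and leave-one-out estimates that shift markedly when a single SNV is dropped would both point towards pleiotropic or outlying instruments.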

Statistical power
We calculated statistical power in the MR analyses at a type-I error rate of 0.05 using the web tool mRnd [42,43], as shown in Additional file 1: Table S1. For example, we achieved 80% power to detect an odds ratio (OR) of 1.066 (or 0.934) for the causal effect of genetically predicted Apo-B levels on COVID-19 susceptibility.

Results
The characteristics of all SNVs included in our univariable MR analyses are shown in Additional file 1: Tables S2-S4. The F-statistic for every instrument was > 25, indicating no weak instrument bias. During harmonization, several SNVs were excluded because they were palindromic with intermediate MAFs ("palindromic ambiguous" was "TRUE" in Additional file 1: Tables S2-S4). The overall univariable MR results are shown in Table 1, Fig. 1, and Additional file 2: Figures S1-S3. None of the MR methods, including IVW, indicated any causal effect of genetically predicted Apo-B or LDL-C levels on risk of COVID-19, whereas some MR methods did indicate causal effects of genetically predicted TG levels (Table 1). Using the IVW method, we inferred that lifelong elevated TG levels had suggestive causal effects on a higher risk of COVID-19 susceptibility (OR per 1-SD increase in TG levels, 1.065; 95% CI, 1.001-1.13; P = 0.045) and hospitalization (OR, 1.174; 95% CI, 1.04-1.33; P = 0.012) and a significant causal effect on a higher risk of COVID-19 severity (OR, 1.274; 95% CI, 1.08-1.50; P = 0.004). Leave-one-out sensitivity analysis (Additional file 2: Figure S3a) revealed the reliability of the IVW analysis, and the MR-PRESSO global test (P = 0.44) suggested a lack of possible horizontal pleiotropy. Furthermore, a funnel plot (Additional file 2: Figure S3b) depicted general symmetry, suggesting little evidence of heterogeneity or horizontal pleiotropy [32]. The weighted median and weighted mode methods also showed OR scales and directions consistent with the IVW method; however, the effects were not significant, raising the possibility of horizontal pleiotropy and confounding. Therefore, we performed PhenoScanner searches to identify SNVs associated with possible pleiotropic effects on other risk factors for COVID-19 at P < $5.0 \times 10^{-8}$. Moreover, we carried out IVW multivariable MR analyses as sensitivity analyses to estimate the direct causal effect of genetically predicted TG levels on COVID-19 risk adjusted for each of the genetically predicted Apo-B, LDL-C, and BMI traits. The IVW multivariable MR results are shown in Table 2.
The suggested causal effect of TGs on COVID-19 susceptibility was eliminated upon adjustment for each trait. We obtained comparable results regarding the causal effect of TGs on COVID-19 hospitalization and severity after adjustment for each trait; however, the latter had only suggestive significance.
Finally, we examined reverse causation by performing bidirectional two-sample univariable MR analyses using genetically predicted risks of COVID-19 as exposures and atherogenic lipid-related traits as outcomes. The results are shown in Additional file 3: Table S5. Overall, numbers of instrumental SNVs were low, and heterogeneities were high. We found that none of the COVID-19 risks had any significant causal effects on atherogenic lipid-related traits.
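As a rough consistency check on the reported IVW estimates for TG, the standard error and P value can be approximately recovered from each OR and its 95% CI. The sketch below assumes a normal approximation on the log-odds scale and symmetric Wald-type intervals; because the published bounds are rounded, the recovered P values only approximate the reported ones. The function name is hypothetical.

```python
import math

Z_95 = 1.959964  # two-sided 95% normal quantile


def p_from_or_ci(odds_ratio: float, ci_low: float, ci_high: float):
    """Back-calculate (SE, z, two-sided P) on the log-odds scale from an OR and its 95% CI."""
    log_or = math.log(odds_ratio)
    se = (math.log(ci_high) - math.log(ci_low)) / (2.0 * Z_95)
    z = log_or / se
    p = math.erfc(abs(z) / math.sqrt(2.0))  # equals 2 * (1 - Phi(|z|))
    return se, z, p


# OR and 95% CI for TG as reported in the abstract.
REPORTED = {
    "susceptibility": (1.065, 1.001, 1.13),
    "hospitalization": (1.174, 1.04, 1.33),
    "severity": (1.274, 1.08, 1.50),
}

recovered = {name: p_from_or_ci(*vals) for name, vals in REPORTED.items()}
for name, (se, z, p) in recovered.items():
    print(f"{name}: SE = {se:.3f}, z = {z:.2f}, P = {p:.4f}")
```

Under this approximation, only the severity estimate falls below the Bonferroni-corrected threshold, in line with the classification of the three results in the text.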

Discussion
From univariable MR studies, we inferred suggestive causal effects of lifelong higher TG levels on higher risk of COVID-19 susceptibility (OR, 1.065; 95% CI, 1.001-1.13; P = 0.045) and hospitalization (OR, 1.174; 95% CI, 1.04-1.33; P = 0.012) and a significant causal effect of TGs on COVID-19 severity (OR, 1.274; 95% CI, 1.08-1.50; P = 0.004) but could not find any causal effect of Apo-B or LDL-C levels. The suggested effect of TGs on COVID-19 susceptibility was eliminated, and the significant effect on COVID-19 severity was attenuated to suggestive significance, after adjustment for each of Apo-B, LDL-C, and BMI.
Observational studies have reported contradictory results. A study among 9005 UKBB participants (1508 patients testing positive for SARS-CoV-2 and 7497 controls) reported that Apo-B, LDL-C, and TGs were not significantly associated with SARS-CoV-2 infection [7]. A systematic review and meta-analysis including 23 studies involving 10,122 COVID-19 patients showed that hospitalized patients with severe disease or non-survivor status had significantly lower serum LDL-C but not TG levels compared to patients with milder disease or survivor status; however, only a few studies of those with European ancestry were included [8]. A retrospective single-center study with 654 patients in Spain showed that LDL-C < 69 mg/dl at admission was independently associated with a greater risk of 30-day mortality from COVID-19 (hazard ratio, 1.94; 95% CI, 1.14-3.31; P = 0.014) [9]. However, a prospective single-center study with 48 COVID-19 patients in France did not detect a significant relationship between LDL-C and 28-day mortality [10]. A retrospective single-center study with 600 COVID-19 patients in the United States reported hypertriglyceridemia as being associated with mortality (OR, 2.3; 95% CI, 1.4-3.7; P = 0.001) independent of obesity, high CRP, and high leukocyte count [11].
Some observational studies of European-ancestry subjects were consistent with our MR results, though others were not. Such discrepancies may be due to the small sample size and retrospective and/or single-center designs of observational studies. Moreover, observational studies tend to suffer from bias due to possible confounders and reverse causation [12]. For example, serum lipid levels might be lower as a result of poor nutrition status due to COVID-19 severity [8]. Associations between serum lipid levels and COVID-19 risks might be secondary to immune-inflammatory responses that could worsen COVID-19 outcomes [8,9,11], though the association of LDL-C with COVID-19 mortality was independent of inflammatory markers in the above Spanish study [9]. MR studies can overcome such limitations of observational studies [12]. Regardless, estimates of MR studies tend to be larger than those of observational studies, as the former estimate lifelong rather than short-term effects [31].
The underlying mechanisms by which hypertriglyceridemia worsens COVID-19 outcomes clinically and pathologically remain unclear. Although an MR study showed that genetically predicted lower counts of basophils and myeloid white blood cells had causal effects on COVID-19 severity [47], an observational study found positive correlations of higher leukocyte counts and CRP with TGs in COVID-19 patients and suggested that hypertriglyceridemia might have a direct effect on COVID-19 severity due to an enhanced inflammatory response [11]. In fact, studies suggested that hypertriglyceridemia could promote inflammation through leukocyte activation [48], macrophage accumulation in several organs [49] and increased sensitivity to cytokine stimulation of aortic endothelial cells [50]. Moreover, an MR study indicated causal effects of cardiometabolic exposures, including BMI and TGs, on circulating proteins that might contribute to severe COVID-19 [51]. For example, univariable MR indicated that both BMI and TGs had causal effects on reducing immunoglobulin G (IgG), a class of antibodies that help protect against infection. Nonetheless, multivariable MR indicated that BMI indirectly lowers IgG due to its influence on raising serum TG levels [51]. Therefore, we infer that hypertriglyceridemia may worsen COVID-19 at least partly through the direct causal effect of TGs on inflammatory responses.
There are several major limitations to be noted in the present MR study. First, our MR analyses estimating a causal effect of TGs on COVID-19 severity using the weighted median and weighted mode methods yielded directionally consistent results but no statistical significance. Although other MR analyses showed the same pattern [15,37,52], we must pay attention to possible horizontal pleiotropy and confounding. Therefore, we attempted to exclude possible pleiotropic effects by performing PhenoScanner searches. Previous MR studies have indicated that BMI is a risk factor for COVID-19 severity [14,15,37,52], whereas CAD and diabetes are unlikely to be [14,53]. Therefore, we excluded SNVs associated with BMI as well as inflammatory responses from the IVW method. Moreover, we conducted IVW multivariable MR analyses to eliminate the effect of Apo-B, LDL-C, or BMI traits. Regardless, we obtained results comparable to those of the original IVW methods estimating causal effects of TGs on COVID-19 hospitalization and/or severity. Although the multivariable MR results had only suggestive significance, Bonferroni correction can be considered overly conservative, given the high correlation between lipid-related traits [13]. Therefore, our results suggest that hypertriglyceridemia increases risk of COVID-19 hospitalization and severity to some extent independently of the effects of BMI, inflammatory responses, and other atherogenic lipid-related traits. Second, as described in the Methods section and Additional file 1: Table S1, our MR analyses estimating the causal effects of Apo-B and LDL-C on risk of COVID-19 might not possess sufficient statistical power to detect a weak association, if one exists. Third, an observational study found a U-shaped association between LDL-C and COVID-19 severity [54]. However, as two-sample MR analysis based on summary-level data assumes a linear relationship between exposure and outcome, we could not test for a nonlinear relationship between LDL-C and COVID-19 severity [38].
Fourth, our MR analysis was based on populations of European ancestry, and the findings are unlikely to be generalizable to other populations. Fifth, the GWAS of lipid-related traits included blood samples collected in a non-fasting state, which might affect serum lipid or lipoprotein levels [13,55] and may therefore have affected our causal estimates of TGs on the risks of COVID-19. Nonetheless, adjustment for fasting time in the GWAS and MR study led to negligible changes in the effect estimates of Apo-B, LDL-C, and TGs on a higher risk of CAD [13].

Conclusions
Our two-sample MR approach indicated the causal effect of higher serum TG levels on a higher risk of COVID-19 severity in the European population using the most recent and largest GWAS datasets to date, suggesting that hypertriglyceridemia is a risk factor for COVID-19 severity. However, as the underlying mechanisms remain unclear and our MR study might be biased due to possible horizontal pleiotropy, further studies are warranted to validate our MR findings and investigate underlying mechanisms.