Association between STAT4 gene polymorphism and type 2 diabetes risk in Chinese Han population

Background Evidence from genetic epidemiology indicates that type 2 diabetes (T2D) has a strong genetic basis. Activated STAT4 has an inflammatory effect, and STAT4 is an important mediator of inflammation in diabetes. This study aimed to investigate the association between STAT4 single nucleotide polymorphisms (SNPs) and T2D susceptibility in the Chinese Han population. Methods We conducted a case–control study among 500 T2D patients and 501 healthy individuals. Five candidate STAT4 SNPs were successfully genotyped. The association between SNPs and T2D susceptibility under different genetic models was evaluated by logistic regression analysis. SNP-SNP interaction was analyzed by multifactor dimensionality reduction (MDR). Finally, we evaluated differences in clinical characteristics between genotypes by one-way analysis of variance. Results In the overall analysis, STAT4 rs3821236 was associated with increased T2D risk under the allele (OR 1.23, p = 0.020), homozygous (OR 1.51, p = 0.025), dominant (OR 1.36, p = 0.029), and additive models (OR 1.23, p = 0.020). In the stratified analysis, rs3821236, rs11893432, and rs11889341 were risk factors for T2D among participants ≤ 60 years old. Only rs11893432 was associated with increased T2D risk among female participants. There was also a potential association between rs3821236 and the risk of T2D with nephropathy. STAT4 rs11893432, rs7574865 and rs897200 were significantly associated with lysophosphatidic acid, cystatin C and thyroxine (T4), respectively. Conclusion Genetic polymorphisms of STAT4 are potentially associated with T2D susceptibility in the Chinese Han population. In particular, rs3821236 is significantly associated with T2D risk both in the overall analysis and in several subgroup analyses. Our study may provide new ideas for individualized T2D diagnosis and protection. Supplementary Information The online version contains supplementary material available at 10.1186/s12920-021-01000-2.

second largest country in the world after India in terms of number of diabetic patients. It is estimated that the total number of diabetes patients in China will be close to 100 million by 2025 [4]. According to previous reports, diabetes is generally believed to result from the interaction of genetic and environmental factors, leading to insufficient insulin secretion. Evidence from genetic epidemiology indicates that the onset of type 2 diabetes has a strong genetic basis and that its inheritance is polygenic [5]. In recent years, with the development of molecular biology and molecular epidemiology and the improvement and application of gene detection technology, some genetic polymorphism loci associated with type 2 diabetes have been identified [6]. To date, however, T2D risk assessments have been conducted only in some populations. Therefore, discovering genetic polymorphism loci associated with T2D risk among populations with different genetic backgrounds remains a difficult task.
STAT4 is expressed in immunoregulatory cells such as monocytes, dendritic cells, and macrophages at the site of inflammation. STAT4 mainly induces Th1 responses and inhibits Th2 responses [7,8]. Activated STAT4 is considered to have an inflammatory effect; it plays an important role in the regulation of Th1/Th2 differentiation and in the autoimmune diseases caused by dysregulation of this balance. STAT4 is an important mediator of inflammation in immune cells and fat cells in diabetes and obesity [9]. More importantly, several studies have found a Th1/Th2 cytokine imbalance in T2D patients [10][11][12]. We therefore speculate that the STAT4 gene may play a role in the occurrence and development of type 2 diabetes. STAT4 genetic polymorphisms associated with the development of various diseases have been reported [13][14][15][16][17][18][19]. However, we did not find any reports on the association between STAT4 genetic polymorphisms and T2D risk.
Therefore, this study took the Chinese Han population as the research object and selected 5 candidate STAT4 SNPs (rs3821236 A/G, rs11893432 G/C, rs11889341 T/C, rs7574865 T/G and rs897200 C/T). We then evaluated the association between these STAT4 SNPs and T2D susceptibility. Our study may provide supplementary data for T2D risk assessment in specific populations, and may also provide a valuable reference for individualized T2D prevention.

Study objects and sample collection
After obtaining informed consent from all participants, a total of 1001 Chinese Han people were enrolled in this study (500 T2D patients and 501 age- and gender-matched healthy individuals). Based on the genotyping results of all participants, we used GCTA software (GCTA 1.26.0) to perform principal component analysis (PCA) and to construct a kinship matrix in order to evaluate the genetic relationship between participants [20]. The specific operations were as follows: (1) Plink software (PLINK v1.90b6.12) was used to convert the genotyping data into the file format required for PCA in GCTA. When performing PCA, we set pca = 4. We then used R software (R 4.0.3) to draw a scatter plot from the file generated by GCTA. Finally, the genetic relationship between the participants was estimated from the scatter plot.
(2) Plink software was again used to convert the file format of the genotyping data, and GCTA was used to calculate the genetic relationship matrix (GRM). Finally, a heat map of the kinship matrix was drawn using R software, and the kinship between participants was estimated from the kinship coefficients.
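To make the genetic relationship matrix in step (2) concrete, the following is a minimal sketch and not the GCTA implementation. It assumes the standard GRM definition, in which genotypes coded as allele counts (0, 1, 2) are standardized by the allele frequency at each SNP. The function name `grm` and the toy genotype matrix are hypothetical and are not the study data.

```python
def grm(genotypes):
    """Genetic relationship matrix from per-individual genotype rows.

    genotypes[j][i] is the count (0, 1 or 2) of the reference allele
    carried by individual j at SNP i. Assumed definition:
        A_jk = (1/N) * sum_i (x_ij - 2*p_i) * (x_ik - 2*p_i) / (2*p_i*(1 - p_i))
    Monomorphic SNPs (p = 0 or 1) are skipped because the denominator is zero.
    """
    n_ind = len(genotypes)
    n_snp = len(genotypes[0])
    # Frequency of the counted allele at each SNP
    freqs = [sum(row[i] for row in genotypes) / (2 * n_ind) for i in range(n_snp)]
    used = [i for i, p in enumerate(freqs) if 0 < p < 1]
    if not used:
        raise ValueError("no polymorphic SNPs available")
    matrix = [[0.0] * n_ind for _ in range(n_ind)]
    for j in range(n_ind):
        for k in range(n_ind):
            total = 0.0
            for i in used:
                p = freqs[i]
                total += ((genotypes[j][i] - 2 * p) * (genotypes[k][i] - 2 * p)
                          / (2 * p * (1 - p)))
            matrix[j][k] = total / len(used)
    return matrix


# Hypothetical toy data: 4 individuals x 5 SNPs; rows 0 and 1 are duplicates.
toy = [
    [0, 1, 2, 1, 0],
    [0, 1, 2, 1, 0],
    [2, 1, 0, 0, 1],
    [1, 0, 1, 2, 2],
]
A = grm(toy)
```

In this toy example the duplicated individuals share an off-diagonal entry equal to their diagonal entries, whereas genetically dissimilar pairs receive lower values. With only a handful of SNPs, as in a candidate-gene panel, such estimates are necessarily noisy.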

Case group
The 500 diabetic patients were recruited from the First Affiliated Hospital of Xi'an Jiaotong University. Among them, 142 were female (28.4%) and 358 were male (71.6%). The T2D inclusion criteria were as follows: (1) outpatients or inpatients of the First Affiliated Hospital of Xi'an Jiaotong University; (2) patients with a previously established or new diagnosis of T2D (diagnostic criteria: fasting blood glucose ≥ 7.0 mmol/L, or OGTT 2 h blood glucose ≥ 11.1 mmol/L, or random blood glucose ≥ 11.1 mmol/L); (3) no history of major mental trauma and no history of genetic diseases (for example, a history of malignant tumors). All research subjects gave informed consent.

Control group
The 501 controls were healthy individuals selected at the same time and place as the case group. Among them, 143 were female (28.5%) and 358 were male (71.5%). The controls were selected according to the following requirements: (1) healthy individuals undergoing physical examination in the outpatient department of the same hospital during the same period; (2) fasting venous plasma glucose ≤ 6.1 mmol/L; (3) no complicated chronic diseases or surgical diseases, with tumor patients and people with a history of tumors excluded; (4) basic information (age and gender) not significantly different from that of the case group, so that differences in the distribution of exposure factors between cases and controls caused by these confounding factors were excluded.
This study was conducted in accordance with the standards approved by the First Affiliated Hospital of Xi'an Jiaotong University. All participants completed a questionnaire on demographic and anthropometric information, including gender, height, weight, smoking, drinking, systolic blood pressure (SBP), diastolic blood pressure (DBP), and family history of diabetes.

Sample collection
In the morning, about 2 ml of fasting venous blood was collected from each participant into vacuum blood collection tubes containing ethylenediaminetetraacetic acid (EDTA) and stored in a freezer at −20 °C until use.

DNA extraction
A whole genome DNA purification kit (GoldMag Co. Ltd., Xi'an, China) was used in this study; the specific experimental steps are shown in Additional file 1. The DNA was stored in a freezer at −80 °C until use.

Selection of SNPs
SNPs were selected according to the principle that the minor allele frequency of the locus is ≥ 5% in the study population. We also calculated the successful genotyping rate (call rate) of each candidate SNP and filtered out SNPs with a call rate < 90%. Eliminating low-quality loci helps improve the reliability of the analysis results and reduce the false positive rate. Based on the relevant literature and the STAT4 polymorphism data in the database, we finally selected 5 sites of the STAT4 gene for research (rs3821236 A/G, rs11893432 G/C, rs11889341 T/C, rs7574865 T/G and rs897200 C/T).
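The two selection thresholds described above (allele frequency ≥ 5% and call rate ≥ 90%) can be sketched as a simple per-SNP filter. This is an illustration only; the function name `snp_qc`, the 0/1/2 genotype coding with `None` for a failed call, and the example data are assumptions rather than the pipeline used in the study.

```python
def snp_qc(calls, min_maf=0.05, min_call_rate=0.90):
    """Quality-control summary for one SNP.

    calls: one entry per individual; 0, 1 or 2 copies of a given allele,
    or None when genotyping failed (hypothetical coding).
    """
    called = [g for g in calls if g is not None]
    call_rate = len(called) / len(calls)
    if not called:
        return {"call_rate": 0.0, "maf": None, "passed": False}
    freq = sum(called) / (2 * len(called))   # frequency of the counted allele
    maf = min(freq, 1 - freq)                # minor allele frequency
    passed = call_rate >= min_call_rate and maf >= min_maf
    return {"call_rate": call_rate, "maf": maf, "passed": passed}


# Hypothetical examples with 100 individuals each
common_snp = [0] * 60 + [1] * 30 + [2] * 5 + [None] * 5     # kept
rare_snp = [0] * 97 + [1] * 3                               # MAF too low
poorly_called = [0] * 50 + [1] * 30 + [2] * 5 + [None] * 15  # call rate too low

results = {name: snp_qc(calls) for name, calls in [
    ("common_snp", common_snp),
    ("rare_snp", rare_snp),
    ("poorly_called", poorly_called),
]}
```

Computing the allele frequency only over successfully called genotypes keeps the two criteria independent, so a SNP can fail on call rate, on frequency, or on both.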

Genotyping
We used MassARRAY Assay Design software for primer design and the MassARRAY system (Agena, San Diego, CA, USA) to genotype all SNPs. The MassARRAY platform is based on a MALDI-TOF (Matrix-Assisted Laser Desorption/Ionization-Time of Flight) mass spectrometer, which offers high throughput and cost-effectiveness. The iPLEX chemistry was used to generate SNP genotypes. The specific experimental steps were as follows: (1) The region targeted by the multiplex assay was amplified by PCR (catalog number 10500). (2) The PCR product was treated with shrimp alkaline phosphatase (SAP) to neutralize unincorporated nucleotides (Cat. No. #08040). (3) An extension reaction was performed to extend the primer by one base at the SNP site (catalog number 10136). (4) MALDI-TOF was used to measure the mass of the extension products, yielding spectra with distinct mass peaks for each multiplexed reaction, from which the genotypes were called.

Quality control
In order to verify the repeatability of the experiment, 10% of the DNA samples were randomly selected for repeated testing, and the agreement rate of the experimental results was > 99%.

Statistical analyses
In this study, the SPSS 17.0 statistical package [21] was used to test whether the STAT4 SNPs conformed to Hardy-Weinberg equilibrium (HWE). After the HWE test, differences in the demographic characteristics of the participants (such as age, gender, smoking, drinking, and BMI) were tested by the chi-square test or t-test: the t-test was used for continuous variables such as age, to determine whether the mean differed between the case group and the control group, and the chi-square test was used for categorical variables such as gender, to determine whether the frequency distribution differed between the two groups. The p value indicates whether a result is statistically significant. A logistic regression model (adjusted for gender and age) was used to calculate the odds ratio (OR) and 95% confidence interval (CI) in order to evaluate the association between STAT4 polymorphisms and type 2 diabetes risk. The logistic regression was adjusted only for age and gender because these data were complete for all participants, whereas a large proportion of the data on BMI, drinking and smoking was missing; the adjustment was intended to reduce the influence of confounding factors on the results. An OR of 1 means that the factor has no effect on the occurrence of the disease; OR > 1 indicates a risk factor; OR < 1 indicates a protective factor.
Using wild-type alleles as reference, the SNPstats online tool was used to estimate multiple genetic models (codominant, dominant, recessive, and log-additive models). We used multifactor dimensionality reduction (MDR) to assess SNP-SNP interaction in diabetes risk. We used one-way analysis of variance to assess differences in clinical indicators between genotypes (SPSS 17.0 statistical package). All tests were two-sided, and p < 0.05 was considered statistically significant.
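As an illustration of how an odds ratio and its 95% confidence interval are obtained, the sketch below computes a crude (unadjusted) OR from a 2 × 2 table using the Woolf (log) method. It is not the age- and gender-adjusted logistic regression used in this study, and the function name `odds_ratio_ci` and the allele counts are hypothetical.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Crude odds ratio with a Woolf-type confidence interval.

    a: cases carrying the factor      b: cases without the factor
    c: controls carrying the factor   d: controls without the factor
    All four counts must be positive.
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR)
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se)
    upper = math.exp(math.log(odds_ratio) + z * se)
    return odds_ratio, lower, upper


# Hypothetical allele counts (NOT the study data):
# cases: 480 risk alleles, 520 other; controls: 430 risk alleles, 570 other
or_value, ci_low, ci_high = odds_ratio_ci(480, 520, 430, 570)
print(f"OR = {or_value:.2f}, 95% CI {ci_low:.2f}-{ci_high:.2f}")
```

A confidence interval that excludes 1 corresponds to a two-sided p value below 0.05 for the crude association; an adjusted analysis can give different estimates because it accounts for covariates.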
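The genetic models named above differ only in how a genotype is turned into a predictor for the regression. The sketch below shows one common encoding, assuming genotypes written as two-letter strings and the wild-type homozygote as reference; the function name `encode_genotype` is hypothetical, and this is not the SNPstats implementation.

```python
def encode_genotype(genotype, minor_allele, model):
    """Encode a genotype such as 'AG' for a given genetic model.

    The reference category is the homozygote without the minor allele.
    """
    n = genotype.count(minor_allele)  # copies of the minor allele: 0, 1 or 2
    if model == "dominant":           # carriers (1 or 2 copies) vs. non-carriers
        return int(n >= 1)
    if model == "recessive":          # 2 copies vs. 0 or 1 copy
        return int(n == 2)
    if model == "log-additive":       # each extra copy multiplies the odds by the same factor
        return n
    if model == "codominant":         # separate effects for heterozygote and homozygote
        return (int(n == 1), int(n == 2))
    raise ValueError(f"unknown model: {model}")


models = ["codominant", "dominant", "recessive", "log-additive"]
table = {g: {m: encode_genotype(g, "A", m) for m in models}
         for g in ["GG", "AG", "AA"]}
```

With `A` as the minor allele, `GG` is the reference in every model, `AG` and `AA` are pooled under the dominant model, and only `AA` is contrasted with the rest under the recessive model.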

Sample introduction and collection
A total of 1,001 unrelated Chinese Han people participated in this study, which used a case-control design. The case group included 500 diabetic patients with an average age of 59.87 ± 12.87 years, and the control group included 501 healthy individuals with an average age of 59.85 ± 9.34 years. There was no statistical difference in gender or age between the case group and the control group (Table 1). In addition, there was no statistical difference in smoking history or BMI between the two groups, although both p values were close to 0.05, and there was a highly significant difference in drinking history.
These results might be partly due to missing sample data. The results of the principal component analysis (Additional file 1: Fig. 1) and the kinship matrix heat map (Additional file 2: Fig. 2) indicate that the participants can be considered genetically unrelated. Fasting blood glucose and urea levels were higher in the case group than in the control group, while total cholesterol was lower in the case group than in the control group; these differences between the two groups were statistically significant (p < 0.001). The specific data are summarized in Table 1.

Association between STAT4 polymorphism and type 2 diabetes risk
In this study, a total of 5 SNPs (rs3821236, rs11893432, rs11889341, rs7574865 and rs897200) were successfully genotyped. The call rate of all loci was more than 90% (Table 2), which helps to improve the reliability of the results. Detailed information on the candidate SNPs is listed in Table 2. All candidate SNPs were in HWE (p > 0.05), and the minor allele frequency (MAF) of every candidate SNP was greater than 5% in the study population. HaploReg annotation indicates that the 5 SNPs overlap various regulatory features, such as promoter histone marks, enhancer histone marks, changed motifs, NHGRI/EBI GWAS hits, GRASP QTL hits, and selected eQTL hits. Logistic regression (adjusted for gender and age) was used to test the association between SNPs and diabetes risk under different genetic models.
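For readers unfamiliar with the HWE check, the following sketch shows a Pearson chi-square goodness-of-fit test with one degree of freedom. Whether the analysis in this study used this test or an exact test is not stated, so the choice of test is an assumption, as are the function name `hwe_chi_square` and the genotype counts.

```python
import math


def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square test of Hardy-Weinberg equilibrium for a biallelic SNP.

    n_aa, n_ab, n_bb: observed genotype counts. Both alleles must be
    present, otherwise an expected count is zero.
    Returns (chi2, p_value) with 1 degree of freedom.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele a
    q = 1 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of the chi-square distribution with 1 df
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical genotype counts (NOT the study data)
in_equilibrium = hwe_chi_square(25, 50, 25)   # matches expected proportions exactly
out_of_equilibrium = hwe_chi_square(50, 0, 50)  # no heterozygotes at all
```

A p value above 0.05, as reported for all five SNPs, means that the observed genotype counts are compatible with the proportions expected under HWE.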

Overall analysis
In the combined analysis of all data (Table 3), only the rs3821236 polymorphism among the 5 candidate SNPs was associated with T2D risk; the remaining four were not significantly associated with T2D risk (p > 0.05). Specifically, the genotype frequencies of rs3821236 (AA, AG and GG) were 22.6%, 50.6%, and 26.8% in the case group and 18.6%, 48.3%, and 33.1% in the control group, respectively. The allele (A vs. G, OR 1.23, CI 1.03-1.47, p = 0.020) and homozygous (AA vs. GG, OR 1.51, CI 1.05-2.15, p = 0.025) models were associated with increased risk of T2D. The rs3821236 polymorphism was also significantly associated with increased risk of diabetes under the dominant (GG vs. AA-AG, OR 1.36, CI 1.03-1.78, p = 0.029) and log-additive models (OR 1.23, CI 1.03-1.47, p = 0.020).

Age and gender (Table 4)
The study population was grouped according to age (60 years old as the dividing line) and gender (male and female) to analyze the association between genetic polymorphisms and T2D risk in different subgroups. The rs3821236, rs11893432 and rs11889341 polymorphisms were associated with increased risk of T2D among participants aged ≤ 60 years. Specifically, rs3821236 polymorphism was associated with an increased risk of T2D in allele (A vs.

BMI (Table 5)
The subjects were grouped according to body mass index (BMI) to analyze the association between candidate SNPs and T2D risk. The results showed that STAT4 rs11889341 (Dominant: OR 1.63, p = 0.035) and rs7574865 (Heterozygote: OR 1.75, p = 0.021; Dominant: OR 1.65, p = 0.030) significantly increased T2D risk in participants with BMI ≤ 24. In participants with BMI > 24, we did not find any significant association with T2D risk. Nevertheless, almost all estimates for participants with BMI > 24 showed a trend toward increased T2D risk.

Smoking and drinking (Table 6)
When the participants were grouped according to smoking status (Yes/No) for association analysis, we did not find any statistically significant results. Except for rs897200, the STAT4 rs3821236 (

T2D complications (Table 7)
Finally, we grouped the cases according to whether they had nephropathy or coronary heart disease (CHD) as a complication, in order to evaluate the association between candidate SNPs and the risk of T2D complications. The results (Table 7) showed that only rs3821236 was potentially associated with susceptibility to T2D complicated with nephropathy, under the heterozygous (p = 0.024) and dominant (p = 0.037) genetic models. None of the 5 candidate SNPs was associated with susceptibility to T2D complicated with CHD.

Differences in clinical indicators under different genotypes
Finally, we also analyzed the association between the five candidate SNPs and the clinical indicators of T2D patients. The results (Table 8) showed that the clinical indicators associated with the candidate SNPs in this study were cystatin C (CysC), lysophosphatidic acid (LPa), and thyroxine (T4). Specifically, STAT4 rs11893432 was associated with LPa (p = 0.021), rs7574865 was associated with CysC (p = 0.033), and rs897200 was associated with T4 (p = 0.010). All of these associations were statistically significant.
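The comparison of a clinical indicator across the three genotype groups rests on the one-way ANOVA F statistic, which can be sketched as follows. The function name `one_way_anova_f` and the measurements are hypothetical; the p value would additionally require the F distribution, which a statistics package such as the one used in this study provides.

```python
def one_way_anova_f(groups):
    """F statistic and degrees of freedom for a one-way ANOVA.

    groups: list of lists, one list of measurements per genotype group.
    Returns (f_statistic, df_between, df_within).
    """
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    # Variation of the group means around the grand mean
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of the observations around their own group mean
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between = k - 1
    df_within = n_total - k
    f_statistic = (ss_between / df_between) / (ss_within / df_within)
    return f_statistic, df_between, df_within


# Hypothetical measurements for three genotype groups (NOT the study data)
f_stat, df_b, df_w = one_way_anova_f([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
```

A large F statistic indicates that the group means differ by more than would be expected from the within-group variability alone.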

Discussion
Type 2 diabetes is the result of the interaction of genetic and environmental factors. In recent years, the association between genetic polymorphisms and diseases has been the focus of attention. Studies have found that STAT4 mainly induces Th1 responses and inhibits Th2 responses [7,8]. It plays an important role in the regulation of Th1/Th2 differentiation and in the autoimmune diseases caused by dysregulation of this balance. Multiple studies have shown that a Th1/Th2 cytokine imbalance exists in T2D patients [10][11][12]. However, the specific mechanism of STAT4 in T2D is still unclear. Therefore, we investigated the association between STAT4 genetic polymorphisms and T2D risk in the Chinese Han population. This study supplements the data on genetic loci associated with T2D susceptibility.
Our results showed that only rs3821236 was associated with type 2 diabetes risk among the five candidate SNPs of STAT4 (rs3821236 A/G, rs11893432 G/C, rs11889341 T/C, rs7574865 T/G and rs897200 C/T). STAT4 is an important transcriptional activator. After activation, it crosses the nuclear membrane into the nucleus in the form of a homodimer, and then initiates the transcription and expression of downstream target genes [22]. Numerous studies have found that the STAT4 rs3821236 genetic polymorphism is associated with the risk of multiple diseases, such as systemic lupus erythematosus (SLE) [23], systemic sclerosis [24] and juvenile idiopathic arthritis [25]. Recent studies by Zhao et al. [10] and Mahlangu et al. [11] both found that the regulation of Th1/Th2 differentiation plays a role in T2D, and STAT4 is known to be involved in the regulation of Th1/Th2 differentiation. Combined with the results of our study, we speculate that STAT4 rs3821236 may play a role in the differentiation and regulation of Th1/Th2, which may influence T2D susceptibility. However, this is only speculation, and further study with a larger sample size is needed to confirm it. Nevertheless, as far as we know, our study is the first to find evidence that STAT4 rs3821236 is potentially associated with the occurrence and development of T2D in the Chinese Han population. It may provide new ideas for the individualized treatment or diagnosis of T2D.
On the other hand, genetic and environmental factors are interrelated in T2D and jointly promote its development. A previous study has shown that age, obesity and an unhealthy lifestyle are risk factors for T2D [26]. Therefore, this study also conducted stratified analyses by these factors. Our results showed that among participants ≤ 60 years old, rs3821236, rs11893432 and rs11889341 of STAT4 were significantly associated with increased T2D risk; among participants with BMI ≤ 24, rs11889341 and rs7574865 were significantly associated with increased risk of T2D; among non-drinking participants, rs3821236, rs11893432, rs11889341 and rs7574865 showed some association with increased risk of T2D; and in the analysis by smoking status, there was no significant association between STAT4 gene polymorphism and T2D risk. These results seemed to be inconsistent with previous studies. Notably, although there was no significant association between STAT4 gene polymorphism and T2D susceptibility among participants with potential T2D risk factors, these participants showed a trend toward increased T2D risk. This result suggests that the association between STAT4 gene polymorphism and increased T2D risk may be driven largely by genetic factors, while environmental factors may have little effect.
In addition, we found some differences between the results of our study and those of previous studies: the STAT4 rs7574865 polymorphism has been reported as a risk factor for diabetes in Asians and Caucasians [27], whereas in the overall analysis of this study rs7574865 was only associated with a clinical indicator (cystatin C, p = 0.033), which is not sufficient to prove that rs7574865 is associated with T2D risk. We speculate that these differences may be due to different study populations, different sample sizes and different research environments.
Our study supplements the data on the association between STAT4 gene polymorphism and the risk of T2D in the Chinese Han population and indicates that there is some association between the two. However, this study still has certain limitations. Because of the small sample size and missing sample data (BMI, drinking, smoking), only two baseline variables, age and gender, were adjusted for in the logistic regression. In subsequent studies, we need to expand the sample size in order to confirm the results of this study more robustly.

Conclusion
In summary, this is the first study of the association between STAT4 gene polymorphism and T2D risk in the Chinese Han population. Our results suggest that STAT4 gene polymorphisms (rs3821236, rs11893432, rs11889341, rs7574865, rs897200) have a potential association with the risk of T2D in the Chinese Han population. The study provides supplementary data for in-depth study of the association between the STAT4 gene and T2D risk, and may provide a preliminary theoretical and molecular basis for the prevention and treatment of T2D from a genetic perspective.