Prevalence and clinical phenotype of the triplicated α-globin genes and its ethnic and geographical distribution in Guizhou of China

α-thalassemia is relatively endemic in Guizhou province of southwestern China. To predict the clinical manifestations of α-globin gene aberration for genetic counseling, we examined the prevalence of the α-globin triplication and the genotype–phenotype correlation in this subpopulation. A cohort of 7644 subjects was selected from nine ethnicities covering four regions in Guizhou province of China. Peripheral blood was collected from each participant for routine blood testing and hemoglobin electrophoresis. PCR-DNA sequencing and Gap-PCR were used to identify the thalassemia gene mutations. Chi-square tests and one-way analysis of variance (ANOVA) were used to statistically analyze the data. We found that the frequency of α-globin triplication in Guizhou province was 0.772% (59/7644). Genotypically, the αααanti4.2/αα accounted for 0.523% (40/7644), the αααanti3.7/αα for 0.235% (18/7644), and the αααanti3.7/–SEA for 0.013% (1/7644). The αααanti4.2/αα is more prevalent than the αααanti3.7/αα in Guizhou. In addition, the frequency of the HKαα/αα (which resembles αααanti4.2/-α3.7 on Gap-PCR) was 0.235% (18/7644). Ethnically, the Tujia group presented the highest prevalence (2.47%) of α-globin triplication. Geographically, the highest frequency of the α-globin triplication was identified in Qiannan region (2.23%). Of the triplicated α-globin cases, 5 were coinherited with heterozygous β-thalassemia and presented various clinical manifestations of anemia. These data will be used to update the Chinese triplicated α-globin thalassemia database and provide insights into the pathogenesis of thalassemia. These findings will be helpful for the diagnosis of thalassemia and future genetic counseling in those regions.


Background
Thalassemia is a hereditary hemoglobin disease caused by defects in the globin genes, including deletions and mutations [1]. Based on the gene involved, thalassemia is usually classified into α-thalassemia and β-thalassemia [2,3]. While a deletion of one or both α-globin genes leads to α-thalassemia, α-globin gene triplication (ααα), which is caused by homologous recombination between the duplicated α-globin genes (Fig. 1), rarely causes detectable clinical symptoms, and the blood parameters of carriers appear normal [4][5][6]. However, when coinherited with β-globin gene mutation(s), the triplicated α-globin genes play a considerable role in the pathophysiology of thalassemia by worsening the imbalance in α-globin chain synthesis and affecting erythroid maturation and survival [7,8]. Mild to severe thalassemia, up to transfusion-dependent anemia, is often observed in the affected subjects due to the imbalance of α- and β-globin chains [8][9][10]. Patients with severe thalassemia usually rely on lifelong blood transfusion therapy, which is a heavy healthcare burden for their families and society. There are two types of triplicated α-globin genes: ααα anti3.7 and ααα anti4.2 [11,12]. The ααα anti4.2 is commonly observed in Asians, while the ααα anti3.7 is more prevalent in African, Middle Eastern, and Mediterranean populations [7,11,12,13]. In addition, an unusual rearrangement of the α-globin gene cluster, called the HKαα (Hong Kong αα) allele, contains both the -α 3.7 and ααα anti4.2 crossover junctions [14,15]. However, the HKαα allele does not actually contain three copies of the α-globin gene. Thus, HKαα is not an α-triplication allele.
Guizhou province, located in southwestern China, is one of the regions with the highest rate of α-thalassemia in Asia [16]. The population consists of several ethnic groups including many minorities such as Yao, Miao, Buyi, Dong, Tujia, Zhuang, Shui and Gelao. Although thalassemia patients with β-globin gene defects and triplicated α-globin genes have been reported worldwide [9,17,18], the frequency of triplicated α-globin genes in this population has never been investigated. Thus, we conducted an epidemiological study to elucidate the frequency and clinical features of triplicated α-globin genes in this population and region.

Subjects
Guizhou province contains four regions. Two representative counties/cities from each region were selected for study. The inclusion criterion was residence in these regions for more than 3 years at the time of recruitment, regardless of age, sex, and ethnicity. In total, 7866 participants were recruited by simple random sampling from 8 counties/cities (Congjiang, Liping, Tongren, Libo, Liupanshui, Kaili, Zunyi, and Anshun) in four regions (Qiannan, Qiandongnan, Qianbei and Qianxi) of Guizhou province in China from February 2014 to June 2016. Of these, 7644 people met the inclusion criterion, and their blood samples and health information were collected for investigation (Fig. 2).

Blood analysis
Approximately 5 ml peripheral blood was collected from each participant for routine blood testing (Sysmex hematology analyzer, K-1000, Sysmex Corporation, Kobe, Japan) and hemoglobin electrophoresis (Bio-Rad Laboratories, Hercules, CA, USA).

DNA sequencing and genotyping
Approximately 3 ml of peripheral blood was collected from each subject. DNA extraction was conducted using the Magen nucleic acid extraction kit (Magen, Guangzhou, China). Four pairs of globin gene-specific PCR primers (HBA1, HBA2, HBB-1, and HBB-2) were designed and synthesized by the Beijing Genome Institute (BGI)-Shenzhen. For identification of HKαα, next-generation sequencing plus Gap-PCR was adopted. All the globin gene-specific PCR primers are owned and patented by BGI-Shenzhen and have not been published. PCR was carried out in a volume of 25 μl with an amplification reaction system containing 1 pair of tag primers, 50-200 ng DNA, and 2 × Gold Star Taq Master Mix (Kangwei century). The PCR amplification was performed using the ABI9700 (Perkin-Elmer Applied Biosystems Inc., Foster City, CA) [19]. Gap-PCR was used to detect some α deletion genotypes, as described previously [15,19]. Information from subjects with triplicated α-globin genes combined with β-thalassemia clinical manifestations was collected for further analysis. The nomenclature and description of the α-globin gene variants identified followed the HGVS guidelines (http://www.HGVS.org/varnomen).

Fig. 1 Schematic generation of the two α-triplications (ααα anti3.7 and ααα anti4.2 ) through homologous recombination between the duplicated α-globin genes. a Generation of the ααα anti3.7 triplication and the three subtypes of rightward deletion (-α 3.7I,II,III ) due to unequal crossing over between two misaligned Z boxes of the α1- and α2-globin genes and reciprocal events; b Generation of the ααα anti4.2 triplication and deletion of -α 4.2 from recombination between the two misaligned X-homology boxes. Note: X, Y, Z, homology boxes; A, Apa I restriction site

Statistical analysis
Continuous variables are summarized by descriptive statistics, including the mean and range or standard deviation. Categorical variables are presented as number and percentage. Frequencies were compared using the Chi-square test, and means were compared using one-way analysis of variance (ANOVA). A statistically significant difference was defined as p < 0.05. Statistical analyses were performed with SPSS 17.0 (SPSS Inc., Chicago, IL, USA).
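
As a rough illustration of the two tests named above, the following sketch computes a Pearson Chi-square test for a 2 × 2 table and the F statistic of a one-way ANOVA using only the Python standard library. It is not the SPSS procedure used in the study: the function names are hypothetical, no continuity correction is applied (an assumption, since the SPSS settings are not described), and all input values are invented placeholders rather than study data.

```python
import math
from statistics import mean


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square test (no continuity correction) for the table
    [[a, b], [c, d]]. Returns (statistic, p_value) for 1 degree of freedom."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        raise ValueError("a row or column total is zero")
    stat = n * (a * d - b * c) ** 2 / denom
    # With 1 degree of freedom, P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


def one_way_anova_f(groups):
    """F statistic of a one-way ANOVA. Returns (F, df_between, df_within).
    The p-value needs the F distribution, which the standard library lacks."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    group_means = [mean(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, group_means))
    ss_within = sum((x - m) ** 2
                    for g, m in zip(groups, group_means) for x in g)
    df_between, df_within = k - 1, n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# HYPOTHETICAL counts (carriers, non-carriers) in two regions; not study data.
stat, p = chi_square_2x2(20, 880, 10, 1990)
print(f"chi-square = {stat:.2f}, p = {p:.4g}")

# HYPOTHETICAL hemoglobin values (g/L) for three genotype groups; not study data.
hgb = [[138, 142, 135, 150], [140, 137, 145, 139], [133, 136, 141, 130]]
f_stat, df_b, df_w = one_way_anova_f(hgb)
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
```

With small expected cell counts, as for the rarer genotypes, an exact test may be more appropriate than the Chi-square approximation.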

The ethnic distribution of the triplicated α-globin genes and the HKαα allele
As listed in Table 2, there are 9 ethnic groups inhabiting these regions. Except for the Han ethnic group, the other ethnic groups are minorities in China. The ethnicity of the participants was determined by questioning. In this investigation, the highest frequency of the ααα anti4.2 /αα was identified in the Tujia ethnic group (1.65%), followed by the Han, Dong, Shui, Buyi, and Miao. The frequency of the ααα anti4.2 /αα in the Tujia group was significantly higher than in the Miao (p = 0.015), Dong (p = 0.022), and Buyi (p = 0.045). For the ααα anti3.7 /αα genotype, the highest rate was observed in the Tujia group as well, followed by the Dong, Han, Buyi, and Miao. No ααα anti3.7 /αα carrier was observed in the Shui group. No triplicated α-globin genes were detected in the Zhuang, Yao, and Gelao groups.

The geographical distribution of the triplicated α-globin genes and the HKαα allele
As listed in Table 3, the highest prevalence of the α-globin gene triplication was observed in Qiannan region (2.23%), followed by Qiandongnan, Qianbei, and Qianxi. The frequency in Qiannan was significantly higher than in any other region, including Qiandongnan (p = 0.03), Qianbei (p = 0.001), and Qianxi (p = 0.0045). No significant difference was observed between the other regions. While the prevalence of the ααα anti4.2 /αα genotype was significantly higher in Qiannan region than in other regions, the ααα anti3.7 /αα was commonly found in Qiannan and Qiandongnan regions. Although the frequency of the ααα anti3.7 /αα was higher in Qiandongnan, its distribution did not differ significantly between regions (p > 0.05). In addition, the HKαα/αα was mainly distributed in Qiandongnan, but its distribution did not differ significantly between regions either (p > 0.05). The ααα anti3.7 /-SEA was quite rare in those regions, and only one case was detected, in Qiandongnan.

Genotype-phenotype associations of α-globin gene rearrangements
The frequency of thalassemia-gene carriers in Guizhou is 11.03%, with an α-thalassemia-gene frequency of 7.41% and a β-thalassemia-gene frequency of 3.23% (unpublished data). Therefore, α-thalassemia is more frequent than β-thalassemia in Guizhou province, China. While deletion of α-globin genes causes α-thalassemia, the triplicated α-globin genes alone rarely cause obvious clinical symptoms. The 54 carriers of triplicated α-globin genes without a coinherited β-globin gene mutation, including the ααα anti3.7 /-SEA case, and the 18 carriers of HKαα/αα did not present any clinical manifestations such as anemia at the time of examination. The blood parameters including the red blood cells (RBC), hemoglobin (HGB), mean corpuscular volume (MCV), mean corpuscular hemoglobin (MCH), HbA, HbA2, and HbF were measured and statistically analyzed through ANOVA. No significant difference in any parameter was identified among the three genotype groups (p > 0.05) (Tables 4, 5). All of the hematological parameters were within the normal range. However, when triplicated α-globin genes are coinherited with β-globin gene mutation(s), the affected subjects present various clinical manifestations, from no symptoms to severe anemia. In this study, we identified 5 subjects who were cocarriers of β-globin gene alterations and the α-globin gene triplication. All 5 affected subjects were of Han ethnicity. Mut-01 was a boy who suffered from severe anemia at 7 years of age and has been treated with regular blood transfusions at a local hospital since he was diagnosed with β-thalassemia. Our DNA sequencing demonstrated that he was a carrier of codons 41/42 (-TTCT) β0 (HBB: c.126_129delCTTT) and αα/ααα anti4.2 (Table 6). Mut-02 and Mut-03 did not present with severe symptoms but exhibited visible paleness. No abnormalities were observed in their hearts, lungs, and nervous systems.
The blood tests indicated that all three boys suffered from moderate hypochromic microcytic anemia without detectable iron deficiency or other related abnormalities. Mut-04 had the same β-globin mutation as Mut-01, and his disease manifestation was similar to that of Mut-01. Mut-04's father was also a β-thalassemia sufferer (HGB, 83 g/L; MCV, 62.2 fL; MCH, 19.6 pg). DNA sequencing indicated that they were all carriers of the β-globin gene alterations and the α-globin gene triplication (Table 6). Mut-05 was an adult woman. She had no detectable symptoms at the time of examination, although she carried a codon 17 (AAG > TAG) β0 mutation (HBB:c.52A > T) and the αα/ααα anti4.2 genotype.

Table 4 The blood parameters of the α-globin gene triplication and HKαα groups. RBC, red blood cells (normal range: 4-5 × 10^12/L); HGB, hemoglobin (normal range: male, 120-170 g/L; female, 110-160 g/L); MCV, mean corpuscular volume (normal range: 80-100 fL); MCH, mean corpuscular hemoglobin (normal range: 27-32 pg)

Discussion
To date, at least two genotypes of α-globin triplication have been described: ααα anti4.2 /αα and ααα anti3.7 /αα. Although the HKαα allele tests positive by Gap-PCR for both ααα anti4.2 and -α 3.7 , it is not considered an α-globin triplication because it contains no real extra copy of the α-globin gene. Observations have indicated that α-globin triplication alone does not cause detectable clinical manifestations [20]. However, when an α-globin triplication is coinherited with β-globin gene mutation(s), the combined defects usually lead to a variable clinical phenotype, ranging from asymptomatic presentation to significant anemia, ineffective erythropoiesis, and mild to severe clinical symptoms. Thus, it is essential to determine the prevalence of the triplicated α-globin genes: the extra α-globin gene copy increases the imbalance between the α- and β-globin chains and therefore usually exacerbates β-thalassemia when it is coinherited with β-globin defects. In this study, we randomly selected a cohort of 7644 subjects in four regions of Guizhou province, China. These participants' ethnicities included Han and eight ethnic minorities.
Our study demonstrated that the population prevalence of the α-globin triplication in Guizhou province was 0.772%. This figure was slightly lower than that identified in Guangdong (1.2%), a province in southeastern China [17], and in the Dutch population (approximately 1.2%) [7]. In addition, the ratio of ααα anti3.7 to ααα anti4.2 differed between the Guizhou and Guangdong populations. In our study, the ratio of ααα anti3.7 /ααα anti4.2 was 0.47 (0.248% ααα anti3.7 /0.523% ααα anti4.2 ) in Guizhou province, while it was 3.0 (0.9% ααα anti3.7 /0.3% ααα anti4.2 ) in Guangdong province. This finding suggests that the ααα anti4.2 triplication is the more common type in Guizhou, while the ααα anti3.7 is the more common type in Guangdong. Ethnically, the Tujia group presented the highest prevalence (2.47%) of the α-globin triplication. In particular, the prevalence of ααα anti4.2 in Tujia (1.65%) was higher than in any other ethnic group. This is the first report that the Tujia have a higher frequency of α-globin triplication. Whether this higher rate of α-globin triplication is caused by a founder effect requires further investigation. Although previous studies have reported that the frequencies of α-globin defects in the Zhuang and Yao ethnic groups were significantly higher than that in the Han ethnic group in Guangxi, another province in southwestern China [21], we did not observe any carrier of α-globin triplication in the Zhuang and Yao minorities, probably because of the small size of these ethnic groups or because they were not affected by the α-globin triplication in those regions. The high frequency of α-globin triplication identified in those two ethnic groups in Guangxi province might be caused by a founder effect initiated by genetic drift and particular lifestyles, inhabitation density, and endogamous marriage.
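
The prevalence figures and the ααα anti3.7 /ααα anti4.2 ratio quoted above follow directly from the genotype counts reported in this study. The short sketch below reproduces that arithmetic; the variable and function names are illustrative only, and the ααα anti3.7 total is taken to include the single ααα anti3.7 /-SEA case, which is what the reported 0.248% implies.

```python
TOTAL = 7644  # subjects included in the study

# Carrier counts by genotype, as reported in this study.
counts = {
    "anti4.2/aa": 40,
    "anti3.7/aa": 18,
    "anti3.7/--SEA": 1,
}


def percent(k, n):
    """Prevalence of k carriers among n subjects, in percent."""
    return 100.0 * k / n


overall = percent(sum(counts.values()), TOTAL)
anti42 = percent(counts["anti4.2/aa"], TOTAL)
# Both genotypes that carry the anti3.7 triplication are counted together.
anti37 = percent(counts["anti3.7/aa"] + counts["anti3.7/--SEA"], TOTAL)
ratio = anti37 / anti42

print(f"overall triplication prevalence: {overall:.3f}%")
print(f"anti4.2: {anti42:.3f}%  anti3.7: {anti37:.3f}%")
print(f"anti3.7 / anti4.2 ratio: {ratio:.3f}")
```

The computed values agree with the reported ones up to rounding in the last digit.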
Geographically, the highest frequency of the α-globin triplication was identified in Qiannan region (2.23%). The frequency difference of the α-globin triplication between Qiannan and any other region was statistically significant. Moreover, the frequency of ααα anti4.2 in Qiannan was also higher than in other regions. The other type of α-globin triplication, ααα anti3.7 /-SEA , and the non-α-triplication HKαα/αα had no significant geographical differences. In addition, we did not identify any anti-HKαα allele, although we found that the frequency of the HKαα allele was 0.235%, considerably higher than previously reported. To exclude the presence of HKαα when a sample is positive for both the -α 3.7 and ααα anti4.2 junctions, next-generation sequencing and Gap-PCR need to be performed together. Interestingly, while the HKαα is hardly observed in other regions, it was relatively common in Qiandongnan, and its frequency is comparable to that of the ααα anti4.2 triplication there (Table 3). The HKαα was first described by Wang [14]; then, Shang et al. first reported that the population prevalence of HKαα was 0.07% and that of anti-HKαα was 0.02% in Guangxi province of China [22]. Afterwards, the population prevalence of HKαα was determined to be 0.07% and that of anti-HKαα to be 0.003% in Guangdong province [23]. The differences between our findings and the previously reported data could be due to geographical pattern differences and population diversity [24].
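
The diagnostic ambiguity described above can be written as a small decision rule. This is a deliberately simplified sketch, not the laboratory workflow used in the study: the function name and return strings are hypothetical, only the two junctions discussed here are considered, and zygosity and other alleles (such as ααα anti3.7 , -α 4.2 , or --SEA) are ignored.

```python
def interpret_gap_pcr(minus_a37_junction: bool, anti42_junction: bool) -> str:
    """Interpret Gap-PCR positivity for the -α3.7 and ααα anti4.2 junctions.

    Hypothetical helper for illustration only.
    """
    if minus_a37_junction and anti42_junction:
        # An HKαα allele carries both junctions, so Gap-PCR alone cannot
        # separate HKαα/αα from a ααα anti4.2/-α3.7 compound genotype.
        return "ambiguous: HKαα or ααα anti4.2/-α3.7; confirm by sequencing"
    if anti42_junction:
        return "ααα anti4.2 triplication"
    if minus_a37_junction:
        return "-α3.7 deletion"
    return "neither junction detected"


for result in [(True, True), (False, True), (True, False), (False, False)]:
    print(result, "->", interpret_gap_pcr(*result))
```

Only the first branch requires the additional sequencing step; the other outcomes can be read from Gap-PCR under the simplifying assumptions stated above.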
In our study, the hematological parameters and hemoglobin electrophoresis data of the HKαα, ααα anti4.2 , and ααα anti3.7 carriers were all within the normal range, which is consistent with previous reports [14,22]. Therefore, carriers of the HKαα, ααα anti4.2 , and ααα anti3.7 alone are not expected to present clinical manifestations such as anemia. Of note, although the HKαα carriers presented a normal range of hematological parameters, their RBC, hemoglobin, MCH, and MCV were all slightly lower than those of the ααα anti4.2 and ααα anti3.7 carriers, implying that this particular cluster structure could reduce α-globin gene expression.
As mentioned above, the triplicated α-globin genes alone rarely lead to detectable clinical phenotypes. In this study, of the 59 cases of α-globin gene triplication, 5 were coinherited with β-globin gene mutation(s), while the other 54 subjects did not present any clinical symptoms. Although many reports have stated that α-globin triplication can exacerbate the symptoms of β-thalassemia, the issue is still controversial because the expected worsening of anemia has not occurred in all cases [7,18]. In our subjects, the first four carriers presented with mild to severe β-thalassemia. In the case of Mut-05, the α-globin triplication combined with the β-globin gene mutation (CD17 (AAG > TAG)) surprisingly failed to cause thalassemia; the reason merits further investigation. In addition, the 18 cases of HKαα/αα and the 1 case of ααα anti3.7 /-SEA did not present any clinical manifestations such as anemia at the time of examination, which is consistent with previous reports [22,23].
Currently, Gap-PCR and PCR combined with reverse dot blot (RDB) methods are commonly used to detect α-globin gene deletions and β-globin gene defects, but they usually miss the triplicated α-globin genes. In this study, whole genome NGS combined with Gap-PCR was adopted to screen for all types of α-globin and β-globin gene alterations, including α-globin gene deletion, triplication, and splicing mutations, which would be expected to increase the detection sensitivity and improve the diagnosis of β-thalassemia.

Conclusions
This epidemiological study has identified the current α-triplication genotypes and their prevalence and distribution in Guizhou province, which will be used to update the triplicated α-globin thalassemia database, provide insights into the pathogenesis of thalassemia and shed light on the diagnosis of thalassemia in southwestern China.