Mutation screen in the GWAS derived obesity gene SH2B1 including functional analyses of detected variants

Background The SH2B1 gene (Src-homology 2B adaptor protein 1 gene) is a solid candidate gene for obesity. Large-scale GWAS identified markers in the vicinity of the gene; animal models suggest a potential relevance for human body weight regulation. Methods We performed a mutation screen for variants in the SH2B1 coding sequence in 95 extremely obese children and adolescents. Detected variants were genotyped in independent childhood and adult study groups (up to 11,406 obese or overweight individuals and 4,568 controls). Functional implications of the detected variants for STAT3-mediated leptin signalling were analyzed in vitro. Results We identified two new rare mutations and five known SNPs (rs147094247, rs7498665, rs60604881, rs62037368 and rs62037369) in SH2B1. Mutation g.9483C/T leads to a non-synonymous, non-conservative exchange in the beta (βThr656Ile) and gamma (γPro674Ser) splice variants of SH2B1. It was additionally detected in two of 11,206 (extremely) obese or overweight children, adolescents and adults, but not in 4,506 population-based normal-weight or lean controls. The non-coding mutation g.10182C/A at the 3' end of SH2B1 was only detected in three obese individuals. For the non-synonymous SNP rs7498665 (Thr484Ala) we observed nominal over-transmission of the previously described risk allele in 705 obesity trios (nominal p = 0.009, OR = 1.23) and an increased frequency of the same allele in 359 cases compared to 429 controls (nominal p = 0.042, OR = 1.23). The obesity risk alleles at Thr484Ala and βThr656Ile/γPro674Ser had no effect on STAT3-mediated leptin receptor signalling in splice variants β and γ. Conclusion The rare coding mutation βThr656Ile/γPro674Ser (g.9483C/T) in SH2B1 was exclusively detected in overweight or obese individuals. Functional analyses did not reveal impairments in leptin signalling for the mutated SH2B1.


Background
A large-scale genome-wide association study (GWAS) meta-analysis including a total of 249,796 individuals of European ancestry confirmed 14 known obesity susceptibility loci and identified 18 new genetic loci associated with body mass index (BMI). One of the re-identified single nucleotide polymorphisms (SNPs) is located near the Src-homology 2B adaptor protein 1 (SH2B1) gene (rs7359397) [1]. Association with obesity was also shown for a coding SNP in SH2B1 (rs7498665: g.8164A/G, Thr484Ala; [2,3]). Linkage disequilibrium between rs7359397 and the coding SNP rs7498665 is high (r² = 0.965, D' = 1; HapMap, http://hapmap.ncbi.nlm.nih.gov/). Both SNPs are located within a large linkage disequilibrium (LD) block. A region of more than 500 kb upstream and 150 kb downstream of SH2B1 is flanked by recombination peaks of 37 cM/Mb and 36 cM/Mb, respectively (SNP Annotation and Proxy Search SNAP, see Additional file 1: Figure S1). The association of increased BMI with SH2B1 SNP (rs7359397 and rs7498665) alleles has been robustly replicated, e.g. (i) in 4,923 Swedish adults [4], (ii) in 12,462 individuals from the German MONIKA/KORA study [5], and (iii) in 1,045 obese adults and 317 healthy lean individuals from Belgium [6].
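The two LD measures quoted above (r² and D') can differ even when two SNPs are almost perfect proxies. The following minimal sketch shows how both are derived from haplotype and allele frequencies. The function name and the example frequencies are hypothetical illustrations, not HapMap values for rs7359397 and rs7498665.

```python
def ld_measures(p_ab, p_a, p_b):
    """Return (D, D', r^2) for two biallelic loci.

    p_ab: frequency of the haplotype carrying allele A (locus 1) and B (locus 2)
    p_a, p_b: frequencies of allele A and allele B (both strictly between 0 and 1)
    """
    if not (0 < p_a < 1 and 0 < p_b < 1):
        raise ValueError("both loci must be polymorphic")
    d = p_ab - p_a * p_b  # raw disequilibrium coefficient
    # D' rescales D by its maximum possible value given the allele frequencies.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0
    # r^2 is the squared correlation between the two loci.
    r2 = d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))
    return d, d_prime, r2


# Hypothetical frequencies: allele A never occurs without allele B,
# but B is slightly more frequent than A.
d, d_prime, r2 = ld_measures(p_ab=0.40, p_a=0.40, p_b=0.41)
print(f"D = {d:.3f}, D' = {d_prime:.3f}, r^2 = {r2:.3f}")
```

In this made-up example D' equals 1 because only three of the four possible haplotypes occur, while r² stays slightly below 1 because the allele frequencies differ. This is the same pattern as the reported r² = 0.965 with D' = 1.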
A deletion of ~200 kb covering SH2B1 was recently shown to be associated with severe early-onset obesity [7], whereas the corresponding reciprocal duplication was associated with leanness [8]. Additionally, a larger interspersed deletion extending through a 593 kb region on chromosome 16p11.2-p12.2 covering SH2B1 has been associated with developmental delay, feeding difficulties, dysmorphic facial features, and obesity [9,10]. Bochukova et al. [7] screened the coding region of SH2B1 for causal mutations by re-sequencing 500 severely obese children with early onset from the Genetics of Obesity Study (GOOS). The investigators detected SNP rs7498665 (Thr484Ala [7]); rare variants were not identified. Evidence in humans and from animal models suggests that SH2B1 is a likely obesity gene. In humans, the SH2B1 protein increases serum leptin levels and whole body fat mass in females [11]. The influence of SH2B1 variants on the distribution of body fat and the amount of visceral adipose tissue is still under discussion [4,12,13]. With regard to animal models, Sh2b1 null mice show a phenotype of obesity, hyperlipidemia, leptin resistance, hyperphagia, hyperglycaemia, insulin resistance and glucose intolerance [14]. This phenotype was consistent when the knockout was regionally limited to hypothalamic neurons [15] and functionally limited to induced mutations in the Src-homology 2 (SH2) and pleckstrin homology (PH) domains [16]. Selective rescue in neurons eliminated both obesity and the insulin resistance phenotype [16]. Additional evidence for an involvement of Sh2b1 in the regulation of energy homeostasis is derived from expression analyses in mice and rats. In diet-induced obese (DIO) rats fed a high fat diet, the expression of Sh2b1 in the hypothalamus was decreased [17], while in mice on a high fat and high carbohydrate diet, Sh2b1 expression increased in the same tissue [18].
We initially detected transmission disequilibrium for a SNP in the vicinity of SH2B1 (rs2008514) in 705 obesity trios (TDT p = 0.0094). The SNP is a proxy of rs7359397, which was identified in large-scale GWAS [1]. We screened the coding region of SH2B1 for (infrequent) mutations in 95 extremely obese children and adolescents. For the GWAS-derived type 2 diabetes mellitus gene melatonin receptor 1B (MTNR1B) it was recently shown that a number of rare to infrequent mutations can be detected in the respective patients [19]. In addition, based on the evidence for involvement of SH2B1 in energy homeostasis, rare coding variants in the gene could potentially result in monogenic obesity. Subsequently, we assessed association of the identified variants with obesity in independent study groups. Finally, we analyzed the impact of the detected variants on leptin receptor signalling in vitro.

Study groups
An overview of the ten study groups is given in Table 1; details have been described previously [20,21]. The selection of individuals for the mutation screen was based on genotypes at SNP rs2008514 (proxy of rs7359397) in the vicinity of SH2B1. In total, we analyzed 95 individuals, 90 of whom were likely enriched for the presence of mutations in SH2B1: these extremely obese patients (offspring) from the family-based GWAS sample were homozygous for the risk allele T of rs2008514 and had at least one heterozygous parent, thus substantially contributing to the observed overtransmission of the rs2008514 T-allele. The other five individuals are heterozygous carriers of a deletion at chr16p11.2 which does not harbor SH2B1 [10]. Association to obesity of detected variants was analyzed in three steps (Figure 1):
i. Association testing: All detected variants were genotyped in a sample of 179 extremely obese children or adolescents (age- and sex-adjusted BMI percentile ≥ 99th; [22]) and 185 lean adult controls (age- and sex-adjusted BMI percentile < 15th; [22]). Basic phenotypical characteristics are given in Table 1. Individuals were independent of the mutation screening sample and were part of our case-control GWAS sample (Genome-Wide Human SNP Array 6.0, see [20,21]).
ii. Further exploration of non-synonymous variants: The three non-synonymous variants (rs147094247: g.2749C/A, Thr175Asp; rs7498665: g.8164A/G, Thr484Ala; g.9483C/T, βThr656Ile/γPro674Ser) were additionally genotyped in the remaining individuals of the family-based sample and in the cases and controls of our GWAS sample who were not screened for mutations (see Mutation screen section), in 988 obese adults [23], and in 1,185 independent obese children or adolescents of the 'Datteln Paediatric Obese Cohort' (DAPOC [24]; Table 1).
iii. Additional exploration of rare non-synonymous variants: The two coding mutations without previously shown association to obesity (βThr656Ile/γPro674Ser and rs147094247, Thr175Asp) were additionally genotyped in three independent study groups comprising obese children and adolescents ('Berlin Paediatric Obese Cohort', BEPOC; n = 1,046 [25]; Ulm Children's Study 2 and 3; n = 271 and 129, respectively [26]; Table 1) and in two independent population-based cohorts, the 'Cooperative Health Research in the Region of Augsburg' (KORA; n = 10,077 [27]) cohort of adults and the 'Ulm Children's Study 1' (n = 782 [28]) study group of children and adolescents (Table 1).
In all samples, body mass index (BMI in kg/m²) was assessed, and age- and sex-specific percentile criteria with regard to the German population at the time of sample recruitment (S I, S II, S III, S IV, S VI [22]) were used to define overweight or obese cases. Samples were divided into cases (adults: BMI ≥ 25 kg/m²; children and adolescents: BMI ≥ 90th percentile (www.mybmi.de)) and controls (adults: BMI < 25 kg/m²; children and adolescents: BMI < 90th percentile (www.mybmi.de)). Written informed consent was given by all participants and, in the case of minors, by their parents. These studies were approved by the Ethics Committees of the respective Universities and Institutions and were performed in accordance with the Declaration of Helsinki.

Mutation screen
The coding region of SH2B1, located at chr16: 28,858,010-28,885,533 (hg18/NCBI 36), was screened for mutations by denaturing high-performance liquid chromatography (dHPLC, WAVE, Transgenomics) as described previously [29]. The accuracy of dHPLC is similar to or even higher than that of sequencing [30]. To enhance the sensitivity of detection of homozygous mutation carriers, DNA of an individual with wild-type genotype (re-sequenced) was added to each sample prior to PCR amplification. Detailed information regarding the temperatures and primers used for PCR and dHPLC analysis can be obtained from the authors. All PCR amplicons with dHPLC patterns deviant from the wild-type pattern were re-sequenced as described previously [31]. At least two experienced individuals independently assigned the genotypes; discrepancies were resolved either by reaching consensus or by re-genotyping.

Genotyping
The identified variants in SH2B1 were genotyped in larger study groups using either restriction fragment length polymorphism (RFLP) or TaqMan Assays (detailed information can be obtained from the authors). For rs147094247 (Thr175Asp) and the new coding mutation in fragment 9 (g.9483C/T βThr656Ile/γPro674Ser), custom TaqMan assays were designed (SH2B1_2I_MUT, Assay ID: AHCS0BY, and Frag9_mut1, Assay ID: AHMSHDX, respectively, both Applied Biosystems). At least two experienced individuals independently assigned the genotypes; discrepancies were solved either by reaching consensus or by re-genotyping.

Statistics
Allele and genotype distributions of all detected variants did not deviate from Hardy-Weinberg equilibrium. To analyze the obesity association of all variants, Fisher's exact test (allelic association) was calculated with PLINK [32]. Population-based samples were divided into cases (BMI ≥ 90th percentile) and controls (BMI < 90th percentile). For rs7498665, an asymptotic, two-sided p-value for the transmission disequilibrium test (TDT) was additionally calculated with PLINK. If not stated otherwise, all p-values are asymptotic, two-sided and not corrected for multiple testing.
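The two tests named above can be illustrated with a short, self-contained sketch. It is not the PLINK implementation: the function names are hypothetical, the allele and transmission counts are invented for illustration, and the two-sided Fisher p-value is computed by summing all tables that are no more probable than the observed one.

```python
from math import comb, erfc, sqrt


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 allele table
    [[a, b], [c, d]] (rows: cases/controls, columns: allele 1/allele 2)."""
    row1, row2, col1, n = a + b, c + d, a + c, a + b + c + d

    def prob(x):
        # Hypergeometric probability of x copies of allele 1 among cases.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum all tables at most as probable as the observed one
    # (small tolerance guards against floating-point ties).
    total = sum(p for p in (prob(x) for x in range(lo, hi + 1))
                if p <= p_obs * (1 + 1e-7))
    return min(1.0, total)


def tdt(transmitted, untransmitted):
    """Transmission disequilibrium test from heterozygous parents.
    Returns (chi-square with 1 df, asymptotic two-sided p-value)."""
    chi2 = (transmitted - untransmitted) ** 2 / (transmitted + untransmitted)
    p = erfc(sqrt(chi2 / 2))  # survival function of chi-square, 1 df
    return chi2, p


# Hypothetical allele counts (not data from this study).
print(fisher_exact_two_sided(3, 1, 1, 3))
# Hypothetical transmission counts of a risk allele in trios.
print(tdt(transmitted=60, untransmitted=40))
```

The TDT uses only transmissions from heterozygous parents, which is why it is robust to population stratification in the trio design.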

Functional in silico analyses
To determine potential alterations in gene expression, all mutations were analyzed for loss or gain of cryptic splice sites, transcription factor binding sites and O-glycosylation sites. The possible impact of amino acid exchanges on the structure and function of SH2B1 was predicted with PolyPhen-2 [33], SNAP [34], PMUT [35], and MutationTaster [36]. A detailed description of the tools used can be found in Additional file 1: Supplementary materials. Conservation was analyzed by aligning sequences of 21 species in total (21 α, eight β and six γ sequences; comprehensive list in Additional file 1: Supplementary materials).
Functional in vitro analyses: STAT3 mediated leptin receptor signalling
The effect of SH2B1 harboring the infrequent alleles of Thr484Ala and βThr656Ile/γPro674Ser on leptin receptor activity was determined with a quantitative reporter gene assay (adapted from [37]). HEK293 cells were transiently transfected with the murine long form of the leptin receptor (Lepr-b) in pcDNA3.1, a signal transducer and activator of transcription 3 (STAT3) responsive Photinus luciferase construct (pAH32), a constitutive Renilla luciferase expression vector for data normalization (phRG-b, Promega) and human SH2B1 splice variants beta and gamma with and without the mutations in pCMV-XL5 expression vectors (Lepr-b and pAH32 were kindly provided by Rosenblum et al. [37]). As a control, empty pcDNA3 expression vector was transfected instead of SH2B1. After stimulation with mouse leptin (concentrations 0, 0.5, 1, 5, 10, 50, 100, 500 ng/ml), luciferase expression driven by the STAT3 reporter construct was measured with the Dual-Luciferase Reporter Assay system according to the manufacturer's instructions (Promega). Dose-response curves, EC50 and Emax values were calculated with GraphPad Prism.
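To make the terms EC50 and Emax concrete, the following sketch normalizes a reporter signal and fits a simple dose-response model. It is an illustration under stated assumptions, not the model used by GraphPad Prism in this study: the Hill slope is fixed at 1, EC50 is chosen from a user-supplied grid, and all function names and the simulated readings are hypothetical.

```python
def normalized_signal(firefly, renilla):
    """Normalize the STAT3-responsive luciferase reading by the
    constitutive Renilla reading of the same well."""
    return firefly / renilla


def occupancy(dose, ec50):
    """Fraction of the maximal effect at a given dose (Hill slope 1)."""
    return dose / (dose + ec50)


def fit_dose_response(doses, responses, ec50_grid):
    """Least-squares fit of response = bottom + (emax - bottom) * occupancy.

    For each candidate EC50 the model is linear in bottom and span, so both
    are obtained in closed form; the EC50 with the smallest residual sum of
    squares wins. Returns (ec50, bottom, emax).
    """
    n = len(doses)
    mean_y = sum(responses) / n
    best = None
    for ec50 in ec50_grid:
        f = [occupancy(d, ec50) for d in doses]
        mean_f = sum(f) / n
        sff = sum((x - mean_f) ** 2 for x in f)
        sfy = sum((x - mean_f) * (y - mean_y) for x, y in zip(f, responses))
        span = sfy / sff
        bottom = mean_y - span * mean_f
        sse = sum((bottom + span * x - y) ** 2 for x, y in zip(f, responses))
        if best is None or sse < best[0]:
            best = (sse, ec50, bottom, bottom + span)
    return best[1], best[2], best[3]


# Leptin concentrations from the assay description (ng/ml).
doses = [0, 0.5, 1, 5, 10, 50, 100, 500]
# Simulated, noise-free readings (hypothetical): bottom 1, Emax 4, EC50 5.
readings = [1 + 3 * occupancy(d, 5.0) for d in doses]
print(fit_dose_response(doses, readings, ec50_grid=[0.5, 1, 2, 5, 10, 20, 50]))
```

With real replicate data a finer, log-spaced EC50 grid or a proper nonlinear optimizer would be preferable; the grid search is used here only to keep the example dependency-free.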

Results and discussion
We performed a mutation screen of the coding region of SH2B1 in 95 extremely obese children and adolescents. We identified two unknown mutations and five known SNPs in SH2B1 (Table 2). All detected variants were followed up in a small, independent case-control sample; only non-synonymous variants were additionally genotyped in further independent study groups.

New infrequent variant βThr656Ile/γPro674Ser
A new mutation at position g.9483 (C/T) of SH2B1 results in a non-synonymous, non-conservative exchange in two of the three human splice variants (β and γ) of SH2B1. Due to a shifted reading frame between the two splice variants, the mutation results in two different non-synonymous, non-conservative exchanges (βThr656Ile or γPro674Ser) in the β or γ splice variants, respectively. The βThr656Ile/γPro674Ser mutation was not detected in an independent sample of 179 extremely obese cases and 185 lean controls. As the mutation resulted in a non-synonymous amino acid exchange which was predicted to change protein structure (Table 3), we additionally genotyped a total of 11,029 (extremely) obese or overweight children, adolescents and adults and 4,321 controls (for children and adolescents BMI < 90th percentile, for adults BMI < 25 kg/m²) for this mutation. We detected two additional obese cases with this mutation and no mutation carrier among the controls. The extremely low frequency of the mutation limits the determination of an association to overweight and obesity (p = 1; Table 2). We calculated that the control group would need to include more than 545,757 individuals to reveal a p-value below 0.05 with statistical power above 80% if the observed trend (mutation carriers only among the overweight or obese individuals) remained stable. Both risk alleles (T-allele at g.9483C/T and G-allele at rs7498665) are potentially located on the same haplotype (as determined in one index patient and his mother who transmitted the haplotype; for the other carriers, full genotype information of both parents was not available). A founder effect of this mutation is likely. All three detected βThr656Ile/γPro674Ser mutation carriers are female.
The initially identified mutation carrier (a) of the screening sample (height 163 cm, weight 86.2 kg, BMI 32.44 kg/m², age 12.7 years) as well as one mutation carrier (b) from the follow-up samples (height 142 cm, weight 53.2 kg, BMI 26.38 kg/m², age 9.9 years) had a BMI > 99th percentile. In both cases, the overweight or obese mother (BMI 25.76 kg/m² and 32.61 kg/m², respectively) transmitted the mutation to the extremely obese child. The third mutation carrier (c) had a BMI > 90th percentile (height 130 cm, weight 31.5 kg, BMI 18.64 kg/m², age 7.2 years). For this mutation carrier, genotypic information about the parents was not available (mother BMI 19.81 kg/m², father BMI 26.7 kg/m²). Insulin levels were only available for one mutation carrier (b), whose level was in the normal range (9.4 mU/l). Additional family members were not available.
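The BMI values reported for the three mutation carriers follow directly from the stated heights and weights, as this small check shows. The function name is a hypothetical helper; the percentile classification itself requires age- and sex-specific reference tables and is not reproduced here.

```python
def bmi(weight_kg, height_cm):
    """Body mass index in kg/m^2 from weight in kg and height in cm."""
    height_m = height_cm / 100.0
    return weight_kg / height_m ** 2


# (weight in kg, height in cm) of mutation carriers a, b and c as reported.
carriers = {"a": (86.2, 163), "b": (53.2, 142), "c": (31.5, 130)}
for label, (weight, height) in carriers.items():
    print(label, round(bmi(weight, height), 2))
# prints: a 32.44, b 26.38, c 18.64 (one carrier per line)
```

Carrier (c) illustrates why percentiles rather than absolute BMI are used in children: a BMI of 18.64 kg/m² is unremarkable in an adult but lies above the 90th percentile at 7.2 years of age.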
The amino acid exchanges in the β and γ splice variants (βThr656Ile/γPro674Ser) are located outside the domain structure (self-dimerization, pleckstrin homology and SH2 domains; [34]), which is relevant for the function of SH2B1. We analyzed the impact of the rare allele of βThr656Ile/γPro674Ser on leptin signalling in vitro via the STAT3 pathway. In both splice variants, β656Ile and γ674Ser showed unaltered leptin signalling (Figure 2, Additional file 1: Table S2). While the γPro674Ser exchange in the γ splice variant is predicted to be neutral by in silico programs, the βThr656Ile exchange in the β splice variant was predicted to be "not neutral" (SNAP), "pathological" (PMUT) or "disease causing" (MutationTaster) by three of four programs (Additional file 1: Table S1). The exchange in the β splice variant would also destroy a predicted O-glycosylation site of the SH2B1 protein. Amino acid conservation was strong at both positions (86% for βThr656Ile and 100% for γPro674Ser, Additional file 1: Table S1).
Previous in vitro data showed functional differences between the β and γ SH2B1 variants: (a) The β splice variant is mainly expressed in the hypothalamus [16], a brain region known to be implicated in weight regulation. A Sh2b1β rescue was sufficient to prevent the Sh2b1 knockout phenotype in mice [19]. Leptin signalling is mediated by the interaction of SH2B1β and JAK2 [17]. The β splice variant of SH2B1 recruits insulin receptor substrates 1 and 2 (IRS1 and IRS2) to the LEPRb/JAK2 complex [38]. SH2B1β enhances JAK2 activity and promotes the activation of several downstream networks like STAT3 and phosphatidylinositol (PI) 3-kinase [18,39]. Two in silico analyses predicted an altered folding or function of SH2B1 βThr656Ile (Additional file 1: Table S1). (b) The γ splice variant of SH2B1 is peripherally expressed. It interacts with Tyr1158 in the activation loop of the insulin receptor and prevents dephosphorylation of IRS1 and IRS2 [40]. This interaction enhances insulin signalling and insulin receptor autophosphorylation, leading in turn to activation of downstream pathways [41]. While the three tyrosine motifs in the N-terminal part of SH2B1, which regulate interaction with the insulin receptor [42], are not directly affected by this exchange, it is possible that altered protein folding due to a non-conservative amino acid exchange at a highly conserved position prevents their phosphorylation.

Coding GWAS derived SNP rs7498665
We confirmed that the described risk allele G at SNP rs7498665 [2,3] is associated with obesity in our 705 obesity trios (p = 0.009) and in a total of 3,139 independent cases and 434 controls (p = 0.007, odds ratio (OR) = 1.22, 95% confidence interval (CI) 1.06-1.42; Table 1). This coding SNP results in the non-synonymous, non-conservative amino acid exchange Thr484Ala at a splice variant independent position with low conservation (5%, Additional file 1: Table S1). As the association with obesity was previously described for this SNP [2-6], we did not analyze further study groups. In vitro analyses revealed unaltered leptin signalling via STAT3 in both splice variants for the obesity risk allele at Thr484Ala (Figure 2). Both Emax and EC50 were non-significantly reduced (Figure 2, Additional file 1: Table S2).
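An allelic odds ratio with its 95% confidence interval, as reported above, can be derived from a 2x2 table of allele counts. The sketch below uses the Woolf (log-normal) approximation, which is an assumption here because the interval method is not stated in the text. The function name and the allele counts are hypothetical and do not reproduce the study data.

```python
from math import exp, log, sqrt


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio and Woolf confidence interval for the allele table
    [[a, b], [c, d]]: a/b = risk/other allele counts in cases,
    c/d = risk/other allele counts in controls. All counts must be > 0."""
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under the Woolf approximation.
    se_log_or = sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = exp(log(odds_ratio) - z * se_log_or)
    upper = exp(log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical allele counts for illustration only.
or_, lower, upper = odds_ratio_ci(a=320, b=400, c=340, d=520)
print(f"OR = {or_:.2f}, 95% CI {lower:.2f}-{upper:.2f}")
```

A confidence interval that excludes 1 corresponds to a nominally significant association at the 5% level, which matches the reported interval of 1.06-1.42.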
Since the obesity risk allele at the GWAS-derived polymorphism rs7498665 increases BMI by only approximately 0.15 BMI units (kg/m²), as calculated in a population of 125,931 European individuals [1], we expected only subtle functional alterations associated with the minor allele of this variant.

Other genetic variants in SH2B1
The second newly detected mutation is located in the 3' UTR at base pair position g.10182 (C/A). This non-coding variant was detected twice within the screening sample, and once in an obese case in the association testing step (Figure 1). The variant showed no association to obesity in a small case-control comparison (p = 0.49; Table 2); in silico analyses predicted a possible change in splice sites for this variant (Table 3).
Results for the four other identified SNPs were as follows: The third coding SNP rs147094247 leads to a non-synonymous, conservative exchange (Thr175Asp) at a conserved position (71%, Additional file 1: Table S1). No association to obesity (p = 0.199, odds ratio (OR) = 4.4, 95% confidence interval (CI) 0.57-34.13; Table 1) was found for this SNP in a sample of 11,268 obese and overweight cases and 4,512 lean or normal weight controls (mostly population based). In silico analyses predict a neutral outcome for the altered amino acid (Additional file 1: Table S1).

Leptin signalling
The assay that measured STAT3 mediated leptin response successfully showed increased leptin response after co-transfection with wild type SH2B1 splice variants β and γ. This indicates that HEK293 cells and the STAT3 assay allow functional characterization of SH2B1. While in mice only the alpha splice variant was tested for leptin signalling [43], we observed an effect on leptin signalling for both other splice variants (β and γ) in our human cell system (Figure 2, Additional file 1: Table S2). The analysis of the impact of both SH2B1 variants on leptin receptor activity showed no significant reduction of STAT3 mediated signalling by the risk alleles at rs7498665 and βThr656Ile/γPro674Ser. The non-significant decrease in EC50 and Emax for both tested variants in splice variants β and γ could indicate both gain of function and reduced function; the biological impact of both remains to be resolved. If a minor functional effect were indeed present, a much larger number of replicates would be necessary to establish a significant effect (e.g. about 2 × 270 replicates, based on the Emax point estimates of SH2B1γ vs. SH2B1γP674S and their variances, for a Satterthwaite/Welch t-test with 80% power at a two-sided α of 5%). Our results could, of course, indicate that the two variants are not functionally relevant. However, for a polygenic variant a large functional effect is rather unlikely. For example, the melanocortin 4 receptor gene (MC4R), a well known obesity gene, harbors two polymorphisms (Val103Ile and Ile251Leu) that are negatively associated with obesity [44,45]. Carriers of the minor alleles have a BMI approx. 0.5 BMI units lower than wild type carriers [44,45]. Initial in vitro assays did not show functional implications for the minor alleles of these SNPs (e.g. [44]), but when the number of different assays was increased, in vitro tests showed a potential small gain of function for both minor alleles [46], which could explain the weight lowering effect of the variants. Hence we speculate that the functional effect of the analyzed SH2B1 variants might become detectable when a battery of different functional tests is applied. Currently we have first hints that both variants are compatible with a slightly reduced function. In addition, with STAT3 mediated leptin signalling, we only tested one of the many potential interaction partners of SH2B1 in the regulation of energy homeostasis. A potential additive effect of small functional changes in leptinergic and insulinergic signalling could result in a stronger impact on body weight maintenance.
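The replicate estimate mentioned above rests on a standard sample-size calculation for comparing two means with unequal variances. The sketch below uses the normal approximation to the Welch t-test, which slightly underestimates the exact t-based number. The function name and the example means and standard deviations are hypothetical, since the Emax point estimates and variances are given only in the Additional file.

```python
from math import ceil
from statistics import NormalDist


def replicates_per_group(mean1, sd1, mean2, sd2, alpha=0.05, power=0.80):
    """Approximate number of replicates per group needed to detect the
    difference mean1 - mean2 with a two-sided test at level alpha,
    allowing unequal standard deviations (normal approximation)."""
    delta = mean1 - mean2
    if delta == 0:
        raise ValueError("means must differ")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
    z_beta = NormalDist().inv_cdf(power)           # quantile for the power
    n = (z_alpha + z_beta) ** 2 * (sd1 ** 2 + sd2 ** 2) / delta ** 2
    return ceil(n)


# Hypothetical Emax values: a difference of 1 unit with SD 2 in both groups.
print(replicates_per_group(mean1=10.0, sd1=2.0, mean2=9.0, sd2=2.0))
```

Because the required number grows with the inverse square of the difference, halving the effect size roughly quadruples the replicates, which explains why small functional effects of polygenic variants are hard to establish in vitro.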

Conclusion
A recent mutation screen in 300 children from the GOOS cohort who display insulin resistance in addition to obesity revealed three variants and one SNP that showed an effect on cell differentiation and migration but, with the exception of the frameshift variant Phe344fs, no other functional deficiencies [47]. Comparable to our study, Doche et al. analyzed the impact of detected variants on Janus kinase 2 (JAK2) phosphorylation, with additional tests of insulin receptor substrate 2 (IRS2) phosphorylation and SH2B1 dimerization [47].
Given the low frequency of βThr656Ile/γPro674Ser (g.9483C/T), this mutation cannot explain our positive TDT for rs2008514 with obesity. Adding the three rare mutations detected by Doche et al., which show low functional impact, still leaves a large proportion of the BMI association unexplained [47]. The region around SH2B1 on chr 16p11.2 shows low recombination rates for approximately 1 Mb (chr16:28,177,800 shows a recombination peak of 37 cM/Mb and chr16:28,944,400 a recombination peak of 36 cM/Mb; HapMap, http://hapmap.ncbi.nlm.nih.gov/), implying a large region with high linkage disequilibrium. The area tagged by both BMI associated SNPs (rs7498665 and rs7359397 [1-3]) covers 17 genes (compare Additional file 1: Figure S1). Hence, relevant mutations in one of the remaining 16 genes might account for a larger proportion of the GWAS results. Alternatively, genetic variation outside of the SH2B1 coding region with a regulatory effect on this gene might explain the association in functional terms. Guo et al. recently showed that an intronic SNP in SH2B1 (rs4788099) regulated mRNA expression of the nearby genes Tu translation elongation factor, mitochondrial (TUFM), coiled-coil domain containing 101 (CCDC101), Homo sapiens spinster homolog 1 (SPNS1), sulfotransferase family, cytosolic, 1A, phenol-preferring, member 1 (SULT1A1) and sulfotransferase family, cytosolic, 1A, phenol-preferring, member 4 (SULT1A4) in B cells and monocytes [48]. This is in concordance with findings by Gutierrez-Aguilar et al., who reported differential regulation of Sh2b1, Tufm and Sult1a1 in rats fed a high fat diet [17].
In conclusion, the rare allele of the variant βThr656Ile/γPro674Ser in SH2B1 was found exclusively in three overweight or obese children but not in normal-weight or underweight controls. Because of its extremely low frequency, an association with increased BMI could not be established statistically, and our in vitro analyses did not reveal impaired STAT3 mediated leptin signalling for the mutated SH2B1. Further studies are warranted to investigate the functional impact of the mutation for both affected splice variants on the interaction of SH2B1 effector systems (e.g. leptin and insulin receptors), which play a major role in energy homeostasis.

Competing interests
Winfried Rief declares that he received financial support for presentations and for scientific advice from Astra Zeneca, Heel, and Berlin Chemie; he also declares that this did not influence the content of this manuscript. All other authors declare that they have no competing interests.