An assessment of the portability of ancestry informative markers between human populations

Background Recent work has shown that population stratification can have confounding effects on genetic association studies and statistical methods have been developed to correct for these effects. Subsets of markers that are highly-differentiated between populations, ancestry-informative markers (AIMs), have been used to correct for population stratification. Often AIMs are discovered in one set of populations and then employed in a different set of populations. The underlying assumption in these cases is that the population under study has the same substructure as the population in which the AIMs were discovered. The present study assesses this assumption and evaluates the portability between worldwide populations of 10 SNPs found to be highly-differentiated within Britain (BritAIMs). Methods We genotyped 10 BritAIMs in ~1000 individuals from 53 populations worldwide. We assessed the degree to which these 10 BritAIMs capture population stratification in other groups of populations by use of the Fst statistic. We used Fst values from 2750 random markers typed in the same set of individuals as an empirical distribution to which the Fst values of the 10 BritAIMs were compared. Results Allele frequency differences between continental groups for the BritAIMs are not unusually high. This is also the case for comparisons within continental groups distantly related to Britain. However, two BritAIMs show high Fst between European populations and two BritAIMs show high Fst between populations from the Middle East. Overall the median Fst across all BritAIMs is not unusually high compared to the empirical distribution. Conclusion We find that BritAIMs are generally not useful to distinguish between continental groups or within continental groups distantly related to Britain. Moreover, our analyses suggest that the portability of AIMs across geographical scales (e.g. between Europe and Britain) can be limited and should therefore be taken into consideration in the design and interpretation of genetic association studies.


Background
Whole-genome association studies (GWASs) have proven extraordinarily successful in mapping loci that associate with common complex human diseases [for reviews see [1,2]]. Whereas candidate gene and linkage analyses have identified a few dozen replicable associations between genetic markers and complex diseases [3], GWASs have provided compelling evidence for more than 150 gene-disease associations since their introduction in 2006 [1]. The presence of population stratification has presented one of the main statistical challenges in GWASs. Population stratification refers to differences in allele frequencies between cases and controls related to ancestry rather than disease status. Long before technologies for GWASs were available, it was recognized that differences in ancestry between cases and controls can present a substantial confounding effect in case-control studies [4]. This is especially true in cases where disease risk differs between groups with different ancestry. For example, prostate cancer is more frequent in individuals of African ancestry compared to individuals of European ancestry [5], and previous significant genetic associations with prostate cancer become nonsignificant when correcting for these differences in ancestry [6]. The presence of population stratification can inflate false positive rates or cause reduced power and it has become standard practice to evaluate and correct for genetic ancestry in GWASs [for a review see [7]].
Currently there are two widely-used approaches for correcting for population stratification in GWASs: structured association (SA) [8] and principal components analysis (PCA) [9]. SA uses the program STRUCTURE [10] to estimate the number of subpopulations, k, and then for each individual assigns a probability of membership to each of the k subpopulations. It is then tested whether allele frequencies are dependent on phenotype within each of the k subpopulations. PCA reduces high-dimensional data to a small number of dimensions and uses the axes of variation, or eigenvectors, from these dimensions to calculate ancestry-adjusted genotypes and phenotypes. Both of these methods rely on inferences of ancestry from genome-wide SNP data. It has been shown that accurate estimates of individual ancestry can be obtained from a subset of SNPs from genome-wide data and these are referred to as ancestry-informative markers (AIMs) [for a review see [7]]. AIMs are characterized by substantially different allele frequencies between populations and can be used to estimate the proportion of an individual's ancestry that is derived from these populations. Before running a GWAS, AIMs can be used to match cases and controls, and outlier individuals whose ancestry is not typical of the population under study can be excluded [11]. The main intention for the development of sets of AIMs, however, is to provide a set of markers that effectively control for population stratification in association studies in which samples have not been typed with genome-wide SNP arrays. These sets of AIMs are designed to capture all of the necessary ancestry information required to correct for stratification in candidate gene studies, in replication studies of GWASs, or in fine-mapping studies that focus on specific genomic regions identified from GWASs.
Sets of AIMs have been developed to distinguish among continental groups [12][13][14][15][16]. These sets of AIMs will be useful in controlling for stratification in admixed populations especially when mapping traits that are known to differ by continental ancestry, for example skin pigmentation [17]. Most GWASs, however, have focused on samples of European ancestry and population stratification within Europe has therefore been assessed in detail [e.g. [18][19][20]]. From several genome-wide SNP data sets, sets of European AIMs have been developed [21][22][23][24][25] that distinguish stratification primarily along north-south and east-west gradients.
While European AIMs will be useful in studies that examine individuals of diverse European origin, many GWASs focus on cohorts of much more homogeneous ancestry (e.g. individuals from within a single country). It has been shown that even moderate levels of population stratification in relatively homogeneous populations can confound results in well-designed case-control studies [26,27]. For example, spurious associations due to population stratification can arise if samples are drawn from two different cities within a country [e.g. Dresden and Munich in Germany; [18]] and can even appear in genetic isolates like the Icelandic [28] and Finnish populations [29]. Despite these warnings, association studies that do not properly correct for population structure continue to be published [e.g. [30]].
Studies that do incorporate a correction for the confounding effects of population structure are to be commended. However, the correction for population structure is only as good as the markers chosen for the correction. In general, the precondition for the use of a set of AIMs is that the population under study has the same substructure as the population in which the AIMs were discovered. In several recent association studies, this precondition has been left unevaluated and potentially uninformative AIMs have been used to correct for population stratification. For example, Sulem et al. [31] tested for the presence of population stratification in Iceland with a set of AIMs that distinguish between European populations [24]. To correct for population stratification in association studies in Asian populations, SNPs with high Fst between Asians and other continental groups have been employed [32,33]. Similarly, correcting for population structure in Caucasians, Hu et al. [34] used 38 SNPs that are highly differentiated between continental groups. These studies demonstrate that the underlying assumption that AIMs are largely portable across geographical scales is pervasive.
To assess the portability of AIMs between populations and across geographic scales, we genotyped 10 SNPs found to be highly-differentiated within Britain (BritAIMs) in ~1000 individuals from 53 populations worldwide. Although these 10 BritAIMs do not constitute a complete set of AIMs that fully capture population structure within Britain, they are nevertheless useful for evaluating the portability of AIMs across geographic scales. We evaluated the usefulness of these BritAIMs as AIMs between and within different continents by comparing Fst values for the BritAIMs to Fst values from 2750 random markers typed in the same set of worldwide samples. Our results suggest that AIMs have limited portability between human populations and that caution is warranted in the use of AIMs discovered in a population whose substructure does not match the population in which they are being employed.

Methods
We selected 13 SNPs identified as "highly-differentiated" within Britain from a data set of ~500,000 SNPs typed in ~16,000 British individuals [see Table 1; [35]]. These highly-differentiated SNPs had the lowest P values from a χ² test of allele frequency difference between 12 geographic regions of Britain defined by postcode. We genotyped these 13 SNPs in the CEPH-HGDP Panel [36]. The CEPH-HGDP panel includes 952 individuals from 53 populations after the removal of atypical and related individuals [37]. Genotypes were generated by KBioscience using a competitive allele-specific PCR SNP genotyping system [38]. Cluster plots were analysed visually and the following 3 SNPs were removed. SNP rs6644913 was removed because it maps to the X chromosome; we restricted our analyses to the autosomes because Fst values from the sex chromosomes are not comparable to autosomal Fst values. SNP rs3873375 was removed because it maps to multiple genomic regions and shows 4 distinct genotype clusters. Finally, the genotyping of SNP rs1042712 failed completely. As independent verification of the 10 remaining SNPs, our genotype data were compared to genotypes generated from 67 individuals of diverse ancestry from the CEPH-HGDP panel in the laboratory of MS using the Affymetrix GeneChip 500K Mapping Array Set (unpublished data). One SNP, rs12797951, showed substantial discordance between our genotypes and the genotypes from the Affymetrix platform (data not shown). We therefore retrieved genotype data for this SNP from the ~650,000 SNPs typed in the CEPH-HGDP using Illumina HumanHap650K Beadchips [39]. The Illumina and Affymetrix data were consistent for rs12797951 and we therefore used the data from Li et al. [39] for SNP rs12797951. The final set of 10 SNPs are the British ancestry-informative markers (BritAIMs) presented in Table 1.
Genotype calls were made by visual inspection. None of the 10 SNPs were out of Hardy-Weinberg equilibrium after Bonferroni correction for multiple comparisons (53 populations × 10 SNPs = 530 comparisons). In the 7 cases of significant (P < 0.05 without Bonferroni correction) deviation from Hardy-Weinberg expectations for a SNP in a population, cluster plots were re-evaluated and no data were removed. The amount of missing data per SNP ranged from 0% to 6.5% with a mean of 3.1%. These data are accessible by request to the corresponding author or from the CEPH database [40].
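The Hardy-Weinberg screen described above can be illustrated with a minimal sketch. The text does not state which test was used, so the 1-df chi-square goodness-of-fit test below is an assumption, as are the function and constant names; an exact test is often preferred when genotype counts are small. Only the 530-comparison Bonferroni denominator and the 0.05 level come from the text.

```python
import math

def hwe_chi2_pvalue(n_aa, n_ab, n_bb):
    """P value of a 1-df chi-square test of Hardy-Weinberg proportions
    for one biallelic SNP in one population (assumed test, see lead-in)."""
    n = n_aa + n_ab + n_bb
    if n == 0:
        return None
    p = (2 * n_aa + n_ab) / (2 * n)   # frequency of allele A
    q = 1.0 - p
    if p == 0.0 or q == 0.0:
        return None                   # monomorphic: the test is undefined
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Upper-tail probability of a chi-square variable with 1 degree of freedom
    return math.erfc(math.sqrt(chi2 / 2.0))

# 53 populations x 10 SNPs = 530 comparisons, as stated in the text
N_COMPARISONS = 53 * 10
BONFERRONI_ALPHA = 0.05 / N_COMPARISONS

def deviates_after_bonferroni(p_value):
    """True if a SNP/population pair remains significant after correction."""
    return p_value is not None and p_value < BONFERRONI_ALPHA
```

With this threshold, a SNP/population pair is flagged only when its P value falls below roughly 9.4 × 10⁻⁵, which is why deviations that are nominally significant at P < 0.05 need not survive the correction.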
Fst was calculated according to equation 10 in Weir and Cockerham [41]. Negative Fst values were set to 0. "Global Fst" for each of the 10 SNPs was calculated as the degree of differentiation among the 7 geographic regions represented in the CEPH-HGDP panel. Results remain unchanged when global Fst was calculated as the differentiation among all 53 populations rather than the 7 regions. We compared our observed Fst values for the 10 BritAIMs to an empirical Fst distribution from 2750 autosomal markers (2540 SNPs [42] and 210 indels [43]) typed in 927 individuals from the CEPH-HGDP panel. In cases where the same allele was fixed in all populations being compared, the SNP was considered non-informative and no Fst value was assigned. This resulted in different numbers of observations for the different empirical distributions with a minimum of 2286 SNPs making up the empirical Fst distribution within Oceania. To allow for an unbiased comparison to the empirical distribution, Fst for the 10 BritAIMs was calculated from the same set of 927 individuals from which the empirical Fst distribution was calculated. For each BritAIM, a P value was calculated as the proportion of Fst values from the empirical distribution that were ≥ the observed Fst value. We use P < 0.05 as our threshold for "significance". It should be noted that "significant" therefore describes a value only in relation to the empirical distribution.
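The Fst and empirical P value calculations described above can be sketched as follows. This is an illustration under stated assumptions, not the code used in the study: it implements the widely cited single-locus, biallelic variance-component form of the Weir and Cockerham estimator (components a, b and c), and whether that corresponds exactly to their equation 10 is assumed here. The function names and the genotype-count input format are hypothetical.

```python
def weir_cockerham_fst(genotype_counts):
    """Single-locus Fst (theta) for a biallelic marker.

    genotype_counts: one (n_AA, n_AB, n_BB) tuple per population.
    Returns None when the marker is non-informative (same allele fixed
    everywhere); negative estimates are set to 0, as in the text.
    """
    pops = [g for g in genotype_counts if sum(g) > 0]
    r = len(pops)
    if r < 2:
        return None
    n = [aa + ab + bb for aa, ab, bb in pops]
    p = [(2 * aa + ab) / (2 * ni) for (aa, ab, bb), ni in zip(pops, n)]
    h = [ab / ni for (aa, ab, bb), ni in zip(pops, n)]  # observed heterozygosity

    n_total = sum(n)
    n_bar = n_total / r
    if n_bar <= 1:
        return None
    n_c = (n_total - sum(ni * ni for ni in n) / n_total) / (r - 1)
    p_bar = sum(ni * pi for ni, pi in zip(n, p)) / n_total
    if p_bar <= 0.0 or p_bar >= 1.0:
        return None  # same allele fixed in all populations
    s2 = sum(ni * (pi - p_bar) ** 2 for ni, pi in zip(n, p)) / ((r - 1) * n_bar)
    h_bar = sum(ni * hi for ni, hi in zip(n, h)) / n_total

    core = p_bar * (1 - p_bar) - (r - 1) / r * s2
    a = n_bar / n_c * (s2 - (core - h_bar / 4) / (n_bar - 1))            # among populations
    b = n_bar / (n_bar - 1) * (core - (2 * n_bar - 1) / (4 * n_bar) * h_bar)  # among individuals
    c = h_bar / 2                                                         # within individuals
    denom = a + b + c
    if denom <= 0:
        return None
    return max(0.0, a / denom)


def empirical_p_value(observed_fst, empirical_fsts):
    """Proportion of empirical Fst values that are >= the observed value."""
    values = [f for f in empirical_fsts if f is not None]
    return sum(1 for f in values if f >= observed_fst) / len(values)
```

Returning None for fixed markers mirrors the rule that non-informative SNPs receive no Fst value, which is why the size of the empirical distribution differs between regions.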

Results
To assess the portability of AIMs, we used the Fst statistic [44] to measure the degree of genetic differentiation within and between continental groups of ten BritAIMs, i.e. SNPs identified as "highly-differentiated" within Britain [35]. Fst is a commonly employed and useful measure of allele frequency difference between populations and takes on values ranging from 0 (no difference) to 1 (fixed difference). SNPs with high Fst values are highly differentiated between populations and are therefore informative about population structure and are useful as AIMs [e.g. [23,45]]. The list of 10 BritAIMs is presented in Table 1 along with Fst values and associated P values for the among-continent (i.e. global) and within-continent comparisons.
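As a reminder of why Fst runs from 0 to 1, the textbook heterozygosity-based definition can be worked through for two equally sized populations. This simplified form is given for intuition only; it is not the sample-size-corrected estimator used in the Methods, and the allele frequencies below are made-up illustrative values.

```latex
F_{ST} = \frac{H_T - H_S}{H_T},
\qquad
H_S = \tfrac{1}{2}\bigl(2p_1q_1 + 2p_2q_2\bigr),
\qquad
H_T = 2\bar{p}\,\bar{q},
\quad \bar{p} = \tfrac{1}{2}(p_1 + p_2)

% Illustrative values: p_1 = 0.2, p_2 = 0.8
H_S = \tfrac{1}{2}(0.32 + 0.32) = 0.32, \qquad
H_T = 2(0.5)(0.5) = 0.5, \qquad
F_{ST} = \frac{0.5 - 0.32}{0.5} = 0.36
```

Identical frequencies (p_1 = p_2) give H_S = H_T and hence Fst = 0, while a fixed difference (p_1 = 0, p_2 = 1) gives H_S = 0 and Fst = 1.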
We first tested whether the 10 BritAIMs were highly differentiated between continental groups; their global Fst values were not unusually high compared to the empirical distribution (Table 1). We assessed population differentiation on a finer geographical scale by calculating Fst within continents. The P values for each of these comparisons are presented in Table 1. Only 4 BritAIMs showed significantly high Fst values (P < 0.05) in the within-continent analyses and these are highlighted in bold in Table 1: two BritAIMs (rs7696175, rs1460133) showed significantly high Fst within Europe and two were significant within the Middle East (rs11790408, rs12797951) (Figure 2).
To test whether BritAIMs are highly differentiated as a group, we compared the median Fst of the 10 BritAIMs to a distribution of median Fst values from 10 SNPs sampled at random 10,000 times from the empirical distribution. This allows an assessment of how differentiated the 10 BritAIMs are compared to the expectation at random. At the worldwide scale, the median global Fst of the 10 BritAIMs does not differ significantly from the expectation generated from 10,000 random samples (Fst = 0.071, P = 0.692). However, the comparisons within each continental group revealed that the median Fst of the BritAIMs is significantly high within Europe (median Fst = 0.0316, P = 0.039; Table 2).
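The group-level comparison of median Fst can be sketched as a simple resampling procedure. The function name, the fixed seed, and the choice to draw each set of 10 markers without replacement are assumptions made for illustration; the text specifies only that 10 markers were sampled at random 10,000 times from the empirical distribution.

```python
import random
import statistics

def median_fst_p_value(britaim_fsts, empirical_fsts, n_resamples=10_000, seed=0):
    """Compare the median Fst of a marker set with medians of random sets.

    Returns (observed_median, p), where p is the proportion of random
    marker sets whose median Fst is >= the observed median.
    """
    rng = random.Random(seed)          # seeded only to make the sketch reproducible
    pool = list(empirical_fsts)
    k = len(britaim_fsts)              # 10 markers in the study
    observed = statistics.median(britaim_fsts)
    hits = 0
    for _ in range(n_resamples):
        sample = rng.sample(pool, k)   # assumption: no marker drawn twice per set
        if statistics.median(sample) >= observed:
            hits += 1
    return observed, hits / n_resamples
```

Using the median rather than the mean keeps one or two strongly differentiated markers from dominating the group-level summary.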
Fst calculations within a continent summarize the degree of allele frequency differentiation among all populations within a continent and can be driven to high values by single outlier populations. Therefore, we investigated the four BritAIMs with significantly high Fst values in more detail by calculating Fst for each pairwise comparison between populations within a continent. Figure 3 shows how these population pairwise Fst values compare to the corresponding Fst values from the empirical distribution. Finally, Figure 4 provides a view of worldwide allele frequencies and population differentiation for rs7696175, the most highly-differentiated BritAIM within Europe. Similar plots of worldwide allele frequencies and population differentiation are provided for each BritAIM in Additional File 1.

Discussion
The presence of population stratification is a potential source of false positives, and thus of spurious associations, in disease association studies. Recently, a number of [...] (Figure 1). The median global Fst of the 10 BritAIMs is also not unusually high compared to the expectation at random (P = 0.692). Thus, AIMs identified at a fine geographic scale (i.e. within Britain) are not informative on a worldwide scale. This result was foreseeable since there is no a priori reason for expecting SNPs that differ dramatically in allele frequency within Britain to differ dramatically among continental groups.
[Figure 1: Global Fst distribution for 2750 random markers]
Within continents, only 4 of the 10 BritAIMs have significantly high Fst values (Figure 2). These 4 BritAIMs are found within the top 5% of the empirical distributions from Europe and the Middle East. These two continental groups are assumed to be more closely related to Britons than the other continental groups included in the present study, and this result therefore suggests that some AIMs may be portable within a restricted geographic range. When the patterns of population differentiation for these 4 BritAIMs are examined in more detail, it is clear that the signal from rs1460133 is derived almost exclusively from the Basque who differ significantly from most of the other European populations for this SNP (Figure 3). Thus, while rs1460133 may be an informative marker for Basque ancestry, it does not show the dramatic gradient of allele frequency across the continent that is characteristic of other European AIMs [24]. It is noteworthy that the median Fst for the 10 BritAIMs is significantly high within Europe (P = 0.039), but not within the other continental groups (Table 2). This observation also supports the notion that the BritAIMs as a set are at least somewhat ancestry informative across European populations. Figure 4 provides a detailed view of the allele frequencies and population differentiation of the BritAIM that shows the highest Fst value within Europe, rs7696175. From the 53 × 53 matrix in Figure 4, it can be seen that the high Fst values for rs7696175 are not restricted to population pairwise comparisons within Europe: the French and Orcadians, for example, have substantially higher minor allele frequencies than several populations from the Middle East and Central South Asia.
[Figure 4: Worldwide allele frequencies and population differentiation for rs7696175, the most highly differentiated BritAIM within Europe]
Similar plots are available for the remaining 9 BritAIMs (Additional File 1) in which it can be seen that the patterns of population differentiation are extremely varied across SNPs: many population-pairwise Fst values lie within the top 5% and even the top 1% of the empirical distribution. Thus the BritAIMs may be useful as AIMs in other groups of populations, but the patterns are often not systematic and their effectiveness in other samples would be difficult to predict a priori.
Previous studies have provided some evidence that AIMs may be portable between human populations. For example, microsatellite markers that are ancestry informative in one population are generally informative in others [45]. Also, genomic regions showing large allele frequency differences between one set of continental groups are likely to be highly-differentiated between other continental groups [46]. However, more recent studies that focus on the portability of AIMs across continental groups provide evidence against this notion. For example, Campbell et al. [47] noted that 67 AIMs discovered to distinguish between African and European ancestry did not vary sufficiently among Europeans to allow detection of stratification. Similarly, Paschou et al. [48] found that SNPs chosen for ancestry inference in one continent perform no better than random SNPs in inferring ancestry in other continents. These studies have focused, however, on the portability of AIMs across broad geographic scales (i.e. between continental groups) and their conclusions have limited applicability to the design of association studies, which usually focus on a more refined geographic scale.
Recently, Heath et al. [18] used PCA to assess population structure in ~6000 Europeans genotyped for ~130,000 SNPs and found that 5 of the 10 genomic regions containing the BritAIMs examined here were significantly associated with PC1 or PC2. It is worth noting that the two BritAIMs for which genotyping failed in the present study were in regions significantly correlated with PC2, and that 4 of the 5 remaining genomic regions containing BritAIMs neared significant correlation with PC1 or PC2 [18]. These data suggest that, while AIMs discovered in a broad panel of Europeans may not perfectly capture ancestry information within Britain, there is substantial overlap among ancestry-informative genomic regions between the two geographic scales.
Local, geographically-restricted, natural selection at a locus generates large allele frequency differences between populations [49,50]. Thus, AIMs are enriched in genomic regions that have been targeted by positive selection and are therefore likely to be in LD with adaptive functional alleles. Viewed from this evolutionary perspective, our finding that BritAIMs are not unusually differentiated between continental groups is not surprising: selection pressures that have generated allele frequency differences within Britain are unlikely to be shared across continental groups because local cultural and physical environments differ drastically at the continental scale. However, the BritAIMs' sharp gradients of allele frequencies across Britain are likely to have been caused by selection pressures shared by other European populations. For example, one BritAIM (rs1042712) is found near the lactase gene which shows a sharp gradient across Europe due to the action of positive selection for lactose tolerance [51]. Thus, the portability of AIMs between populations will depend in part on the extent to which selection pressures have been shared between the populations. Without extensive population genetic analyses, this criterion will be difficult to evaluate.

Conclusion
The assumption that AIMs are portable across geographic scales is pervasive [31][32][33][34]. The data presented here suggest that there is an inevitable loss of power to detect population stratification when AIMs discovered in one population are used in another population. Practically, the assumption that the substructure of the population under study is adequately similar to the substructure of the population in which the AIMs were discovered is often difficult to evaluate. The present analyses suggest that the portability of AIMs is limited and that claims of association between genetic variants and phenotypes should be interpreted in accordance with the suitability of the selected AIMs used to correct for population stratification. As association analyses become increasingly common in populations for which genome-wide genotype data are sparse, we anticipate that this cautionary note will become increasingly important.