The AM analysis in the current study confirms the results of previous studies [13, 14], which identified a genomic region on chromosome 22 that confers increased genetic risk of non-diabetic ESKD in AA. This finding was recently extended to Hispanic Americans, whose risk for ESKD is intermediate between that of AA and Europeans. In addition, SNPs within MYH9 showing stronger association were identified and were proposed to narrow the region containing a putative causal mutation, initially pointing to the MYH9 gene [20, 30]. Since these initial reports, the neighboring APOL1 gene, encoding the apolipoprotein L-1 protein, has come to be regarded as the gene containing the functional causative mutation that accounts for the AM peak. The MYH9 marker associations are attributed to very high levels of linkage disequilibrium, powerful evolutionary selection, and hitchhiking of the MYH9 variants with the putative APOL1 causative mutation [28, 29].
In contrast to the two previously reported AM studies, our results were obtained using a different and independent marker panel, a different and independent patient sample set, and two different analytic approaches. Replication of genetic risk loci for common complex disease susceptibility has been highlighted as a sine qua non for proceeding with further mechanistic research on such a locus, and certainly for genetic counseling and clinical applications. In the case of AM in particular, it has been recommended that data be analyzed using at least two of the several available ancestry inference programs. Failure to replicate has been a common feature in the search for common variants of candidate genes affecting the risk of complex diseases. In the case of ESKD among AA, a long-standing controversy existed as to the relative contributions of societal and demographic factors, as opposed to genetic factors, to the increased risk for ESKD [32, 33]. Furthermore, other explanations for the higher prevalence of ESKD have been proposed, such as lower mortality rates among AA ESKD patients. The current study is the third AM study to confirm the presence of a genetic susceptibility locus for ESKD in African Americans, and it also confirms that the increased risk conferred by this locus appears to be restricted to non-diabetic ESKD. In the sub-group of AA with non-diabetic ESKD, the genetic risk conferred by the African ancestry risk allele explains the majority of the increased risk (1.6 for heterozygotes, and a 1.87 increased risk in the epidemiological studies). Although the susceptibility locus was reproduced in all three studies, the exact location of the causative risk mutation(s) cannot be inferred from AM studies alone.
Indeed, even the initial candidate gene (MYH9) implicated in the first two AM studies has since been superseded by a more statistically and biologically plausible gene (APOL1), although it was the AM approach that first focused attention on the chromosome 22 region containing these genes [13, 14].
Each of the three studies used a different sample set, but all of the samples were of African American heritage. The current study also utilized a different AIMs panel and two different analytical methods, confirming the utility of AM as an effective mapping approach when applied to relevant populations and disease phenotypes. The use of two different analytic approaches also reduces the risk of false positive results, which is inherent in the methodology of AM and arises from the complex patterns of LD among panel SNPs in the ancestral populations. Future research will be needed to validate the genetic risk conferred by the chromosome 22 loci in other populations and at other stages of CKD, and possibly to identify other modifier loci, taking into account the complex heritability and multi-factorial nature of ESKD [35, 36].
The current study also highlights additional points of more general interest pertaining to AM, which may be useful when applied to other sample sets and disease states, especially when only small sample sets or weaker ERRs are involved. Examples could include the increased susceptibility to systemic lupus erythematosus (SLE) and the higher incidence of lupus nephritis among AA SLE patients [38, 39], or traits with reduced prevalence among AA (e.g. HCV clearance and psoriasis). In particular, we propose the use of a stepwise hierarchical approach and novel ancestry panel tools in AM analysis projects where ERR or sample size may be limiting. Furthermore, we have used a case only study design, which, for a given sample size and marker set, has been suggested by others to be more powerful than a case control study design (reviewed in Montana et al.). However, it should be noted that a case only study design might be more sensitive to inaccuracies in the estimation of the specified allele frequencies. This is less problematic with current builds of dbSNP, which are based on larger sample sets than the SNP datasets that were available for earlier admixture mapping studies. Notwithstanding these reservations, the analysis of real samples in the current study, in which the admixture peak and its genetic loci have been validated in independent studies, lends further confidence to the predicted power and accuracy of an appropriately formulated case only study design.
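The logic of a case only design can be illustrated with a minimal sketch. It assumes that local ancestry estimates are already available for each case at each locus, and it computes a simple z-score for the deviation of locus-specific African ancestry from each individual's own genome-wide average. The function name and the input layout are hypothetical, and this is a simplified illustration rather than the statistic computed by ANCESTRYMAP or ADMIXMAP.

```python
from math import sqrt
from statistics import mean, stdev


def case_only_z(local_ancestry):
    """Per-locus z-scores for excess ancestry in cases (illustrative only).

    local_ancestry[i][j] is the estimated African ancestry proportion
    (0..1) of case i at locus j. This layout is an assumption made for
    the sketch, not the input format of any AM program.
    """
    n_cases = len(local_ancestry)
    n_loci = len(local_ancestry[0])
    # Each case serves as its own control: the genome-wide average
    # ancestry of that individual is the expectation at every locus.
    genome_avg = [mean(row) for row in local_ancestry]
    z_scores = []
    for j in range(n_loci):
        deviations = [local_ancestry[i][j] - genome_avg[i]
                      for i in range(n_cases)]
        se = stdev(deviations) / sqrt(n_cases)
        z_scores.append(mean(deviations) / se if se > 0 else 0.0)
    return z_scores


# Toy values invented for illustration; not data from the study.
toy = [
    [0.8, 0.8, 1.0],
    [0.7, 0.8, 1.0],
    [0.8, 0.7, 0.9],
    [0.9, 0.8, 1.0],
]
z = case_only_z(toy)
print(z.index(max(z)))  # index of the locus with the largest excess
```

Because the expectation at each locus is derived from the cases themselves, no control genotypes are needed; the corresponding cost is the sensitivity to misspecified ancestral allele frequencies noted above, since these frequencies drive the local ancestry estimates.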
The power analysis of the non-enriched screening panel in the current study demonstrated a non-homogeneous distribution of ancestry information across the genome (Additional File 3). The mean information content was quite stable over the genome as a whole, but inspection of individual chromosomes revealed marked differences in information content both among chromosomes and along their lengths. This was the case for chromosomes 18, 19 and 22, the last of which turned out to be the chromosome harbouring the genetic susceptibility locus of interest. Although fortuitous, this proved informative in formulating an effective stepwise approach of potential general utility. The heterogeneity of information content along chromosome 22 likely explains the differences in the power of the non-enriched screening panel to identify true positive loci along that chromosome (Figure 1). Presumably, the regions on chromosome 22 with lower information content yield lower accuracy in ancestry inference, and therefore reduced the capacity to identify a disease risk locus in the power experiments, which were conducted systematically over all of the chromosomes. This would be expected to yield a false negative result for a true disease risk locus on chromosome 22, as was indeed the case for the actual sample set analysed, which yielded peaks that were suggestive according to the pre-set criteria outlined in Methods but that did not meet rigorous association significance scores.
Once suggestive peaks had been obtained, this limitation was overcome by combining the EMI enrichment process, which strives to maximize the chromosome-wide information by taking into account mutual marker information, with the power enhancement afforded by the enriched panel and a corresponding adjustment of the sample size.
Since the information content is not homogeneously distributed over the chromosomes, and the relative risk conferred by an associated locus is not known in advance, a shift in the estimated power is to be expected. We therefore suggest that, following the addition of AIMs, the power analysis be repeated in order to estimate the number of samples needed for a locus of interest that has arisen in a screening panel. It should be noted that this hierarchical approach, which begins with a genome wide screening panel, could potentially overlook candidate peaks (type II error), which would then not reach the step of scrutiny by an enriched panel on the chromosome or region of interest. It is reassuring that in the current study an initial screening panel of only 2,016 SNPs, and what turned out to be an initial sample set of only 277 relevant (non-diabetic ESKD) samples, led to the appropriate next steps, using ANCESTRYMAP and ADMIXMAP as the initial ancestry inference programs, which converged on one shared risk locus.
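The dependence of the required sample size on regional information content can be made concrete with a rough sketch. It assumes that the effective sample size scales linearly with the fraction of ancestry information extracted at a locus, which is a common first-order approximation and not the power calculation used in this study. The function and parameter names are hypothetical.

```python
from math import ceil


def required_samples(n_full_information, information_content):
    """Approximate number of cases needed at a locus (illustrative only).

    n_full_information: cases that would suffice if local ancestry were
        known without error (information content = 1).
    information_content: fraction of ancestry information extracted by
        the marker panel at the locus, in (0, 1].

    Assumes effective sample size = n * information_content, a
    simplifying assumption made for this sketch.
    """
    if not 0.0 < information_content <= 1.0:
        raise ValueError("information_content must be in (0, 1]")
    return ceil(n_full_information / information_content)


# Hypothetical numbers: halving the information content doubles the
# number of cases needed for the same power.
print(required_samples(200, 1.0))  # 200
print(required_samples(200, 0.5))  # 400
```

Under this approximation, adding AIMs to a low-information region and enlarging the sample set are interchangeable routes to the same power, which is why the power analysis is worth repeating once the enriched panel is in place.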
An additional important lesson from the current study is the inferential power achieved by combining two different analytic and computational platforms once genotype information is available. Although previous guidelines for AM methodology recommended the use of at least two analytical approaches, most published AM studies have used only one statistical analysis platform [13, 14, 43–48]. The study by Nalls et al. used all three currently available statistical programs (ADMIXMAP, STRUCTURE, and ANCESTRYMAP), and concluded that it was difficult to compare the significance values from the three programs directly, because each program calculates a different statistic for association. In the current study we followed this recommendation and used both ANCESTRYMAP and ADMIXMAP as statistical platforms. Tian et al. had previously reported a high rate of false positive results when using ANCESTRYMAP at chromosomal positions in complete LD. In contrast, no false positive peaks were observed in the same simulations using ADMIXMAP. Our goal in the screening panel step of the current study was to identify one or more candidate true positive regions containing a common risk variant.
ANCESTRYMAP was used because of its unique LGS local score, which can point to a possible association for a given chromosome, unlike ADMIXMAP, which provides statistics for individual markers only. Indeed, in the current study, ANCESTRYMAP yielded no peaks on any chromosome for the entire sample set, but did yield a candidate peak in the non-diabetic subset of samples (Figures 2 and 3), consistent with the previous reports by Kopp et al. and Kao et al. A second peak, on chromosome 15, was also evident with ANCESTRYMAP but was not observed using ADMIXMAP (Additional File 6). Only chromosome 22 yielded a candidate locus using both ANCESTRYMAP and ADMIXMAP. It was this finding that prompted the subsequent enrichment step directed at this locus.
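The concordance criterion described above can be sketched in a few lines. The sketch assumes that each program's output has already been reduced to one summary score per chromosome and that a threshold has been chosen for each program. The function name, the scores and the thresholds are invented placeholders for illustration and are not results or cut-offs from this study.

```python
def concordant_chromosomes(scores_a, threshold_a, scores_b, threshold_b):
    """Chromosomes flagged by both programs (illustrative only).

    scores_a, scores_b: dicts mapping chromosome number to a summary
        score from each program. The two scores are on different scales,
        so each is compared with its own threshold rather than with the
        other score.
    """
    hits_a = {c for c, s in scores_a.items() if s >= threshold_a}
    hits_b = {c for c, s in scores_b.items() if s >= threshold_b}
    return sorted(hits_a & hits_b)


# Placeholder values only: two chromosomes pass in program A,
# one passes in program B, and only the shared one is retained.
program_a = {15: 2.4, 18: 0.3, 22: 3.1}
program_b = {15: 1.1, 18: 0.9, 22: 6.0}
print(concordant_chromosomes(program_a, 2.0, program_b, 4.0))  # [22]
```

Thresholding each program separately sidesteps the difficulty that the programs calculate different statistics, at the cost of discarding any locus that only one program detects.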
The goals of the enrichment process were to validate and fine tune the screening panel results, and to narrow the region. ANCESTRYMAP, with its LGS local parameter, yielded two candidate peaks on chromosome 22 (Figure 4). Based on the simulation by Tian et al., we considered the possibility that one of these peaks might be a false positive result, consequent to very high levels of LD in the ancestral populations. Indeed, using ADMIXMAP, deliberate omission of the SNPs with the highest D' values in the ancestral populations left only the peak at positions 35-37 Mb, with a significant association (-log(P) > 10) (Figure 4). This result highlights the importance of spacing SNPs in such a way as to avoid spurious peaks presumed to be due to LD interference, which in the current study required spacing SNPs at distances greater than 50 kb. Tian et al. used simulations to suggest that residual LD, even at a spacing of 100 kb as in the screening panel, can yield spurious peaks in a case only design analyzed with ANCESTRYMAP, but not with ADMIXMAP. Using experimental data, and both a screening and an enriched panel, we have been able to verify this effect of residual LD, which occurs with either ANCESTRYMAP or ADMIXMAP but can be overcome by adding appropriately spaced markers so as to avoid presumed LD interference. It is evident from the current study that the use of ADMIXMAP alone will not overcome this effect, as had been suggested on the basis of simulation studies. Our results support previous observations that residual LD in the ancestral populations causes false positive signals, thus limiting the density of AIMs that can be used, and they highlight the utility both of using multiple analytic approaches and of selecting carefully spaced enrichment markers, especially for moderate risk loci.
We can recommend 50 kb as an empirical lower limit for the spacing of AIMs in AM, together with judicious use of more than one analytic approach and selection of coinciding peaks for the further detailed search for candidate genes.
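A spacing rule of this kind can be applied to a candidate marker list with a simple greedy filter, sketched here. Note that this is a distance-only simplification: in the current study the omitted SNPs were those with the highest D' values in the ancestral populations, and the sketch does not use D' at all. The function name and the positions are hypothetical.

```python
def thin_markers(positions_bp, min_spacing_bp=50_000):
    """Keep markers at least min_spacing_bp apart (illustrative only).

    positions_bp: marker positions on one chromosome, in base pairs.
    Walks the sorted positions and keeps a marker only if it lies far
    enough from the last marker kept. Greedy and distance-based; it
    ignores measured LD and marker informativeness.
    """
    kept = []
    for pos in sorted(positions_bp):
        if not kept or pos - kept[-1] >= min_spacing_bp:
            kept.append(pos)
    return kept


# Hypothetical positions within a 35-36 Mb window.
candidates = [35_000_000, 35_020_000, 35_060_000, 35_100_000, 35_120_000]
print(thin_markers(candidates))
# [35000000, 35060000, 35120000]
```

In practice one would likely prefer, within each crowded cluster, to retain the most ancestry-informative marker rather than the leftmost one; the greedy rule is used here only because it is the shortest way to show the spacing constraint.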
In the current study, the p values obtained from ADMIXMAP were more significant than those obtained using the ANCESTRYMAP LOD scores (Figure 4, reduced panel). One possible explanation is that ADMIXMAP does not apply any correction for multiple hypothesis testing. Furthermore, as can be observed in Figure 4, the ADMIXMAP results are greatly influenced by the omission of dense SNPs (with high LD in the ancestral populations), while the corresponding changes for ANCESTRYMAP are only moderate. However, although this explanation is highly tenable, we cannot conclude definitively that it is the source of the discordance between ANCESTRYMAP and ADMIXMAP, since each of these analytical programs calculates a different statistic for LD.