In the first part of this study we set out to establish a range of experimentally observed values for array variance on Illumina's SNP-genotyping beadarrays, and likewise for pool-construction variance. In the second part, we used these estimates to calculate the effective sample size of a DNA pool over a range of replicate-array values, and we provide an online tool to allow readers to do the same.

At the time of our analysis we were aware of only one report that estimated array variance (var(e_{array}) = 1.1 × 10^{-4}) for an Illumina HumanHap300 beadarray [18]. Illumina has since released higher-density arrays (>1 million SNPs per array), and we wanted to determine whether increased SNP density negatively impacted array variance. Overall, we found this was not the case. All of the Illumina array types examined here (660-Quad, 1M-Single, 1M-Duo) had very similar var(e_{array}) estimates, centering around 3 × 10^{-4} for our normalized data, which is largely in keeping with the HumanHap300 result [18]. We expect this result would extend to the HumanOmni1-Quad array, although we did not analyze it here. We found that the normalization procedure we used reduced the array variance 2- to 8-fold, and a newly reported normalization algorithm suggests that array variance can be reduced even further [24]. Reduced array variance should mean more precise estimates of allele frequency, which should further minimize the loss of power associated with the DNA pooling strategy.
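The quantity var(e_{array}) can be sketched in code: with several replicate arrays hybridized to the same DNA pool, the per-SNP variance of the pooling allele frequency (PAF) estimates across replicates reflects array error alone. The function below is an illustrative sketch, not the exact estimator used in this study; the simulated noise level is set to the ~3 × 10^{-4} magnitude reported above.

```python
import numpy as np

def estimate_array_variance(paf: np.ndarray) -> float:
    """Estimate var(e_array) from a (n_snps, n_replicates) matrix of
    pooling allele frequency (PAF) estimates for one DNA pool.
    Replicate arrays measure the same pool, so the per-SNP variance
    across replicates reflects array measurement error alone."""
    per_snp_var = paf.var(axis=1, ddof=1)  # unbiased variance per SNP
    return float(per_snp_var.mean())       # average over SNPs

# Simulated check: inject array noise at the ~3e-4 level and recover it.
rng = np.random.default_rng(0)
true_paf = rng.uniform(0.05, 0.95, size=(50_000, 1))
paf = true_paf + rng.normal(0.0, np.sqrt(3e-4), size=(50_000, 4))
print(f"{estimate_array_variance(paf):.1e}")  # prints 3.0e-04
```

With 50,000 SNPs the average of the per-SNP sample variances recovers the injected variance very precisely, which is why averaging across SNPs is the natural way to summarize array error.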

The Illumina arrays analyzed here yielded var(e_{array}) estimates ~10-fold smaller than those of the Affymetrix *Hind*III 50K arrays (var(e_{array}) = 1.26 × 10^{-3}) analyzed by MacGregor [17]. A similar result was noted when Affymetrix arrays were compared to Illumina HumanHap300 arrays [18]. In part, this may be explained by differences in the manufacture of the arrays. MacGregor et al. [18] report that pooling errors appear to be highly related to the number of probes used to estimate SNP allele frequency. While 10 probe pairs are assigned to each SNP on the Affymetrix *Hind*III 50K arrays [18], on average 16-18 beads per SNP are used on the Illumina arrays. Further, on Illumina arrays beads are randomly dispersed across the slide [22], while on Affymetrix arrays probes are fixed in a given location, making the latter more susceptible to location-specific technical errors. As the array variance gets smaller (i.e. when using Illumina arrays), we expect the pool-construction variance to account for a greater proportion of the pooling variance.

Our estimates of var(e_{construction}) spanned 27 DNA pools, ranging in size from 74 to 446 individual samples, allowing us to survey a range of possible pool-construction variances. First, in contrast to a previous report [25], we did not observe a relationship between pool size and pool-construction variance. We did, however, observe batch effects. For the 1M-Duo arrays, which were processed in two batches on different dates, we obtained very different estimates of pooling variance and pool-construction variance (see Figure 6). Most of our estimates of pool-construction variance were based on values from Type C comparisons, and for these var(e_{construction}) usually fell between 20% and 40% of the pooling variance. When calculations were based on the comparison of replicate DNA pools (Type B comparisons, 1M-Single arrays only), our estimates were smaller, on average 7.5% of the pooling variance. There are several possible reasons for this. The adjustment for binomial sampling variance may not fully account for the variance arising from sampling, leaving residual variance that is then attributed to pool construction in the Type C comparisons. In addition, some estimates of pool-construction variance were negative and were set to zero, which would lead to overestimation of the average pool-construction variance. We conclude that, relative to var(e_{array}), var(e_{construction}) is of less importance; however, our results suggest pool construction may account for more of the pooling variance than previously estimated [17]. MacGregor [17] attributed 12.5% of the pooling variance to pool construction when using Affymetrix *Hind*III 50K arrays, whereas on average we attribute 30% of the pooling variance to pool construction when using Illumina arrays. This difference is what might be expected given the smaller var(e_{array}) of the Illumina arrays.
Further reductions in array variance, for example, through improved normalization of array data, have the potential to further shift the proportion of an experiment's pooling variance that is attributed to pool-construction errors.
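The partitioning discussed above can be sketched as follows. The additive model (binomial sampling plus construction plus array error averaged over k replicate arrays) and the zero-clipping step are assumptions based on the description in the text, and the numerical inputs are purely illustrative, not values from our data.

```python
def construction_variance(var_pooling: float, var_array: float,
                          k_arrays: int, p: float, n_individuals: int) -> float:
    """Back out var(e_construction) from an observed pooling variance,
    assuming the additive model
        var_pooling = p(1-p)/(2N) + var(e_construction) + var(e_array)/k
    with binomial sampling over 2N chromosomes and k replicate arrays.
    Negative remainders are clipped to zero, which (as noted in the
    text) biases averages of the resulting estimates upward."""
    binomial = p * (1 - p) / (2 * n_individuals)
    remainder = var_pooling - binomial - var_array / k_arrays
    return max(remainder, 0.0)

# Illustrative numbers, with var(e_array) ~ 3e-4 as estimated for Illumina arrays.
print(construction_variance(2e-3, 3e-4, 2, 0.5, 100))  # positive remainder
print(construction_variance(5e-4, 3e-4, 2, 0.5, 100))  # negative, clipped to zero
```

The second call shows how negative remainders arise whenever the subtracted terms exceed the observed pooling variance, and why clipping them to zero inflates the average estimate.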

With respect to the design of pool-based experiments using Illumina arrays, our partitioning of the pooling variance supports the earlier recommendation [17] that constructing fewer (large) pools while running more replicate arrays (i.e. targeting array variance) is the most effective way to reduce pooling variance and thus conduct an efficient pool-based GWAS. Further, an equivalent pool-based experiment using Affymetrix arrays in place of Illumina arrays will require more array replicates (~10-fold more). As the ratio of array variance to pool-construction variance approaches 50:50, strategies to reduce pool-construction variance become more important.
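The trade-off between pool size and replicate arrays can be made concrete with the standard effective-sample-size equivalence, i.e. setting p(1-p)/(2·n_eff) equal to the total variance of the pooled frequency estimate and solving for n_eff. The model and the parameter values below are an illustrative sketch, not outputs of our online tool.

```python
def effective_sample_size(n: int, p: float, var_construction: float,
                          var_array: float, k_arrays: int) -> float:
    """Number of individually genotyped samples a pool of n individuals
    is worth, assuming the total variance of the pooled frequency estimate is
        p(1-p)/(2n) + var(e_construction) + var(e_array)/k
    and solving p(1-p)/(2*n_eff) = total variance for n_eff."""
    pq = p * (1 - p)
    total = pq / (2 * n) + var_construction + var_array / k_arrays
    return pq / (2 * total)

# One large pool: the effective sample size rises as replicate arrays
# (k) average away the array error term.
for k in (1, 2, 5, 10):
    print(k, round(effective_sample_size(400, 0.5, 1e-4, 3e-4, k), 1))
```

With zero pooling error the function returns n itself; with array error at the Illumina-like level, moving from one to ten replicate arrays recovers a substantial fraction of the information lost to pooling, which is the rationale for targeting array variance first.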

For one of our experiments, 1M-Duo Batch 2, we observed unusually high estimates of pool-construction variance and low estimates of array variance (see Figure 6). In this experiment, pool replicates were allelotyped on the same physical array (which holds two samples). Subsequently, we noticed that the array variance for replicates on the same chip was much smaller than the variance for replicates on different chips. Overall, this led to the array variance being underestimated relative to the pooling variance, leaving more variance to be accounted for by pool construction. In addition, the between-chip variance for these arrays was much higher than observed in the 1M-Duo Batch 1 dataset, which led to large estimates of pooling and pool-construction variance overall. Ultimately, this was traced back to unusually high red-channel intensity on some arrays, despite normalization, which biased allele frequency estimates array-wide. Clearly this will influence any downstream association analysis, so in this case our analysis of variance served to flag a serious problem in the array data. It also highlighted the need to randomize DNA pool replicates among arrays that carry more than one sample, and to randomize by location on the array, particularly for the 660-Quad and HumanOmni1-Quad arrays, which carry four samples.

The differences between the 1M-Duo Batch 1 and Batch 2 data were significant for normalized data, but not for raw data. On the one hand, the greater noise in the raw data may have prevented differences in array variance and pool-construction variance from reaching significance. On the other, the normalization procedure itself may have exacerbated technical artifacts present only on some arrays, producing the differences observed in the normalized data. This can occur when technical artifacts violate the assumptions of the normalization [26].