
Table 1 Initial selection of possible targets for subsequent probe design, and final selection of tandem repeats for capture and sequencing

From: Mapping the landscape of tandem repeat variability by targeted long read single molecule sequencing in familial X-linked intellectual disability

The columns from "1 probe" to "Not included" give the results of the modified probe design (final selection).

| # | Selection groups | Predicted variability | Total repeat length | Unit length | Copy number | N | 1 probe | 2 probes | 3 probes | 4 probes | All targeted | Not included |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | Coding repeats | Any SERV | Any total length | Any unit | Any copy num. | 368 | | | | | 305 (82.88%) | 63 (17.12%) |
| | | | Repeat length ≤ 500 bp | | | 353 | 36 | 37 | 74 | 158 | | |
| 2 | Regulatory repeats (top variability) | SERV ≥ 1 | Any total length | Any unit | Any copy num. | 181 | | | | | 149 (82.32%) | 32 (17.68%) |
| | | | Repeat length ≤ 500 bp | | | 174 | 23 | 36 | 78 | 12 | | |
| 3 | Regulatory repeats (lower variability) | 0.4 < SERV < 1 | Repeat length ≤ 520 bp (all) | Any unit | Any copy num. | 390 | 62 | 55 | 100 | 101 | 318 (81.54%) | 72 (18.46%) |
| 4 | Additional regulatory repeats within 1 kb from the genes involved in XLID (not yet included in groups 2–3) | Any SERV (−0.92 to 0.37) | Any total length (< 250 bp) | Any unit | Any copy num. | 68 | 2 | 5 | 19 | 39 | 65 (95.59%) | 3 (4.41%) |
| | Total (‘functional’) | | | | | 1007 | 123 | 133 | 271 | 310 | 837 (83.12%) | 170 (16.88%) |
| 5 | Intronic repeats only | SERV > 0.8 | Repeat length ≤ 500 bp | Unit ≥ 2 bp | ≥ 15 copies | 3431 | filtered out | filtered out | 516 | 24 | 540 (15.74%) | 2891 (84.26%) |
| 6 | Intergenic repeats only | SERV > 1 | Repeat length ≤ 500 bp | Unit ≥ 2 bp | ≥ 15 copies | 4126 | | | 440 | 20 | 460 (11.15%) | 3666 (88.85%) |
| | Total (‘unknown significance’) | | | | | 7557 | | | 956 | 44 | 1000 (13.23%) | 6557 (86.77%) |
| | Total (all) | | | | | 8564 | 123 | 133 | 1227 | 354 | 1837 (21.45%) | 6727 (78.55%) |
  1. SERV ≥ 1 corresponds to high predicted variability.
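
The subtotal and total rows of Table 1 follow from the per-group probe counts, and the arithmetic can be cross-checked with a short script. The following is a minimal sketch in Python using values transcribed from the table. The group labels and the `summarize` helper are illustrative names and do not come from the paper. The 1-probe and 2-probe counts for intronic and intergenic repeats are assumed to be zero (the table marks them as filtered out for intronic repeats and leaves them blank for intergenic repeats).

```python
# Per-group counts transcribed from Table 1:
# (N, targets covered by 1 probe, 2 probes, 3 probes, 4 probes).
# Group labels are shorthand chosen for this sketch, not names from the paper.
groups = {
    "coding": (368, 36, 37, 74, 158),
    "regulatory_top": (181, 23, 36, 78, 12),
    "regulatory_lower": (390, 62, 55, 100, 101),
    "xlid_additional": (68, 2, 5, 19, 39),
    # Assumption: 1- and 2-probe designs contribute zero targets here.
    "intronic": (3431, 0, 0, 516, 24),
    "intergenic": (4126, 0, 0, 440, 20),
}

FUNCTIONAL = ["coding", "regulatory_top", "regulatory_lower", "xlid_additional"]
UNKNOWN = ["intronic", "intergenic"]


def summarize(names):
    """Return (N, targeted, not included, percent targeted) for the given groups."""
    n = sum(groups[g][0] for g in names)
    targeted = sum(sum(groups[g][1:]) for g in names)
    return n, targeted, n - targeted, round(100 * targeted / n, 2)


functional = summarize(FUNCTIONAL)
unknown = summarize(UNKNOWN)
overall = summarize(FUNCTIONAL + UNKNOWN)

for label, row in [("functional", functional),
                   ("unknown significance", unknown),
                   ("all", overall)]:
    print(label, row)
```

Under these assumptions, the computed rows agree with the "Total" rows of the table: 837 of 1007, 1000 of 7557, and 1837 of 8564 repeats targeted.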