
Using DNA from many samples to distinguish pedigree relationships of close relatives. Amy L. Williams (@amythewilliams), February 24, 2020, Family History Technology Workshop.


  1. Using DNA from many samples to distinguish pedigree relationships of close relatives. Amy L. Williams (@amythewilliams). February 24, 2020. Family History Technology Workshop

  2. Massive datasets: Many close relatives / small pedigrees. Dataset sizes shown: >100,000 samples, >9 million samples, ~500,000 samples, >14 million samples. In a dataset with $n$ individuals, there are $\binom{n}{2} = \frac{n(n-1)}{2} = \mathcal{O}(n^2)$ pairs
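To give a sense of scale for the pair counts on this slide, here is a minimal Python sketch that evaluates $n(n-1)/2$ for the round sample sizes quoted. The slide gives only bounds ("more than" or "about"), so the printed counts are illustrative rather than exact dataset figures; the function name `num_pairs` is made up for this example.

```python
from math import comb

def num_pairs(n: int) -> int:
    """Number of unordered pairs among n individuals: n(n-1)/2."""
    return n * (n - 1) // 2

# Round figures from the slide; the real datasets are only bounded by these.
for n in (100_000, 500_000, 9_000_000, 14_000_000):
    assert num_pairs(n) == comb(n, 2)  # cross-check against math.comb
    print(f"n = {n:>10,}: {num_pairs(n):,} pairs")
```

The quadratic growth is the practical point: going from 100,000 to 14 million samples multiplies the number of pairs by roughly 20,000.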

  3. Goal: detect and reconstruct pedigrees using only DNA …

  4. Signal: Identical by descent (IBD) sharing
     • Close (and some distant) relatives share large regions identical by descent (IBD)
       - Represented here as same color
     • Each generation, parents transmit a random ½ of their genome to children
       ⇒ Relatives separated by $M$ generations share an average of $1/2^M$ of their genome
     • Average IBD sharing fractions:
       - Full siblings: 50%, Aunt-nephew: 25%, First cousins: 12.5%

  5. Second degree relatives: all share ~25% of genome IBD. Types: grandparent-grandchild (GP), avuncular (AV), half-sibling (HS). ⇒ Difficult to distinguish using only data from the pairs

  6. IBD sharing rates for these relationships heavily overlap

  7. Idea: analyze IBD sharing of pair to other relatives

  8. CREST: Classification of Relationship Types (Ying Qiao, Jens Sannerud)

  9. Approach: ratios of IBD sharing in three samples versus two. $R_1 = \frac{\mathrm{length}(\mathrm{IBD}(x_1,y) \cap \mathrm{IBD}(x_2,y))}{\mathrm{length}(\mathrm{IBD}(x_1,y))}$, $R_2 = \frac{\mathrm{length}(\mathrm{IBD}(x_1,y) \cap \mathrm{IBD}(x_2,y))}{\mathrm{length}(\mathrm{IBD}(x_2,y))}$. For GP, expect $R_1 = 1/4$, $R_2 = 1$. [Pedigree diagram with samples $x_1$, $x_2$, $y$] (Ying Qiao)

  10. Approach: ratios of IBD sharing in three samples versus two. $R_1 = \frac{\mathrm{length}(\mathrm{IBD}(x_1,y) \cap \mathrm{IBD}(x_2,y))}{\mathrm{length}(\mathrm{IBD}(x_1,y))}$, $R_2 = \frac{\mathrm{length}(\mathrm{IBD}(x_1,y) \cap \mathrm{IBD}(x_2,y))}{\mathrm{length}(\mathrm{IBD}(x_2,y))}$. For GP, expect $R_1 = 1/4$, $R_2 = 1$. For AV, expect $R_1 = 1/4$, $R_2 = 1/2$. [Pedigree diagram with samples $x_1$, $x_2$, $y$] (Ying Qiao)

  11. Approach: ratios of IBD sharing in three samples versus two. $R_1 = \frac{\mathrm{length}(\mathrm{IBD}(x_1,y) \cap \mathrm{IBD}(x_2,y))}{\mathrm{length}(\mathrm{IBD}(x_1,y))}$, $R_2 = \frac{\mathrm{length}(\mathrm{IBD}(x_1,y) \cap \mathrm{IBD}(x_2,y))}{\mathrm{length}(\mathrm{IBD}(x_2,y))}$. For GP, expect $R_1 = 1/4$, $R_2 = 1$. For AV, expect $R_1 = 1/4$, $R_2 = 1/2$. For HS, expect $R_1 = 1/2$, $R_2 = 1/2$. [Pedigree diagram with samples $x_1$, $x_2$, $y$] (Ying Qiao)
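The two ratios can be illustrated with a small interval computation. The following is a minimal sketch, not CREST's implementation: it assumes IBD segments are given as `(start, end)` positions on a single chromosome, ignores which haplotype carries each segment, and uses made-up toy segments. The helper names (`merge`, `intersect`, `length`, `sharing_ratios`) are hypothetical.

```python
from typing import List, Tuple

Segments = List[Tuple[float, float]]  # (start, end) positions on one chromosome

def merge(segs: Segments) -> Segments:
    """Sort intervals and merge overlaps so lengths are not double counted."""
    merged: Segments = []
    for start, end in sorted(segs):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged

def intersect(a: Segments, b: Segments) -> Segments:
    """Intersection of two interval sets via a two-pointer sweep."""
    a, b = merge(a), merge(b)
    out: Segments = []
    i = j = 0
    while i < len(a) and j < len(b):
        lo, hi = max(a[i][0], b[j][0]), min(a[i][1], b[j][1])
        if lo < hi:
            out.append((lo, hi))
        # Advance whichever interval ends first.
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return out

def length(segs: Segments) -> float:
    return sum(end - start for start, end in merge(segs))

def sharing_ratios(ibd_x1_y: Segments, ibd_x2_y: Segments) -> Tuple[float, float]:
    """R1 and R2 as defined on the slide; both inputs must be non-empty."""
    if length(ibd_x1_y) == 0 or length(ibd_x2_y) == 0:
        raise ValueError("both samples need some IBD sharing with y")
    shared = length(intersect(ibd_x1_y, ibd_x2_y))
    return shared / length(ibd_x1_y), shared / length(ibd_x2_y)

# Toy segments (invented), chosen to mimic the GP expectation R1 = 1/4, R2 = 1.
r1, r2 = sharing_ratios([(0, 40), (60, 100)], [(10, 30)])
print(f"R1 = {r1:.2f}, R2 = {r2:.2f}")  # prints R1 = 0.25, R2 = 1.00
```

In real data the observed ratios scatter around these expectations, which is why the next slide turns to a density-based classifier rather than exact matching.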

  12. CREST uses kernel density estimators to infer relationships. Trained kernel density estimators (KDEs) using simulated data. Features: $R_1$, $R_2$

  13. Can combine multiple relatives ($y_j$'s) by taking union of IBD sharing: $R_i = \frac{\mathrm{length}\left(\bigcup_j \mathrm{IBD}(x_1,y_j) \,\cap\, \bigcup_j \mathrm{IBD}(x_2,y_j) \,\cap\, \mathrm{IBD}(x_1,x_2)\right)}{\mathrm{length}\left(\bigcup_j \mathrm{IBD}(x_i,y_j)\right)}$
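The multi-relative version can be sketched with a breakpoint sweep: split the chromosome at every segment endpoint, then test each elementary piece for membership in the relevant sets. This is only an illustration under simplifying assumptions (one chromosome, positional intervals, invented toy segments), and `multi_relative_ratio` and `covered` are hypothetical names, not part of CREST.

```python
from typing import List, Tuple

Segments = List[Tuple[float, float]]  # (start, end) positions on one chromosome

def covered(segs: Segments, lo: float, hi: float) -> bool:
    """True if the elementary piece [lo, hi) lies inside some segment."""
    return any(s <= lo and hi <= e for s, e in segs)

def multi_relative_ratio(ibd_x1_ys: List[Segments],
                         ibd_x2_ys: List[Segments],
                         ibd_x1_x2: Segments,
                         i: int) -> float:
    """R_i with unions over relatives y_j; i must be 1 or 2."""
    # Flattening the per-relative lists represents the union over j.
    u1 = [seg for segs in ibd_x1_ys for seg in segs]
    u2 = [seg for segs in ibd_x2_ys for seg in segs]
    # Every endpoint is a breakpoint, so each piece is fully in or out of a segment.
    points = sorted({p for seg in u1 + u2 + ibd_x1_x2 for p in seg})
    num = den = 0.0
    for lo, hi in zip(points, points[1:]):
        in1, in2 = covered(u1, lo, hi), covered(u2, lo, hi)
        if in1 and in2 and covered(ibd_x1_x2, lo, hi):
            num += hi - lo
        if (in1 if i == 1 else in2):
            den += hi - lo
    if den == 0:
        raise ValueError("x_i shares no IBD with any y_j")
    return num / den

# Toy input (invented): two mutual relatives y_1 and y_2.
x1_ys = [[(0, 20)], [(10, 40)]]
x2_ys = [[(0, 10)], [(30, 50)]]
x1_x2 = [(0, 35)]
print(multi_relative_ratio(x1_ys, x2_ys, x1_x2, 1))  # 0.375
print(multi_relative_ratio(x1_ys, x2_ys, x1_x2, 2))  # 0.5
```

The sweep is quadratic in the number of segments, which is acceptable for a sketch; a production version would merge and intersect sorted intervals instead.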

  14. CREST highly sensitive, highly specific. Ran PADRE, CREST on 200 replicates of various pedigree structures. [Figure legend: CREST, PADRE] Qiao, Sannerud et al. (in revision, 2019)

  15. CREST infers relative types in Generation Scotland data. Generation Scotland data: 205 GP, 1,949 AV, and 121 HS pairs with at least one mutual relative. Given data equivalent to one first cousin (10% of genome covered by IBD regions), CREST's sensitivity is 0.99 in GP, 0.86 in AV, and 0.95 in HS pairs. Qiao, Sannerud et al. (in revision, 2019)

  16. Secondary aim: infer whether relatives are paternal or maternal. [Figure panels: Paternal, Maternal; Grandparent, Half-siblings]

  17. Key insight: males / females have different crossover locations. [Plot: female rate (cM/Mb) and male rate (cM/Mb) versus physical position (Mb); data from human chromosome 10] Average number of crossovers: • Females: 2.04 • Males: 1.27. Genetic map from Bhérer et al. (2017)

  18. CREST infers maternal / paternal type in Generation Scotland. Analyzed all 848 GP and 381 HS pairs in Generation Scotland. Using $\mathrm{LOD} = 0$ as boundary, inferred correctly: • 99.7% of HS • 93.5% of GP. [Plot panels: Half-siblings, Grandparent-grandchild] Qiao, Sannerud et al. (in revision, 2019)

  19. Conclusions
      • CREST classifies second degree relationship types
        - Enabled by multi-way IBD sharing
      • Male / female crossovers reveal the paternal / maternal type of half-siblings and grandparent-grandchild pairs
      • Can apply to pedigree reconstruction: other methods subject to ambiguities for second degree pairs
      • Preliminary results indicate CREST also applies to third degree pairs

  20. Acknowledgements: Generation Scotland; Caroline Hayward; Archie Campbell; Ying Qiao; Jens Sannerud; Nancy E. and Peter C. Meinig

  21. Approach: IBD segment ends approximate crossover locations
      • Model IBD segments as regions flanked by two crossovers
        - No-crossover interval: interior of IBD segment, $i_{\mathrm{int}}$
        - Locations of crossovers: windows $w_0$, $w_1$ surrounding IBD segment ends
      • For each IBD segment $i$, likelihood of parent being $S \in \{F, M\}$ is $P(i \mid S) = P(w_0 \mid S) \cdot P(i_{\mathrm{int}} \mid S) \cdot P(w_1 \mid S)$
      • Taking all IBD segments to be independent, we compute $\mathrm{LOD} = \log_{10} \frac{\prod_i P(i \mid F)}{\prod_i P(i \mid M)}$
      (Jens Sannerud)
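The LOD computation above can be sketched as follows. The slide does not say how the three per-segment factors are evaluated, so this example assumes a Poisson crossover model: the probability of no crossover in an interval of genetic length $d$ Morgans is $e^{-d}$, and the probability of at least one crossover in a window of length $w$ is $1 - e^{-w}$. It also assumes each segment's window and interior lengths have already been measured on the two sex-specific maps indexed by `F` and `M`. All numbers and names (`log10_segment_prob`, `lod`) are invented for illustration.

```python
import math
from typing import Dict, List, Tuple

# (w0, interior, w1): genetic lengths in Morgans, all assumed > 0.
Lengths = Tuple[float, float, float]

def log10_segment_prob(w0: float, interior: float, w1: float) -> float:
    """log10 P(i | S) under the assumed Poisson crossover model."""
    p_w0 = 1.0 - math.exp(-w0)    # at least one crossover in the left window
    p_int = math.exp(-interior)   # no crossover inside the segment
    p_w1 = 1.0 - math.exp(-w1)    # at least one crossover in the right window
    return math.log10(p_w0) + math.log10(p_int) + math.log10(p_w1)

def lod(segments: List[Dict[str, Lengths]]) -> float:
    """Sum of per-segment log10 likelihood ratios, F over M (independence assumed)."""
    return sum(log10_segment_prob(*seg["F"]) - log10_segment_prob(*seg["M"])
               for seg in segments)

# Toy segments: lengths of the same physical regions on the F map and the M map.
toy = [
    {"F": (0.02, 0.30, 0.02), "M": (0.01, 0.50, 0.01)},
    {"F": (0.03, 0.25, 0.01), "M": (0.02, 0.20, 0.02)},
]
score = lod(toy)
print(f"LOD = {score:.3f} ->", "F favored" if score > 0 else "M favored")
```

Summing log10 ratios per segment is equivalent to the ratio of products on the slide and avoids numerical underflow when many segments are combined. The sign of the result relative to the $\mathrm{LOD} = 0$ boundary gives the classification.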
