Discovery of Genomic Structural Variations with Next-Generation Sequencing Data
Advanced Topics in Computational Genomics
Slides from Marcel H. Schulz, Tobias Rausch (EMBL), and Kai Ye (Leiden University)
[Figure: evidence types, shown against the reference]
Mate-pair or paired-end mapping abnormalities
Read depth signals
Split-read alignments
Unmapped or single-anchored reads
Local assembly
courtesy of Tobias Rausch (EMBL)
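The paired-end mapping signal listed above can be illustrated with a minimal sketch. It assumes a library with a known mean and standard deviation of the mapped span between mates. The function name `classify_pair`, the cutoff of three standard deviations, and the example numbers are all hypothetical and not taken from the slides. Orientation abnormalities (for example, inversions) are not handled here.

```python
def classify_pair(span, lib_mean, lib_sd, cutoff=3.0):
    """Label one read pair by how far apart its mates map on the reference.

    span     -- distance between the mapped mates on the reference
    lib_mean -- expected span for this library (assumed known)
    lib_sd   -- standard deviation of the span (assumed known)
    cutoff   -- number of standard deviations tolerated (hypothetical default)
    """
    if span > lib_mean + cutoff * lib_sd:
        # Mates map farther apart than the library allows:
        # the donor is probably missing sequence between them.
        return "deletion candidate"
    if span < lib_mean - cutoff * lib_sd:
        # Mates map closer together than expected:
        # the donor probably carries extra sequence between them.
        return "insertion candidate"
    return "concordant"


# Hypothetical library: mean span 200 bp, standard deviation 20 bp.
spans = [198, 205, 412, 96]
labels = [classify_pair(s, lib_mean=200, lib_sd=20) for s in spans]
print(labels)
```

In practice a single abnormal pair is weak evidence; callers usually require several pairs that support the same event before reporting it.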
Insertions Deletions
Korbel et al. (2007) Lee et al. (2009)
Chiang et al. (2009)
Xie et al. (2009)
Chiang et al. (2009): human cancer cell lines compared to normal cell lines (SegSeq algorithm; no fixed window size; multiple change-point method)
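The read-depth comparison of a tumor sample against a normal sample can be sketched as follows. This is a deliberately simplified illustration that uses fixed windows, which is not what SegSeq does (the slide notes that SegSeq has no fixed window size and uses change points). The function name `log2_ratios`, the pseudocount of 0.5, and the toy read positions are assumptions made for this example.

```python
import math


def log2_ratios(tumor_pos, normal_pos, chrom_len, window):
    """Return one log2(tumor/normal) read-depth ratio per fixed window.

    tumor_pos, normal_pos -- 0-based read start positions on one chromosome
    chrom_len             -- chromosome length in bases
    window                -- window size in bases (fixed-window simplification)
    """
    n_win = (chrom_len + window - 1) // window
    tumor_counts = [0] * n_win
    normal_counts = [0] * n_win
    for p in tumor_pos:
        tumor_counts[p // window] += 1
    for p in normal_pos:
        normal_counts[p // window] += 1

    # Rescale the tumor counts so both samples have the same total depth.
    scale = len(normal_pos) / len(tumor_pos)

    # A pseudocount of 0.5 (an assumption) avoids division by zero.
    return [
        math.log2((t * scale + 0.5) / (n + 0.5))
        for t, n in zip(tumor_counts, normal_counts)
    ]


# Toy data: the tumor sample has doubled coverage in the middle window.
normal = list(range(0, 300, 10))
tumor = list(range(0, 100, 10)) + list(range(100, 200, 5)) + list(range(200, 300, 10))
ratios = log2_ratios(tumor, normal, chrom_len=300, window=100)
print([round(r, 2) for r in ratios])
```

A positive ratio suggests a copy-number gain in the tumor and a negative ratio a loss, relative to the sample-wide average after depth normalization.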
[Figure: donor compared with reference] Medvedev et al. (2009)
Ye et al. (2009) How to do that?
Ye et al. (2009)
① Use the 3′ end of the left read as anchor point
② Use pattern growth to search for minimum and maximum unique substrings from the 3′ end of the unmapped read (search range <= 2x insert size)
String matching by pattern growth
[Figure: minimum unique substring and maximum unique substring; example sequences not recoverable]
courtesy of Kai Ye (Leiden U.)
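The idea of growing a pattern until it becomes unique can be sketched as below. This is a minimal illustration, not Pindel's implementation: it grows the pattern from the start of the query string, scans the whole reference rather than a restricted search range, and ignores strands and sequencing errors. The function name `pattern_growth` and the toy sequences are hypothetical.

```python
def pattern_growth(query, reference):
    """Grow a prefix of `query` one base at a time against `reference`.

    Returns (min_len, max_len, start), where
      min_len -- shortest prefix that occurs exactly once in the reference
      max_len -- longest prefix that still matches at that unique location
      start   -- 0-based start of the unique match in the reference
    or None if no prefix of the query is unique.
    """
    candidates = list(range(len(reference)))  # every position is a candidate
    min_len = max_len = start = None
    for k, base in enumerate(query):
        # Keep only candidates whose next base agrees with the grown pattern.
        candidates = [
            i for i in candidates
            if i + k < len(reference) and reference[i + k] == base
        ]
        if not candidates:
            break  # pattern can no longer be extended anywhere
        if len(candidates) == 1:
            if min_len is None:
                min_len, start = k + 1, candidates[0]  # first unique prefix
            max_len = k + 1  # still unique, keep growing
    if min_len is None:
        return None
    return min_len, max_len, start


reference = "ACGTACGGTTA"
print(pattern_growth("ACGGTTC", reference))  # (4, 6, 4)
```

In this toy case "ACG" occurs twice in the reference, "ACGG" is the minimum unique substring, and the match can be extended to "ACGGTT" before the next base disagrees.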
Ye et al. (2009), deletions
① Use the 3′ end of the left read as anchor point
② Use pattern growth to search for minimum and maximum unique substrings from the 3′ end of the unmapped read (search range <= 2x insert size)
③ Use pattern growth to search for minimum and maximum unique substrings from the 5′ end of the unmapped read (search range: read length + Max_D), starting from the end mapped in step ②
④ Check whether the complete unmapped read can be reconstructed by combining the 3′ and 5′ end substring matches
Ye et al. (2009), insertions
① Use the 3′ end of the left read as anchor point
② Use pattern growth to search for minimum and maximum unique substrings from the 3′ end of the unmapped read (search range <= 2x insert size)
③ Use pattern growth to search for minimum and maximum unique substrings from the 5′ end of the unmapped read (search range: read length - 1), starting from the end mapped in step ②
④ Check whether the complete unmapped read can be reconstructed by combining the 3′ and 5′ end substring matches
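The steps above can be sketched for the deletion case as follows. This is a simplified illustration under stated assumptions, not Pindel's code: it has no anchor read, it searches the whole reference instead of the ranges given in steps ② and ③, it works on one strand only, and it tolerates no mismatches. The helper names `grow` and `find_deletion` and the toy sequences are hypothetical.

```python
def grow(query, reference):
    """Pattern growth: (min_len, max_len, start) of the unique prefix, or None."""
    candidates = list(range(len(reference)))
    min_len = max_len = start = None
    for k, base in enumerate(query):
        candidates = [
            i for i in candidates
            if i + k < len(reference) and reference[i + k] == base
        ]
        if not candidates:
            break
        if len(candidates) == 1:
            if min_len is None:
                min_len, start = k + 1, candidates[0]
            max_len = k + 1
    return None if min_len is None else (min_len, max_len, start)


def find_deletion(read, reference):
    """Return the deleted reference interval (start, end), 0-based half-open,
    supported by splitting `read` into two uniquely matching parts, or None."""
    # Grow one part from the left end of the read ...
    left = grow(read, reference)
    # ... and the other part from the right end, by reversing both strings.
    right = grow(read[::-1], reference[::-1])
    if left is None or right is None:
        return None
    lmin, lmax, lpos = left
    rmin, rmax, rpos_rev = right

    # A split after s bases is valid if both parts are uniquely matched:
    # lmin <= s <= lmax and rmin <= len(read) - s <= rmax.
    lo = max(lmin, len(read) - rmax)
    hi = min(lmax, len(read) - rmin)
    if lo > hi:
        return None  # the two parts cannot be combined into the full read
    s = lo  # take the leftmost valid split point

    del_start = lpos + s  # first deleted base
    # Convert the start in the reversed reference back to forward coordinates.
    del_end = len(reference) - rpos_rev - (len(read) - s)  # first base after
    if del_end <= del_start:
        return None  # parts are adjacent or overlap: no deletion
    return del_start, del_end


reference = "ACGTTGCAAGGCTCATCGGA"
read = "GTTGTCAT"  # reference[2:6] + reference[12:16], spanning a deletion
print(find_deletion(read, reference))  # (6, 12)
```

Here the read's left part "GTTG" and right part "TCAT" each match uniquely, and together they cover the whole read, so the bases between them, reference[6:12], are reported as deleted. A read taken from the intact reference yields None because the two parts are not separated by a gap.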
Ye et al. (2009)
Ye et al., Pindel manual
a) large deletion
b) tandem duplication
c) inversion
d-f) same as a-c with non-template sequence (yellow part)
Kai Ye, Marcel H. Schulz, Quan Long, Rolf Apweiler, and Zemin Ning. Pindel: a pattern growth approach to detect break points of large deletions and medium sized insertions from paired-end short reads. Bioinformatics (2009) 25(21): 2865-2871
Pindel homepage: https://trac.nbic.nl/pindel/
SplazerS homepage: http://www.seqan.de/projects/splazers.html