
Practical Bioinformatics, Mark Voorhies, 4/16/2018



  1. Practical Bioinformatics Mark Voorhies 4/16/2018 Mark Voorhies Practical Bioinformatics

  2. JavaTreeView link-out for ENSEMBL Mouse: http://www.ensembl.org/Mus_musculus/Gene/Summary?g=HEADER

  3. Science!

  4. Example Pipeline: Overview

  5. Example Pipeline: Overview. [Flow diagram. Stage labels: Generate Samples; Transfer & Archival; Pre-process; Genome Coverage; Transcriptome Profile; Differential Expression; Annotation/Analysis; Paper (publish). Timescale label: ~2.5-4 years.]

  6. Example Pipeline: Overview. [Same diagram, adding a second timescale label: ~1 day.]

  7. Example Pipeline: Overview. [Same diagram, adding a Follow-up Experiments stage.]

  8. Example Pipeline: Overview. [Same labels as slide 7.]

  9. Example Pipeline: Overview

  10. Example Pipeline: Details

  11. GSE88801 Pipelines

  12. EM: Expectation Maximization. [Figure: online EM algorithm for estimated counts. Panel labels: Input (target sequences; fragment and sequence); Align to target references; Augmented alignment file; Get next read pair; Calculate assignment probabilities; Update masses; Update parameters (error probabilities, bias); Output (relative abundances, estimated counts, effective counts).] Roberts and Pachter, Nature Methods 10:71
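To make the expectation-maximization idea on this slide concrete, here is a minimal sketch in Python. It is a plain batch EM on toy data, not the online variant shown in the figure, and it ignores fragment lengths, error probabilities, and bias. The function name `em_abundances`, the `compat` input format, and the toy reads are all hypothetical.

```python
def em_abundances(compat, n_targets, n_iter=100):
    """Batch EM for relative abundances from ambiguous read assignments.

    compat[r] lists the indices of the targets that read r aligns to
    (a hypothetical input format chosen for this sketch).
    """
    n_reads = len(compat)
    rho = [1.0 / n_targets] * n_targets  # start from uniform abundances
    counts = [0.0] * n_targets
    for _ in range(n_iter):
        counts = [0.0] * n_targets
        # E-step: split each read among its compatible targets,
        # in proportion to the current abundance estimates.
        for targets in compat:
            total = sum(rho[t] for t in targets)
            for t in targets:
                counts[t] += rho[t] / total
        # M-step: abundances are the normalized expected counts.
        rho = [c / n_reads for c in counts]
    return rho, counts


# Toy data: 6 reads unique to target 0, 2 unique to target 1,
# and 2 reads that align equally well to both.
toy_reads = [[0]] * 6 + [[1]] * 2 + [[0, 1]] * 2
rho, counts = em_abundances(toy_reads, n_targets=2)
```

On this toy input the ambiguous reads end up split 3:1, matching the ratio of the uniquely assigned reads, so the estimated abundances settle at 0.75 and 0.25.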

  13. Abundance estimation with kallisto

        export transcriptome="GRCm38_all_mRNA"
        while read i; do
            export jobname="${i}.${transcriptome}.fr"
            kallisto quant -i "${transcriptome}.idx" \
                -t 4 --single --fr-stranded -l 250 -s 50 -o "${jobname}" "${i}_1.fastq.gz" \
                > "${jobname}.log" \
                2> "${jobname}.err"
        done < sample_names.txt

  14. Linear Least Squares

  15. Linear Least Squares: $b_i = \frac{y_i}{\sigma_i}$

  16. Linear Least Squares: $A_{ij} = \frac{f_j(x_i)}{\sigma_i}$

  17. Linear Least Squares: $\chi^2 = |A \cdot a - b|^2$

  18. Linear Least Squares: $a = \sum_{i=1}^{M} \left( \frac{U_i \cdot b}{s_i} \right) V_i$
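The least-squares slides above can be sketched with NumPy's SVD. This is one possible reading of the slides, assuming that U, s, and V come from the singular value decomposition of the design matrix A. The function name `svd_least_squares`, the cutoff for discarding tiny singular values, and the example data are assumptions made for illustration.

```python
import numpy as np


def svd_least_squares(x, y, sigma, basis):
    """Fit y ~ sum_j a_j * f_j(x) by minimizing chi^2 = |A.a - b|^2."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    # Design matrix A_ij = f_j(x_i) / sigma_i and vector b_i = y_i / sigma_i.
    A = np.column_stack([f(x) / sigma for f in basis])
    b = y / sigma
    # Thin SVD: A = U @ diag(s) @ Vt; the columns of U and rows of Vt
    # play the roles of U_i and V_i on the slide.
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    # Assumed cutoff: skip directions with negligible singular values.
    keep = s > 1e-12 * s.max()
    # a = sum_i (U_i . b / s_i) V_i
    a = Vt[keep].T @ ((U[:, keep].T @ b) / s[keep])
    chi2 = float(np.sum((A @ a - b) ** 2))
    return a, chi2


# Hypothetical data lying exactly on y = 1 + 2x, fit with basis {1, x}.
basis = [lambda x: np.ones_like(x), lambda x: x]
x = [0.0, 1.0, 2.0, 3.0]
y = [1.0, 3.0, 5.0, 7.0]
sigma = [1.0, 1.0, 2.0, 2.0]
a, chi2 = svd_least_squares(x, y, sigma, basis)
```

Because the example data are noise-free, the fitted coefficients recover the intercept 1 and slope 2, and the residual chi-squared is zero up to rounding.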

  19. Multiple Hypothesis Testing http://xkcd.com/882/
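The linked comic tests 20 jelly bean colors at p < 0.05 and reports the one that comes up significant. A small sketch of the arithmetic behind that joke follows, assuming the tests are independent. The function names are hypothetical, and Bonferroni correction is shown only as one common remedy; the slide itself does not name a method.

```python
def familywise_error_rate(n_tests, alpha=0.05):
    """Chance of at least one false positive among independent null tests."""
    return 1.0 - (1.0 - alpha) ** n_tests


def bonferroni_threshold(n_tests, alpha=0.05):
    """Per-test threshold that keeps the family-wise rate at or below alpha."""
    return alpha / n_tests


n_colors = 20
uncorrected = familywise_error_rate(n_colors)
per_test = bonferroni_threshold(n_colors)
corrected = familywise_error_rate(n_colors, alpha=per_test)
```

With 20 independent tests and no correction, the chance of at least one spurious "significant" result is roughly 0.64. Testing each color at 0.05 / 20 = 0.0025 brings the family-wise rate back under 0.05.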

  20. Final Homework: Implement Needleman-Wunsch global alignment with zero gap opening penalties. Try attacking the problem in this order:
      1. Initialize and fill in a dynamic programming matrix by hand (e.g., try reproducing the example from my slides on paper).
      2. Write a function to create the dynamic programming matrix and initialize the first row and column.
      3. Write a function to fill in the rest of the matrix.
      4. Rewrite the initialize and fill steps to store pointers to the best sub-solution for each cell.
      5. Write a backtrace function to read the optimal alignment from the filled-in matrix.

      If that isn’t enough to keep you occupied, try implementing Smith-Waterman local alignment and/or non-zero gap opening penalties.
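The homework steps can be sketched in Python as follows. This is one possible outline rather than a reference solution. The scoring values (match +1, mismatch -1, gap -1 per position with no opening penalty), the function name `needleman_wunsch`, and the tie-breaking order are assumptions and may differ from the example in the slides.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-1):
    """Global alignment with a linear gap cost (zero gap opening penalty)."""
    n, m = len(a), len(b)
    # Create the matrices and initialize the first row and column.
    score = [[0] * (m + 1) for _ in range(n + 1)]
    ptr = [[None] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0], ptr[i][0] = i * gap, "up"
    for j in range(1, m + 1):
        score[0][j], ptr[0][j] = j * gap, "left"
    # Fill the rest, storing a pointer to the best sub-solution per cell.
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            choices = [
                (score[i - 1][j - 1] + sub, "diag"),
                (score[i - 1][j] + gap, "up"),
                (score[i][j - 1] + gap, "left"),
            ]
            # Ties go to the earliest entry: diag, then up, then left.
            score[i][j], ptr[i][j] = max(choices, key=lambda c: c[0])
    # Backtrace from the bottom-right cell to the origin.
    top, bottom = [], []
    i, j = n, m
    while i > 0 or j > 0:
        move = ptr[i][j]
        if move == "diag":
            top.append(a[i - 1])
            bottom.append(b[j - 1])
            i, j = i - 1, j - 1
        elif move == "up":
            top.append(a[i - 1])
            bottom.append("-")
            i -= 1
        else:
            top.append("-")
            bottom.append(b[j - 1])
            j -= 1
    return score[n][m], "".join(reversed(top)), "".join(reversed(bottom))


best, row_a, row_b = needleman_wunsch("GATTACA", "GATACA")
```

Here `best` is the optimal score, and `row_a` and `row_b` are the two gapped rows of one optimal alignment. Several alignments can share the best score, so which one is returned depends on the tie-breaking order.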
