Statistical Computing: Biostatistics 615/815

Biostatistics 615/815 - Lecture 1: Statistical Computing. Hyun Min Kang, January 6th, 2011.


  1. Statistical Computing (Biostatistics 615/815 - Lecture 1)
     Hyun Min Kang, January 6th, 2011
     Sections: Overview, Syllabus, Algorithms, Sorting, Recursion, Implementation, Summary

  2. Objectives
     • Understanding computational aspects of statistical methods.
       - Estimate the computational time and memory required.
       - Understand how the method scales with data size.
     • Learning practical skills for efficient implementation of methods.
       - Determine the appropriate data structure for implementation.
       - Make use of existing libraries when useful.
       - Implement one’s own library or routine when necessary.
     • Developing an algorithmic perspective for improving analytic methods.
       - Approximation algorithms for computationally intractable problems.
       - Computational improvement of existing methods.
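The first objective, estimating time and memory and seeing how they scale with data size, can be made concrete with a back-of-the-envelope sketch. The two helper functions below are hypothetical names chosen for illustration and are not from the lecture. They assume a method that compares every pair of n observations once and stores a dense n-by-n matrix of doubles.

```cpp
#include <cassert>
#include <cstdint>

// Number of unordered pairs among n observations: n(n-1)/2.
// Any method that compares every pair once does this much work,
// so doubling n roughly quadruples the running time.
std::uint64_t pairwiseComparisons(std::uint64_t n) {
    assert(n <= 1000000000ULL);  // keep the product within 64 bits
    return n < 2 ? 0 : n * (n - 1) / 2;
}

// Bytes needed to store a dense n-by-n matrix of doubles.
std::uint64_t denseMatrixBytes(std::uint64_t n) {
    assert(n <= 1000000000ULL);  // keep the product within 64 bits
    return n * n * sizeof(double);
}
```

For example, `pairwiseComparisons(1000)` is 499,500 while `pairwiseComparisons(2000)` is 1,999,000, about four times as many. On a platform where `double` is 8 bytes, `denseMatrixBytes(10000)` is 800,000,000 bytes (roughly 800 MB), which is the kind of estimate worth making before running a method on real data.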

  5. Why Study Statistical “Computing”?
     • Statistical methods need to “compute” from data.
       - Need to understand computation for better interpretation of the results.
     • Computational efficiency is critical for large-scale data analysis.
       - In genomic data analysis, more accurate methods are often not used in practice due to prohibitive computational cost.
       - Many algorithms work “in principle” but are almost impossible to run with large-scale data, because their time complexity grows exponentially with data size.
     • Many statistical methods require “optimization” or “randomization”.
       - Logistic regression
       - Maximum-likelihood estimation
       - Bootstrapping
       - Markov-chain Monte Carlo (MCMC) methods
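As an illustration of a method built on “randomization”, here is a minimal sketch of bootstrapping the standard error of a sample mean. It is not taken from the lecture. The function names, the number of resamples `B`, and the `seed` parameter are assumptions made for this example.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <random>
#include <vector>

// Mean of a non-empty sample.
double sampleMean(const std::vector<double>& x) {
    assert(!x.empty());
    double sum = 0.0;
    for (double v : x) sum += v;
    return sum / static_cast<double>(x.size());
}

// Bootstrap estimate of the standard error of the mean:
// resample with replacement B times, recompute the mean each time,
// and return the standard deviation of those B means.
double bootstrapStdErrorOfMean(const std::vector<double>& x,
                               std::size_t B, unsigned seed) {
    assert(!x.empty() && B >= 2);
    std::mt19937 rng(seed);
    std::uniform_int_distribution<std::size_t> pick(0, x.size() - 1);

    std::vector<double> means(B);
    std::vector<double> resample(x.size());
    for (std::size_t b = 0; b < B; ++b) {
        for (std::size_t i = 0; i < x.size(); ++i) resample[i] = x[pick(rng)];
        means[b] = sampleMean(resample);
    }

    const double grand = sampleMean(means);
    double ss = 0.0;
    for (double m : means) ss += (m - grand) * (m - grand);
    return std::sqrt(ss / static_cast<double>(B - 1));
}
```

The cost is proportional to B times the sample size, so the computational questions raised above (how long will it take, how does it scale) apply directly: a call such as `bootstrapStdErrorOfMean(data, 2000, 42)` performs 2000 full passes over the data.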

  8. What Will Be Covered?
     1. Algorithms 101
        • Computational Time Complexity
        • Sorting
        • Divide and Conquer Algorithms
        • Searching
        • Key Data Structures
        • Dynamic Programming
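To give a flavor of how several of these topics connect (time complexity, sorted data, searching, and repeatedly halving a problem), here is a small binary search sketch. It is an illustrative example rather than code from the course, and the function name is an assumption.

```cpp
#include <cstddef>
#include <vector>

// Returns the index of `key` in the ascending-sorted vector `a`,
// or -1 if `key` is absent. Each iteration halves the search range,
// so the number of iterations grows like log2(n) rather than n.
long binarySearch(const std::vector<int>& a, int key) {
    std::size_t lo = 0, hi = a.size();  // search the half-open range [lo, hi)
    while (lo < hi) {
        std::size_t mid = lo + (hi - lo) / 2;
        if (a[mid] == key) return static_cast<long>(mid);
        if (a[mid] < key) lo = mid + 1;  // key can only be to the right
        else hi = mid;                   // key can only be to the left
    }
    return -1;
}
```

With a sorted vector of one million elements, this loop runs at most about 20 times, whereas a linear scan may need up to one million comparisons.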

  9. What Will Be Covered?
     2. Matrices and Numerical Methods
        • Matrix decomposition (LU, QR, SVD)
        • Implementation of Linear Models
        • Numerical optimization

  10. What Will Be Covered?
      3. Advanced Statistical Methods
         • Hidden Markov Models
         • Expectation-Maximization
         • Markov-Chain Monte Carlo (MCMC) Methods

  11. Textbooks
      Required Textbook
      • “Introduction to Algorithms”
        - by Cormen, Leiserson, Rivest, and Stein (CLRS)
        - Third Edition, MIT Press, 2009
      Optional Textbooks
      • “Numerical Recipes”
        - by Press, Teukolsky, Vetterling, and Flannery
        - Third Edition, Cambridge University Press, 2007
      • “C++ Primer Plus”
        - by Stephen Prata
        - Fifth Edition, Sams, 2004

  13. Assignments
      BIOSTAT615
      • Weekly Assignments - 50%
      • Midterm Exam - 20%
      • Final Exam - 30%
      BIOSTAT815
      • Weekly Assignments - 33%
      • Midterm Exam - 14%
      • Final Exam - 20%
      • Projects, to be completed in pairs - 33%
