 
1 Quantitative and Computational skills
58I Lab and Prof Skills II
2 Overview
● Overview of Q&C strand
● Approach: what are Q&C / Data skills
● How it fits with stage 1: Stage 1 and 2 Roadmaps
● Fit with other strands
● Stage 1 Revision
● Overview of the session content
● Why script? Why R
● Independent study ideas
3 Overview of Q&C strand and fit with ED and BT
● Exposure to, and practice in, a variety of techniques
● Can’t cover everything you’ll ever need, but topics are: foundational, applicable to all, universal, highly transferable => in this module and beyond
● Option specific analyses -> option leaders
4 Learning Objectives
1. To be able to generate a testable hypothesis.
2. To design and conduct experiments to test this hypothesis, with appropriate controls.
3. To have practical experience of a range of techniques relevant to the discipline.
4. To work effectively within a team.
5. To be able to write a scientific report based on practical work.
6. To communicate scientific information and ideas in the form of a variety of media to a variety of audiences.
7. To use appropriate graphical methods to produce data figures with appropriately detailed legends.
8. To use relevant statistical or other analytical methods to analyse data.
9. To research scientific literature in a given area, and write an extended and well-structured account.
5 What are Data Skills: actions with data
● Simulate
● Explore
● Transform
● Tidy (the mental model and the activity)
● Model (‘statistics’, ~15%)
● Import
● Report
6 Reproducible actions with data
Reproducibly: Simulate, Explore, Transform, Tidy, Model, Import, Report
7 ROADMAP: Stage 1
MLO: Introductory. Abstraction.
● Reproducibly: everything scripted; code commenting; organisation of analysis
● Import: from files, all but unusually complex (.txt, .xlsx, .csv, .sav, .dta); relative paths; separators
● Tidy: what ‘tidy’ data are, but little tidying; changing variable names and types; factor levels; wide to long reshaping
● Transform: little (ranking); selection
● Explore: simple plots (histograms); normality testing; summary stats
● Simulate
● Model: fundamental concepts in hypothesis testing; CI; linear models (t-tests, ANOVA, regression); chi-sq, Mann-Whitney, Wilcoxon, Kruskal-Wallis, correlation; multiple comparison; assumptions; not really fit
● Report: “significance, direction, magnitude”; figures (legends, saving); not fully reproducibly
● ..and more
8 Stage 1
● Aut: numeracy, early data skills
● Spr: main teaching block, primarily data analysis
● Sum: reinforcement and development
9 Stage 2
10 Stage 2
MLO: Introductory, Intermediate
● Abstraction: running and interpreting particular models
● Transform: proportions; Z score standardisation; coefficient of variation; log to base 2; subtraction of noise/background; scaling/reversing experimental steps; PCR relative quantification; RPKM quantification
● Explore / Tidy: identification and removal of outliers and NA
● Model: Stage 1 tests in LM framework (increased conceptual complexity); more LM; GLM - Binomial and Poisson; odds ratios; deviance measures of fit; more on multiple comparisons; non-linear regression; ~mixed models; FDR in GWAS
● Report: multi panel figures; complex domain specific viz; volcano plots
● Identifying non-independence and pseudoreplication in experimental design
11 Roadmap across stages (Emma Rand): Stage 1, Stage 2, Stage 4; Introductory, Intermediate, Advanced. At every stage: Reproducibly Simulate, Explore, Transform, Tidy, Model, Import, Report.
12 Stage 1 Revision: experiments and analysis
Something we measure can be explained by some things we control, choose or set.
● Something we measure: Response variable; Dependent variable; the ‘y’s
● Can be explained by: Relationship
● Some things we control, choose or set: Predictor variables; Independent variable(s); the ‘x’s
function(y ~ x)
function(y ~ x1 * x2)
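The `function(y ~ x)` notation above is R's formula syntax. As a minimal sketch, assuming simulated data (the variable names `y`, `x1`, `x2` and all values are illustrative, not taken from the module), `lm()` can stand in for the generic `function`:

```r
# Simulated data: every name and value here is illustrative only
set.seed(1)
n  <- 40
x1 <- runif(n, 0, 10)                          # a continuous predictor we set
x2 <- factor(rep(c("a", "b"), each = n / 2))   # a categorical predictor we choose
y  <- 2 + 0.5 * x1 + ifelse(x2 == "b", 1, 0) + rnorm(n)  # the response we measure

# One predictor: y explained by x1
mod_one <- lm(y ~ x1)

# Two predictors and their interaction: x1 * x2 expands to x1 + x2 + x1:x2
mod_two <- lm(y ~ x1 * x2)

names(coef(mod_one))
names(coef(mod_two))
```

The response always sits to the left of `~` and the predictors to the right, whichever modelling function is used.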
13 Stage 1 Revision: Choice of analysis (test)?
Response variable can be explained by Predictor variables
● Response variable: type of variable. Continuous or discrete? If continuous, normal?
● Can be explained by: what relationship links the predictors to the response? Linear?
● Predictor variables: number (one? or more?); type (continuous? categories?)
function(y ~ x)
function(y ~ x1 * x2)
14 Revision: Stage 1 analyses (tests)
Response variable can be explained by Predictor variables
● Response variable: type of variable. Continuous or discrete? If continuous, normal?
● Can be explained by: what relationship links the predictors to the response? Linear?
● Predictor variables: number (one or two); continuous: regression; categories: t-tests and ANOVA
function(y ~ x)
function(y ~ x1 * x2)
15 Why those analyses?? From Bolker, 2007
16 Overview of Session content (Autumn, Spring)
● Experimental Design: 2hr Workshop - Problems in time (JWP)
● Bioscience techniques: 2hr Drop-ins for each Bioscience Technique strand
● Quant & Comp:
  1hr lecture - Introduction (ER)
  2hr Workshop - Revision and thinking about analysis before experimental design (ER)
  2x 2hr Workshop - Data analysis - Building from linear models to Generalised linear models (ER)
  2hr Workshop - Visualising data (ER)
  2hr Drop-in
17 Approach
A bit different from last year: No lectures!
● Independent study in the form of prior learning: Slides + Short recordings
● Workshop: Workbook. You are not expected to do all of the workbook examples. Choose the examples from each section that best match your biological interests.
● Independent study: Practice, anything! More advanced examples, other workshop examples? Examples from last year? R-bloggers?
18 W01: Revision and thinking about analysis before experimental design
Something we measure can be explained by some things we control, choose or set
19 W01: Revision and thinking about analysis before experimental design
Something we measure… Think about data in the broadest sense: how do we ensure we get ‘good’ data?
● What and how to measure
● Reliability
● Precision
● Transformations, normalisations
● Do you need statistics?
● Limits of interpretation
● Independence of data points
20 From Bolker, 2007
21 W01: Revision and thinking about analysis before experimental design
By following the slides and applying the techniques to select examples from the workbook the successful student will be able to:
● Recognise non-independence and pseudoreplication
● Select appropriately, and apply some methods to make data comparable
● Design experiments to take account of these
Slides: outline concepts about experimental design, non-independence, pseudoreplication and data comparability
Workbook: practice in recognising non-independence and pseudoreplication and in applying ‘normalisation’
Experimental Design and Bioscience Techniques: practice in designing experiments and analysing and presenting results
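As a rough illustration of "methods to make data comparable", the sketch below applies three of the transformations named in the Stage 2 roadmap (Z score standardisation, coefficient of variation, log to base 2). The vectors `expr_a` and `expr_b` and the helpers `z_score()` and `cv()` are hypothetical, written for this example only; they are not workbook data or functions.

```r
# Hypothetical measurements of the same samples on two very different scales
expr_a <- c(120, 150, 180, 210, 240)
expr_b <- c(1.2, 1.5, 1.8, 2.1, 2.4)

# Z score standardisation: subtract the mean, divide by the standard deviation
z_score <- function(v) (v - mean(v)) / sd(v)
z_a <- z_score(expr_a)
z_b <- z_score(expr_b)

# Coefficient of variation: spread relative to the mean, so it is unit-free
cv <- function(v) sd(v) / mean(v)

# Log to base 2: each doubling becomes a step of 1
doublings <- log2(c(1, 2, 4, 8))

all.equal(z_a, z_b)                  # the two scales agree after standardisation
all.equal(cv(expr_a), cv(expr_b))    # and have the same relative spread
```

Which method is appropriate depends on the measurement and the comparison being made; this only shows the mechanics.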
22 W02: Building from Linear Models to Generalised Linear Models Part 1
t-tests, ANOVA and regression are Linear Models! We revisit these tests in the framework of the General Linear Model, which is more extendable. We will learn to apply and interpret the lm() function.
By following the slides and applying the techniques to select examples from the workbook the successful student will be able to:
● Explain the link between t-tests, ANOVA and regression
● Appropriately apply linear models using lm()
● Interpret the results using summary() and anova() and relate them to the outputs of t.test() and aov()
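A minimal sketch of the link between t.test(), aov() and lm(), assuming simulated data (the data frame `dat` and its columns `group` and `mass` are invented for illustration). The equal-variance t-test is used because that is the version a linear model reproduces.

```r
# Simulated two-group data: names and values are illustrative only
set.seed(42)
dat <- data.frame(
  group = factor(rep(c("control", "treated"), each = 15)),
  mass  = c(rnorm(15, mean = 10, sd = 2), rnorm(15, mean = 12, sd = 2))
)

# Classic two-sample t-test, assuming equal variances
tt <- t.test(mass ~ group, data = dat, var.equal = TRUE)

# The same comparison expressed as a linear model
mod <- lm(mass ~ group, data = dat)
summary(mod)   # the 'grouptreated' row tests the difference between group means
anova(mod)     # F test; with two groups, F equals t squared

# aov() fits the same model and reports the same F
summary(aov(mass ~ group, data = dat))

# The p-values from the two routes match
p_lm <- summary(mod)$coefficients["grouptreated", "Pr(>|t|)"]
all.equal(unname(tt$p.value), p_lm)
```

The sign of t differs between the two outputs because t.test() reports control minus treated while lm() reports treated minus control; the magnitude and the p-value are the same.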
23 W03: Building from Linear Models to Generalised Linear Models Part 2
We will learn to apply and interpret the glm() function for when your response variable is not continuous but a count or a binary outcome.
By following the slides and applying the techniques to select examples from the workbook the successful student will be able to:
● Explain the link between the general linear model and the generalised linear model
● Recognise where a generalised linear model would be appropriate and apply glm()
● Determine which effects are significant using summary() and anova()
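A minimal sketch of glm() for the two response types mentioned, again assuming simulated data (`dose`, `count` and `survived` are hypothetical names and the coefficients used to simulate them are arbitrary). It uses the default link function for each family.

```r
# Simulated data: names and values are illustrative only
set.seed(7)
n    <- 60
dose <- runif(n, 0, 5)

# Count response: Poisson GLM (default log link)
count    <- rpois(n, lambda = exp(0.3 + 0.4 * dose))
mod_pois <- glm(count ~ dose, family = poisson)

# Binary response: binomial GLM (default logit link)
survived <- rbinom(n, size = 1, prob = plogis(-2 + 0.9 * dose))
mod_bin  <- glm(survived ~ dose, family = binomial)

summary(mod_pois)                 # Wald z tests for each coefficient
anova(mod_pois, test = "Chisq")   # analysis of deviance for each term
exp(coef(mod_bin)["dose"])        # odds ratio per unit increase in dose
```

The model formula is written exactly as for lm(); only the `family` argument changes to match the type of response.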
24 The rationale for scripting analysis
Experiments (tests of ideas)
● Experimental design: explanatory variables (choose / set / manipulate) and response variables (measure). Reproducibly: protocol, lab book
● Analyse, visualise, interpret and report. Reproducibly: scripting
25 Why R?
● R caters to users who do not see themselves as programmers, but then allows them to slide gradually into programming
● Community
● Language for data analysis
● Open source, Free
● Graphics
● Reproducibility
26 Assessment
Opportunities to express competency in Experimental Design and Bioscience Techniques (and elsewhere)
Becoming Competent
● Make it fun. Practise and engage with people.
● The workshops are not a test. It is expected that you make a lot of mistakes and need help.
● Talk to each other, demonstrators and lecturers.
● “There are two ways to write error-free code and only the third way works”
● You can optionally stretch yourself by asking questions in class, creating additional figures, or doing ‘More advanced examples’ in some cases