Quantitative and Computational skills
58I Lab and Prof Skills II
Overview of Q&C strand
Approach: what are Q&C / Data skills
How it fits with stage 1: Stage 1 and 2 Roadmaps
Fit with other strands
Stage 1 Revision
Overview of the session content
Why script? Why R?
Independent study ideas
Exposure to, and practice in, a variety of techniques.
Can’t cover everything you’ll ever need, but the topics are:
Foundational
Applicable to all
Universal
Highly transferable
=> in this module and beyond
1. To be able to generate a testable hypothesis.
2. To design and conduct experiments to test this hypothesis, with appropriate controls.
3. To have practical experience of a range of techniques relevant to the discipline.
4. To work effectively within a team.
5. To be able to write a scientific report based on practical work.
6. To communicate scientific information and ideas in the form of a variety of media to a variety of audiences.
7. To use appropriate graphical methods to produce data figures with appropriately detailed legends.
8. To use relevant statistical or other analytical methods to analyse data.
9. To research scientific literature in a given area, and write an extended and well-structured account.
[Figure: the data workflow (the mental model and the activity): Import, Tidy, Transform, Explore, Model, Report, Simulate, all done reproducibly; ‘statistics’ labelled ~15%]
Stage 1 roadmap (Introductory): Import, Tidy, Transform, Explore, Model, Report, all done reproducibly
From files - all but unusually complex: .txt, .xlsx, .csv, .sav, .dta; Relative paths; Separators; ..and more
Everything scripted; Code commenting; Organisation of analysis
What ‘tidy’ data are but little tidying
Changing variable names and types; Factor levels; Wide to long reshaping
Simple plots: histograms; Normality testing; Summary stats
Fundamental concepts in hypothesis testing; CI; Linear models (t-tests, ANOVA, regression); chi-sq; Mann-Whitney; Wilcoxon; Kruskal Wallis; correlation; Multiple comparison
Selection: Assumptions; Not really fit
“significance, direction, magnitude”; Figures: legends, saving; Not fully reproducibly
Little: ranking
Simulate
Abstraction
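As a minimal sketch of the ‘Wide to long reshaping’ and ‘Factor levels’ items above: the data values and the column names (control, treated, mass, treatment) are hypothetical, chosen only for illustration, and base R’s stack() is just one of several ways to reshape.

```r
# Hypothetical wide data: one column per treatment group
wide <- data.frame(
  control = c(4.1, 3.8, 4.4),
  treated = c(5.2, 5.9, 5.5)
)

# stack() puts all values in one column and the old column names in another
long <- stack(wide)
names(long) <- c("mass", "treatment")

# Set the factor levels explicitly so that 'control' is the reference level
long$treatment <- factor(long$treatment, levels = c("control", "treated"))

str(long)
```

The long format has one row per observation, which is the layout that lm() and most plotting functions expect.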
Autumn: numeracy, early data skills
Spring: main teaching block, primarily data analysis
Summer: reinforcement and development
Stage 2 roadmap (Introductory, Intermediate): Import, Tidy, Transform, Explore, Model, Report, all done reproducibly
Identification and removal of outliers and NA
Stage 1 tests in LM framework (increased conceptual complexity); More LM; GLM - Binomial and Poisson; Odds ratios; Deviance measures of fit; More on Multiple comparisons; Non-linear regression; ~Mixed models; FDR; GWAS
Proportions; Z score standardisation; Coefficient of variation; Log to base 2; Subtraction of noise/background; Scaling/reversing experimental steps; PCR Relative quantification; RPKM quantification
Identifying non-independence and pseudoreplication in experimental design
Multi panel figures; Complex domain specific viz; Volcano plots
Simulate
Abstraction; Running and interpreting particular models
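Several of the Stage 2 items (Z score standardisation, coefficient of variation, log to base 2, subtraction of background) are one-line calculations in R. The following is a rough sketch only; every number and variable name in it is invented for illustration.

```r
# Hypothetical measurements (invented for illustration)
x <- c(12, 15, 9, 20, 14)

# Z score standardisation: centre on the mean, scale by the standard deviation
z <- (x - mean(x)) / sd(x)

# Coefficient of variation, expressed as a percentage of the mean
cv <- sd(x) / mean(x) * 100

# Subtraction of background: remove a blank reading from each raw signal
raw <- c(0.52, 0.61, 0.48)
blank <- 0.05
corrected <- raw - blank

# Log to base 2 of a ratio: a four-fold increase gives a log2 fold change of 2
control <- 50
treated <- 200
log2_fc <- log2(treated / control)
```

After standardisation the values have mean 0 and standard deviation 1, which makes variables measured on different scales comparable.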
Emma Rand
[Figure: the same workflow (Import, Tidy, Transform, Explore, Model, Report, Simulate, all done reproducibly) repeated at Introductory, Intermediate and Advanced levels]
Response variable (dependent variable, the ‘y’s): something we measure
can be explained by
Predictor variables (independent variable(s), the ‘x’s): some things we control, choose or set
Relationship: what relationship links the predictors to the response? Linear?
Type of response variable: continuous or discrete? If continuous, normal?
can be explained by
Predictor variables: number (one, or more?) and type (continuous? categories?)
Relationship: what relationship links the predictors to the response? Linear?
Why those analyses?
Type of response variable: continuous or discrete? If continuous, normal?
Predictor variables:
Number: one or two
Continuous: regression
Categories: t-tests and ANOVA
From Bolker, 2007
Spring Quant & Comp Experimental Design
1hr lecture - Introduction (ER)
2hr Workshop - Revision and thinking about analysis before experimental design (ER)
2x 2hr Workshop - Data analysis - Building from linear models to Generalised linear models (ER)
2hr Workshop - Visualising data (ER)
2hr Drop-in
2hr Workshop - Problems in time (JWP)
2hr Drop-ins for each Bioscience Technique strand
Autumn Bioscience techniques
A bit different from last year: No lectures!
Independent study in the form of:
Prior learning: Slides + Short recordings
Workshop: Workbook. You are not expected to do all of the workbook examples. Choose the examples from each section that best match your biological interests.
Independent study: Practice, anything! More advanced examples, other workshop examples? Examples from last year? RBloggers?
Something we measure can be explained by some things we control, choose or set.
Something we measure: think about data in the broadest sense. How do we ensure we get ‘good’ data?
What and how to measure
Reliability
Precision
Transformations, normalisations
Do you need statistics?
Limits of interpretation
Independence of data points
From Bolker, 2007
By following the slides and applying the techniques to selected examples from the workbook, the successful student will be able to:
Slides: outline concepts about experimental design, non-independence, pseudoreplication and data comparability
Workbook: practice in recognising non-independence and pseudoreplication and in applying ‘normalisation’
Experimental Design and Bioscience Techniques: practice in designing experiments and analysing and presenting results
t-tests, ANOVA and regression are Linear Models!
Revisit tests in the framework of the General Linear Model, which is more extendable.
By following the slides and applying the techniques to selected examples from the workbook, the successful student will be able to apply and interpret the lm() function.
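To illustrate the claim that a t-test is a linear model, here is a minimal sketch comparing t.test() with lm() on the same data. The data frame, its values and the variable names (mass, treatment) are hypothetical, and the equal-variance form of the t-test is assumed because that is the version lm() reproduces.

```r
# Hypothetical data: masses for two groups (values invented for illustration)
dat <- data.frame(
  mass = c(4.1, 3.8, 4.4, 4.0, 4.3, 5.2, 5.9, 5.5, 5.1, 5.6),
  treatment = factor(rep(c("control", "treated"), each = 5))
)

# Classical two-sample t-test, assuming equal variances
tt <- t.test(mass ~ treatment, data = dat, var.equal = TRUE)

# The same comparison expressed as a linear model
mod <- lm(mass ~ treatment, data = dat)
coefs <- summary(mod)$coefficients

# The 'treatmenttreated' row tests the difference between the group means
p_lm <- coefs["treatmenttreated", "Pr(>|t|)"]
p_tt <- tt$p.value

# Both approaches give the same p-value
all.equal(p_lm, p_tt)
```

The intercept estimates the mean of the reference group (control) and the treatmenttreated coefficient estimates the difference between the two group means. The t statistics have the same size but opposite sign, because t.test() reports control minus treated.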
By following the slides and applying the techniques to selected examples from the workbook, the successful student will be able to apply and interpret the glm() function for when the response variable is not continuous but a count or a binary outcome.
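As a hedged sketch of what this looks like in practice, the example below fits a Poisson GLM to counts and a binomial GLM to a binary outcome. Both data sets, and the names n_offspring, diet, survived and dose, are invented for illustration and are not from the workbook.

```r
# Hypothetical count response: number of offspring under two diets
counts <- data.frame(
  n_offspring = c(2, 3, 1, 4, 2, 6, 8, 5, 7, 9),
  diet = factor(rep(c("low", "high"), each = 5), levels = c("low", "high"))
)

# Poisson GLM (log link) for a count response
mod_pois <- glm(n_offspring ~ diet, family = poisson, data = counts)

# Coefficients are on the log scale; exponentiate to get a ratio of mean counts
rate_ratio <- exp(coef(mod_pois))["diethigh"]

# Hypothetical binary response: survival (1) or death (0) at increasing dose
surv <- data.frame(
  survived = c(0, 0, 1, 0, 1, 1, 1, 0, 1, 1),
  dose = 1:10
)

# Binomial GLM (logit link) for a binary response
mod_bin <- glm(survived ~ dose, family = binomial, data = surv)

# Exponentiating the slope gives the odds ratio per unit increase in dose
odds_ratio <- exp(coef(mod_bin))["dose"]

summary(mod_pois)
summary(mod_bin)
```

The summary() output reports null and residual deviance rather than an R-squared, and the estimates must be back-transformed with exp() before they are interpreted on the original scale.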
Experiments (tests of ideas)
Explanatory variables: choose / set / manipulate
Response variables: measure
Experimental design, Analyse, Visualise, Interpret and report
Reproducibly: scripting
Reproducibly: protocol, lab book
allows them to slide gradually into programming
Opportunities to express competency in Experimental Design and Bioscience Techniques (and elsewhere)
Make it fun.
Practice and engage with people.
The workshops are not a test. It is expected that you will make a lot of mistakes and need help.
Talk to each other, demonstrators and lecturers.
“There are two ways to write error free code and only the third way works”
You can optionally stretch yourself by asking questions in class, creating additional figures, or doing ‘More advanced examples’ in some cases.
Revise: 17C and 8C, on the VLE or http://www-users.york.ac.uk/~er13/
Variable types, data structures
Revision lecture L10 notes: any familiarisation will help
Play: Datacamp
Put R and RStudio on your own PC/Mac
#biol58I
RBloggers
https://buzzrbeeline.blog/