Advanced Methods in Applied Statistics
Christian Starup & Loui Wentzel
Niels Bohr Institute
March 8, 2018

Journal Article
https://doi.org/10.1093/bioinformatics/btw438
Problem
Given a dataset for which several or many P-values must be calculated, should one account for a possible correlation between the data variables?
No Correlation solution
If the P-values are not correlated, then under $H_0$ each P-value is uniformly distributed, and the product of the P-values follows the distribution of a product of $N$ uniform numbers:

$$P = \int_0^{\prod_i P_i} \frac{(-1)^{N-1}}{(N-1)!} \ln(u)^{N-1}\, du \tag{1}$$

This is equivalent to a $\chi^2$-test with $2k$ degrees of freedom (where $k = N$ is the number of P-values), called Fisher's method:

$$\Psi = -2 \sum_{i=1}^{N} \log(P_i) \tag{2}$$

$$P = \varphi_{2k}(\Psi) = \int_{\Psi}^{\infty} \chi^2_{2k}(x)\, dx \tag{3}$$
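As a rough illustration of equations (1) to (3), the sketch below combines independent P-values with Fisher's method using only the Python standard library. It relies on the closed form of the $\chi^2$ upper tail for an even number of degrees of freedom, which is why no statistics package is needed. The function names are illustrative and do not come from the article.

```python
import math


def fisher_statistic(p_values):
    """Psi = -2 * sum(log(P_i)), as in equation (2)."""
    return -2.0 * sum(math.log(p) for p in p_values)


def chi2_upper_tail_even(x, k):
    """Upper tail of a chi-square distribution with 2k degrees of freedom.

    For even degrees of freedom the integral in equation (3) has the
    closed form exp(-x/2) * sum_{i<k} (x/2)^i / i!.
    """
    half = x / 2.0
    term = 1.0
    total = 1.0
    for i in range(1, k):
        term *= half / i
        total += term
    return math.exp(-half) * total


def fisher_combined_p(p_values):
    """Combined P-value for independent tests (k = number of P-values)."""
    return chi2_upper_tail_even(fisher_statistic(p_values), len(p_values))


if __name__ == "__main__":
    print(fisher_combined_p([0.05, 0.05]))
```

With a single P-value the function returns that P-value unchanged, which is a quick sanity check that equations (1) and (3) agree.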
Correlation solution
However, if the data are correlated, the P-values are no longer independent and $\Psi$ no longer follows a $\chi^2_{2k}$ distribution. Brown therefore extended Fisher's method with a re-scaling factor $c$, such that $\Psi \sim c\,\chi^2_{2f}$:

$$f = \frac{E[\Psi]^2}{\mathrm{Var}[\Psi]}, \qquad c = \frac{\mathrm{Var}[\Psi]}{2E[\Psi]} = \frac{k}{f}, \qquad \mathrm{Var}[\Psi] = 4k + 2\sum_{i<j} \mathrm{cov}(W_i, W_j)$$

Here $W_i = -2\log(P_i)$ and $E[\Psi] = 2k$ (assuming a $\chi^2$ distribution); $2k$ is the number of degrees of freedom in Fisher's method and $2f$ the re-scaled number in Brown's method. The combined P-value is then

$$P_{\text{combined}} = 1 - \Phi_{2f}(\Psi / c)$$

with $\Psi = \sum_i W_i$ and $\Phi_{2f}$ the cumulative distribution function of $\chi^2_{2f}$.
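The re-scaling can be sketched in a few lines. The helper below (a hypothetical name, not taken from the article) turns the pairwise covariances $\mathrm{cov}(W_i, W_j)$ into Brown's $c$ and $f$ using the relations above, under the stated assumption $E[\Psi] = 2k$.

```python
def brown_scale(k, pair_covariances):
    """Return (c, f) for Brown's method.

    k                -- number of combined P-values
    pair_covariances -- cov(W_i, W_j) for every pair i < j
    """
    expected = 2.0 * k                                # E[Psi] = 2k
    variance = 4.0 * k + 2.0 * sum(pair_covariances)  # Var[Psi]
    f = expected ** 2 / variance                      # f = E[Psi]^2 / Var[Psi]
    c = variance / (2.0 * expected)                   # c = Var[Psi] / (2 E[Psi])
    return c, f


if __name__ == "__main__":
    # Four P-values with no covariance: c = 1 and f = k, i.e. Fisher's method.
    print(brown_scale(4, [0.0] * 6))
```

Evaluating $1 - \Phi_{2f}(\Psi / c)$ needs a $\chi^2$ distribution with a non-integer number of degrees of freedom; one option, assuming SciPy is available, would be `scipy.stats.chi2.sf(psi / c, 2 * f)`.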
Correlation solution continued
The article's contribution to Brown's method is to calculate the covariance matrix by an empirical approximation, hence the name Empirical Brown's Method (EBM):

$$\mathrm{cov}(W_i, W_j) \approx \mathrm{cov}(w_i, w_j), \qquad w_i = -2\log\bigl(1 - F(\vec{x}_i)\bigr)$$

The EBM is a non-parametric approach, where $F(\vec{x}_i)$ is the right-sided empirical cumulative distribution function. Kost's method instead approximates the covariance with a polynomial in $\rho_{ij}$:

$$\mathrm{cov}(W_i, W_j) \approx 3.263\,\rho_{ij} + 0.710\,\rho_{ij}^2 + 0.027\,\rho_{ij}^3$$
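Kost's polynomial is simple enough to check by hand. The sketch below evaluates it and sums it over all pairs to obtain $\mathrm{Var}[\Psi]$. It assumes that $\rho_{ij}$ is the correlation between data variables $i$ and $j$, passed in as a square matrix; the function names are illustrative.

```python
from itertools import combinations


def kost_cov(rho):
    """Kost's polynomial approximation of cov(W_i, W_j)."""
    return 3.263 * rho + 0.710 * rho ** 2 + 0.027 * rho ** 3


def var_psi_kost(corr):
    """Var[Psi] = 4k + 2 * sum_{i<j} cov(W_i, W_j) from a k x k correlation matrix."""
    k = len(corr)
    pair_sum = sum(kost_cov(corr[i][j]) for i, j in combinations(range(k), 2))
    return 4.0 * k + 2.0 * pair_sum


if __name__ == "__main__":
    identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    print(var_psi_kost(identity))  # uncorrelated variables: 4k
```

A quick consistency check: at $\rho = 1$ the coefficients sum to 4, which is the variance of a single $W_i$ under a $\chi^2_2$ distribution.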
Simulating data
Parameters were $\mu_i = 0$, $a = 0.8$, $n = 4$. $b_j$ was randomly sampled from $[-0.5; 0.5]$. Each sample had 200 entries.

$$M = \begin{pmatrix} 1 & b_2 & \cdots & b_j & \cdots & b_n \\ b_2 & 1 & \cdots & a & \cdots & a \\ \vdots & \vdots & \ddots & \vdots & & \vdots \\ b_j & a & \cdots & 1 & \cdots & a \\ \vdots & \vdots & & \vdots & \ddots & \vdots \\ b_n & a & \cdots & a & \cdots & 1 \end{pmatrix} \tag{4}$$

To every sample $\vec{y}$ drawn from this distribution, $n$-dimensional uniform noise from $[-1; 1]$ was added:

$$\vec{x} = \vec{y} + \xi\,\vec{U} \tag{5}$$

They draw numbers from one axis of the multivariate normal distribution (axis 1, with correlations $b_j$ to the others) and test its correlation with the other axes using Pearson's correlation test.
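A minimal sketch of this simulation, using only the standard library, might look as follows. Two points are assumptions rather than details from the slides: the value of $\xi$ (here 0.5) is arbitrary, and $b_j$ is redrawn whenever the resulting $M$ is not positive definite, since not every draw from $[-0.5; 0.5]$ gives a valid covariance matrix when $a = 0.8$. All function names are illustrative.

```python
import math
import random


def build_cov(b, a=0.8):
    """Matrix M of equation (4): first row/column b_j, other off-diagonals a."""
    n = len(b) + 1
    m = [[a] * n for _ in range(n)]
    for i in range(n):
        m[i][i] = 1.0
    for j, bj in enumerate(b, start=1):
        m[0][j] = m[j][0] = bj
    return m


def cholesky(m):
    """Lower-triangular L with L L^T = m; raises if m is not positive definite."""
    n = len(m)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = m[i][j] - sum(low[i][t] * low[j][t] for t in range(j))
            if i == j:
                if s <= 0.0:
                    raise ValueError("matrix is not positive definite")
                low[i][j] = math.sqrt(s)
            else:
                low[i][j] = s / low[j][j]
    return low


def simulate(n=4, a=0.8, xi=0.5, entries=200, seed=1):
    """Draw `entries` noisy samples x = y + xi * U, with y ~ N(0, M)."""
    rng = random.Random(seed)
    while True:
        # Assumption: redraw b_j until M is a valid covariance matrix.
        b = [rng.uniform(-0.5, 0.5) for _ in range(n - 1)]
        try:
            low = cholesky(build_cov(b, a))
            break
        except ValueError:
            continue
    rows = []
    for _ in range(entries):
        z = [rng.gauss(0.0, 1.0) for _ in range(n)]
        y = [sum(low[i][t] * z[t] for t in range(i + 1)) for i in range(n)]
        rows.append([yi + xi * rng.uniform(-1.0, 1.0) for yi in y])
    return b, rows


if __name__ == "__main__":
    b, rows = simulate()
    print(b, len(rows), len(rows[0]))
```

Multiplying standard normal draws by the Cholesky factor of $M$ is one common way to sample a multivariate normal; a numerical library would typically provide this directly.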
Ground Truth P-values
To test the different methods on correlated data, a reference P-value is needed: the distribution of $\Psi$ that would result if $\vec{y}_1$ were uncorrelated with the other axes. It is estimated by permutation:
◮ Shuffle $\vec{y}_1$
◮ Calculate $\Psi^*$ as earlier
◮ Repeat $M$ times

The ground-truth P-value is then

$$P_{\text{ground}} = \frac{1}{M}\sum_{m=1}^{M} I(\Psi^*_m \geq \Psi) \tag{6}$$

Note that this limits the resolution of the ground-truth P-value to $1/M$.
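The permutation procedure can be sketched as below. To stay within the standard library, the test statistic is a stand-in (the sum of absolute Pearson correlations between axis 1 and the other axes) rather than the $\Psi$ built from Pearson test P-values; any callable with the same signature could be passed instead. All names are illustrative.

```python
import math
import random


def pearson_r(x, y):
    """Sample Pearson correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def abs_corr_sum(y1, others):
    """Stand-in statistic (assumption): sum of |r| between y1 and each other axis."""
    return sum(abs(pearson_r(y1, col)) for col in others)


def ground_truth_p(y1, others, statistic, n_perm=1000, seed=0):
    """Fraction of shuffled statistics that reach the observed one, equation (6)."""
    rng = random.Random(seed)
    observed = statistic(y1, others)
    shuffled = list(y1)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # break the link between axis 1 and the rest
        if statistic(shuffled, others) >= observed:
            hits += 1
    return hits / n_perm       # resolution is 1 / n_perm


if __name__ == "__main__":
    y1 = [float(i) for i in range(20)]
    others = [[2.0 * v + 1.0 for v in y1]]
    print(ground_truth_p(y1, others, abs_corr_sum, n_perm=200))
```

Because the result is a count divided by `n_perm`, a P-value below `1 / n_perm` can only be reported as zero, which is the resolution limit noted above.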
Performance results as a function of Signal to Noise ratio
