Lecture 2: Generalized Additive Models 1 / 34

Getting more out of GAMs. I can't teach you all of GAMs in 1 week. Good intro book (also a good textbook on GLMs and GLMMs). Quite technical in places. More resources on course website.

Fitting GAMs using dsm

Translating maths into R. $n_j = A_j \hat{p}_j \exp[\beta_0 + s(\text{y}_j)] + \epsilon_j$, where $\epsilon_j$ are some errors and $n_j \sim$ count distribution. Inside the link: formula=count ~ s(y). Response distribution: family=nb() or family=tw(). Detectability: ddf.obj=df_hr. Offset, data: segment.data=segs, observation.data=obs.

Your first DSM. dsm is based on mgcv by Simon Wood.

    library(dsm)
    dsm_x_tw <- dsm(count~s(x), ddf.obj=df,
                    segment.data=segs, observation.data=obs,
                    family=tw())

Plotting. plot(dsm_x_tw). Dashed lines indicate +/- 2 standard errors. Rug plot. On the link scale. EDF on $y$ axis.

Adding a term. Just use +.

    dsm_xy_tw <- dsm(count ~ s(x) + s(y), ddf.obj=df,
                     segment.data=segs, observation.data=obs,
                     family=tw())

Plotting. plot(dsm_xy_tw, pages=1).

Bivariate terms. Assumed an additive structure. No interaction. We can specify s(x,y) (and s(x,y,z,...)).

Bivariate spatial term.

    dsm_xyb_tw <- dsm(count ~ s(x, y), ddf.obj=df,
                      segment.data=segs, observation.data=obs,
                      family=tw())

Plotting. plot(dsm_xyb_tw, select=1, scheme=2, asp=1). On link scale. scheme=2 makes heatmap (set too.far to exclude points far from data).

Comparing bivariate and additive models. [Figure]

summary(dsm_x_tw), summary(dsm_xy_tw), summary(dsm_xyb_tw). Each summary reports a Tweedie family with log link and n = 949. Formulas: count ~ s(x) + offset(off.set); count ~ s(x) + s(y) + offset(off.set); count ~ s(x, y) + offset(off.set). The values below come from the three summaries; apart from the term names, which model each row belongs to is not recoverable.

    Family: Tweedie(p=1.306), Tweedie(p=1.326), Tweedie(p=1.29)

    Parametric coefficients:
                Estimate Std. Error t value Pr(>|t|)
    (Intercept) -20.0908     0.2381  -84.39   <2e-16 ***
    (Intercept) -19.8115     0.2277  -87.01   <2e-16 ***
    (Intercept) -20.2745     0.2477  -81.85   <2e-16 ***

    Approximate significance of smooth terms:
             edf Ref.df     F  p-value
    s(x)   4.943  6.057 3.224 0.004239 **
    s(x)   4.962  6.047 6.403 1.07e-06 ***
    s(x,y) 16.89  21.12 4.333 3.73e-10 ***
    s(y)   5.293  6.419 4.034 0.000322 ***

    R-sq.(adj) = 0.102   Deviance explained = 34.7%
    R-sq.(adj) = 0.0283  Deviance explained = 17.9%
    R-sq.(adj) = 0.0678  Deviance explained = 27.4%
    -REML = 394.86  Scale est. = 4.8248  n = 949
    -REML = 409.94  Scale est. = 6.0413  n = 949
    -REML = 399.84  Scale est. = 5.3157  n = 949

Let's have a go...

Overview. The count model, from scratch. What is a GAM? What is smoothing? Fitting GAMs using dsm. 2 / 34

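Building a model, from scratch. Know count $n_j$ in segment $j$. Want: $n_j = f([\text{environmental covariates}]_j)$. Additive model of smooths $s$: $n_j = \exp[\beta_0 + s(\text{y}_j) + s(\text{Depth}_j)]$, where $\beta_0 + s(\text{y}_j) + s(\text{Depth}_j)$ are the model terms and $\exp$ is the inverse of the log link function. 3 / 34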

Building a model, from scratch. What about area and detectability? $n_j = A_j \hat{p}_j \exp[\beta_0 + s(\text{y}_j) + s(\text{Depth}_j)]$, where $A_j$ is the area of segment $j$ (the "offset") and $\hat{p}_j$ is the probability of detection in segment $j$. 4 / 34
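As a rough illustration of why the area term is called an "offset": under a log link, multiplying the mean by segment area times detection probability is the same as adding the log of that product to the linear predictor with its coefficient fixed at 1. The sketch below uses made-up numbers; `area`, `p_hat` and `eta` are hypothetical and are not taken from the course data.

```r
# Hypothetical values for three segments (illustration only)
area  <- c(2.5e6, 3.0e6, 1.8e6)   # segment areas A_j
p_hat <- c(0.40, 0.40, 0.35)      # estimated detection probabilities
eta   <- c(-14.2, -13.9, -14.6)   # linear predictor: beta_0 + s(y_j) + s(Depth_j)

# Multiplicative form: A_j * p_hat_j * exp(eta_j)
n_mult <- area * p_hat * exp(eta)

# Offset form: exp(log(A_j * p_hat_j) + eta_j)
off_set  <- log(area * p_hat)
n_offset <- exp(off_set + eta)

# The two columns agree up to floating-point error
print(cbind(n_mult, n_offset))
```

This is consistent with the `offset(off.set)` term that appears in the formulas reported by `summary()` for the dsm fits.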

Building a model, from scratch. It's a statistical model so: $n_j = A_j \hat{p}_j \exp[\beta_0 + s(\text{y}_j) + s(\text{Depth}_j)] + \epsilon_j$. $n_j$ has a distribution (count). $\epsilon_j$ are residuals (differences between model and observations). 5 / 34

That's a Generalized Additive Model! 6 / 34

Now let's look at each bit... 7 / 34

Response. $n_j = A_j \hat{p}_j \exp[\beta_0 + s(\text{y}_j) + s(\text{Depth}_j)] + \epsilon_j$, where $n_j \sim$ count distribution. 8 / 34

Count distributions. Response is a count. Often, it's mostly zero. mean $\neq$ variance (Poisson isn't good at this). 9 / 34

Tweedie distribution. $\text{Var}(\text{count}) = \phi\, \mathbb{E}(\text{count})^q$. Poisson is $q = 1$. We estimate $q$ and $\phi$. (NB there is a point mass at zero not plotted.) 10 / 34

Negative binomial distribution. $\text{Var}(\text{count}) = \mathbb{E}(\text{count}) + \kappa\, \mathbb{E}(\text{count})^2$. Estimate $\kappa$. (Poisson: $\text{Var}(\text{count}) = \mathbb{E}(\text{count})$.) 11 / 34
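To make the three mean-variance relationships concrete, here is a small sketch that tabulates the variance implied by each distribution at a few mean values. The parameter values (`phi`, `q`, `kappa`) and the function names are hypothetical choices for illustration, not estimates from the fitted models.

```r
# Variance as a function of the mean for each count distribution
var_poisson <- function(mu) mu
var_tweedie <- function(mu, phi, q) phi * mu^q
var_negbin  <- function(mu, kappa) mu + kappa * mu^2

mu <- c(0.5, 1, 5, 20)   # a few example mean counts

# Hypothetical parameter values (illustration only)
phi   <- 5
q     <- 1.3
kappa <- 0.5

tab <- data.frame(
  mean    = mu,
  poisson = var_poisson(mu),
  tweedie = var_tweedie(mu, phi, q),
  negbin  = var_negbin(mu, kappa)
)
print(tab)
```

Setting `phi = 1, q = 1` in the Tweedie function, or `kappa = 0` in the negative binomial one, gives back the Poisson relationship where the variance equals the mean.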

Smooths. $n_j = A_j \hat{p}_j \exp[\beta_0 + s(\text{y}_j) + s(\text{Depth}_j)] + \epsilon_j$ 12 / 34

What about these "s" things? Think $s$ = smooth. Want a line that is "close" to all the data. Balance between interpolation and "fit". 13 / 34

What is smoothing? 14 / 34

Smoothing. We think underlying phenomenon is smooth: "Abundance is a smooth function of depth". 1, 2 or more dimensions. 15 / 34

Estimating smooths. We set: "type": bases (made up of basis functions); "maximum wigglyness": basis size (sometimes: dimension/complexity). Automatically estimate: "how wiggly it needs to be": smoothing parameter(s). 16 / 34

Splines. Functions made of other, simpler functions. Basis functions $b_k$, estimate $\beta_k$: $s(x) = \sum_{k=1}^{K} \beta_k b_k(x)$. 17 / 34
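The "sum of basis functions" idea can be sketched in a few lines. This example uses the cubic B-spline basis from the base `splines` package as a stand-in (mgcv, which dsm builds on, defaults to thin plate regression splines instead), and the coefficients in `beta` are made up rather than estimated from data.

```r
library(splines)

x <- seq(0, 1, length.out = 101)
K <- 6   # basis size: number of basis functions

# Each column of B is one basis function b_k evaluated at x
B <- bs(x, df = K, intercept = TRUE)

# Hypothetical coefficients beta_1, ..., beta_K (normally estimated from data)
beta <- c(0.2, 1.5, -0.5, 2.0, 0.3, -1.0)

# The smooth as a matrix product: s(x) = sum_k beta_k * b_k(x)
s_x <- as.vector(B %*% beta)

# The same smooth written as an explicit sum over basis functions
s_loop <- rep(0, length(x))
for (k in seq_len(K)) {
  s_loop <- s_loop + beta[k] * B[, k]
}
```

Calling `matplot(x, B, type = "l")` and then `plot(x, s_x, type = "l")` shows the individual basis functions and the smooth they add up to.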

Measuring wigglyness. Visually: lots of wiggles $\Rightarrow$ not smooth; straight line $\Rightarrow$ very smooth. 18 / 34
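One common way to put a number on this visual idea is the integrated squared second derivative of the function; penalties of this kind are widely used for splines, though the slide itself does not say which measure is meant. The sketch below approximates it numerically with finite differences. The helper `wigglyness()` and the two test functions are hypothetical names introduced for illustration.

```r
# Approximate the integral of f''(x)^2 over [lower, upper]
# using second-order finite differences on a fine grid
wigglyness <- function(f, lower = 0, upper = 1, n = 2001) {
  x  <- seq(lower, upper, length.out = n)
  h  <- x[2] - x[1]
  d2 <- diff(f(x), differences = 2) / h^2   # estimate of the second derivative
  sum(d2^2) * h                             # Riemann-sum approximation
}

straight <- function(x) 1 + 2 * x           # straight line: very smooth
wiggly   <- function(x) sin(10 * pi * x)    # lots of wiggles: not smooth

w_straight <- wigglyness(straight)
w_wiggly   <- wigglyness(wiggly)

print(c(straight = w_straight, wiggly = w_wiggly))
```

The straight line scores essentially zero, while the oscillating function scores very high, which matches the visual intuition.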

How wiggly are things? Set basis complexity or "size" $k$. Fitted smooths have effective degrees of freedom (EDF). Set $k$ "large enough". 19 / 34
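A minimal sketch of how basis size and EDF relate, using mgcv directly on simulated data rather than dsm on survey data. The data, the seed, the Poisson response and the two values of `k` are all assumptions made for the example; in a real DSM the family would more likely be `tw()` or `nb()`.

```r
library(mgcv)

# Simulated counts whose mean is a smooth function of x (illustration only)
set.seed(123)
n   <- 300
x   <- runif(n)
dat <- data.frame(x = x, count = rpois(n, exp(1 + sin(2 * pi * x))))

# Same model with a small and a generous basis size
fit_k5  <- gam(count ~ s(x, k = 5),  family = poisson(), data = dat, method = "REML")
fit_k20 <- gam(count ~ s(x, k = 20), family = poisson(), data = dat, method = "REML")

# Effective degrees of freedom of the smooth term in each fit
edf_k5  <- sum(summary(fit_k5)$edf)
edf_k20 <- sum(summary(fit_k20)$edf)

print(c(k5 = edf_k5, k20 = edf_k20))
```

For a single smooth, `k` caps the EDF at `k - 1`, and the smoothing parameter then decides how much of that is used. If the EDF sits well below the cap, `k` was probably "large enough"; an EDF close to the cap suggests refitting with a larger `k`.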
