Lecture 3: Multivariate smoothing 1 / 37

The story so far... 2 / 37
How GAMs work
How to include detection info
Simple spatial-only models

Life isn't that simple 3 / 37
Which environmental covariates?
Which response distribution?
Which response?
How to select between possible models?

Adding covariates 4 / 37

Model formulation 5 / 37
Pure spatial, pure environmental, mixed?
Prior knowledge of biology/ecology of species
What are drivers of distribution?
What data is available?

Sperm whale covariates 6 / 37

Tobler's first law of geography 7 / 37
"Everything is related to everything else, but near things are more related than distant things"
Tobler (1970)

Implications of Tobler's law 8 / 37

Adding smooths 9 / 37
Already know that + is our friend
Can build a big model...
    dsm_all <- dsm(count~s(x, y) +
                         s(Depth) +
                         s(DistToCAS) +
                         s(SST) +
                         s(EKE) +
                         s(NPP),
                   ddf.obj=df_hr,
                   segment.data=segs, observation.data=obs,
                   family=tw())

Each s() has its own options 10 / 37
s(..., k=...) to adjust basis size
s(..., bs="...") for basis type
lots more options (we'll see a few here)
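As a rough illustration of the two options named on this slide, the sketch below fits the same model twice with mgcv (the package that provides s()), once with the defaults and once with a different basis type and a larger basis size. The data come from mgcv's gamSim() simulator, not from the sperm whale survey, and the object names are made up for this example.

```r
library(mgcv)

set.seed(1)
# Simulated count data standing in for real segment data (assumption:
# gamSim example 1, which has covariates x0..x3 and response y)
dat <- gamSim(1, n = 400, dist = "poisson", scale = 0.1, verbose = FALSE)

# Defaults: thin plate regression spline basis (bs = "tp"), default basis size
m_default <- gam(y ~ s(x1) + s(x2), data = dat,
                 family = poisson, method = "REML")

# bs= changes the basis type (here a cubic regression spline for x1),
# k= raises the basis size (here for x2)
m_custom <- gam(y ~ s(x1, bs = "cr") + s(x2, k = 20), data = dat,
                family = poisson, method = "REML")

# k only sets the maximum complexity; the penalty decides the EDF actually used
edf_default <- summary(m_default)$edf
edf_custom  <- summary(m_custom)$edf
print(edf_default)
print(edf_custom)
```

If the EDF of a term sits close to its maximum (k minus one for a univariate smooth), that is usually a hint to try a larger k.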

Now we have a huge model, what do we do? 11 / 37

Term selection 12 / 37
Two popular approaches (using p-values)
Stepwise selection - path dependence
All possible subsets - computationally expensive (fishing?)

p-values 13 / 37
Test for zero effect of a smooth
They are approximate for GAMs (but useful)
Reported in summary

summary(dsm_all) 14 / 37
##
## Family: Tweedie(p=1.25)
## Link function: log
##
## Formula:
## count ~ s(x, y) + s(Depth) + s(DistToCAS) + s(SST) + s(EKE) +
##     s(NPP) + offset(off.set)
##
## Parametric coefficients:
##             Estimate Std. Error t value Pr(>|t|)
## (Intercept) -20.6368     0.2751     -75   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Approximate significance of smooth terms:
##                edf Ref.df     F  p-value
## s(x,y)       5.225  7.153 1.233   0.2920
## s(Depth)     3.568  4.439 6.641 1.82e-05 ***
## s(DistToCAS) 1.000  1.000 1.504   0.2204
## s(SST)       5.927  6.986 2.068   0.0407 *
## s(EKE)       1.763  2.225 2.579   0.0693 .
## s(NPP)       2.393  3.068 0.856   0.4678
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
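The smooth-term table printed by summary() can also be pulled out as a matrix, which makes it easier to apply a significance rule consistently rather than reading stars by eye. This is a minimal sketch, assuming mgcv is installed and using simulated data from gamSim() in place of dsm_all. The same accessor would be expected to work on a fitted dsm object, since dsm builds on mgcv, but that is an assumption and is not shown here. The threshold alpha and the object names are illustrative.

```r
library(mgcv)

set.seed(2)
# Simulated data (assumption: gamSim example 1; x3 has no true effect there)
dat <- gamSim(1, n = 400, dist = "poisson", scale = 0.1, verbose = FALSE)

m <- gam(y ~ s(x0) + s(x1) + s(x2) + s(x3), data = dat,
         family = poisson, method = "REML")

# s.table holds one row per smooth: edf, Ref.df, test statistic, p-value
st <- summary(m)$s.table

# Apply a fixed significance level as the rule (illustrative choice)
alpha <- 0.05
not_significant <- st[st[, "p-value"] > alpha, , drop = FALSE]
print(st)
print(not_significant)
```

The p-values are approximate for GAMs, so the rows flagged here are candidates for closer inspection rather than automatic removal.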

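Path dependence is an issue here
(silly) Strategy: want all p ≈ 0 (***), remove terms 1-by-1
Two different universes appear:
This isn't very satisfactory!

Term selection during fitting
Already selecting wiggliness of terms (via a penalty)
What about using it to remove the whole term?

Shrinkage approach
Basis s(..., bs="ts") - thin plate splines with shrinkage
remove the wiggles then remove the "linear" bits
nullspace should be shrunk less than the wiggly part

Shrinkage example
    dsm_ts_all <- dsm(count~s(x, y, bs="ts") +
                            s(Depth, bs="ts") +
                            s(DistToCAS, bs="ts") +
                            s(SST, bs="ts") +
                            s(EKE, bs="ts") +
                            s(NPP, bs="ts"),
                      ddf.obj=df_hr,
                      segment.data=segs, observation.data=obs,
                      family=tw())

Model with no shrinkage

... with shrinkage

summary(dsm_ts_all)
##
## Family: Tweedie(p=1.277)
## Link function: log
##
## Formula:
## count ~ s(x, y, bs = "ts") + s(Depth, bs = "ts") + s(DistToCAS,
##     bs = "ts") + s(SST, bs = "ts") + s(EKE, bs = "ts") + s(NPP,
##     bs = "ts") + offset(off.set)
##
## Parametric coefficients:
##             Estimate Std. Error t value Pr(>|t|)
## (Intercept)  -20.260      0.234  -86.59   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Approximate significance of smooth terms:
##                    edf Ref.df     F  p-value
## s(x,y)       1.8875209     29 0.705 4.33e-06 ***
## s(Depth)     3.6794182      9 4.811  < 2e-16 ***
## s(DistToCAS) 0.0000934      9 0.000   0.6797
## s(SST)       0.3826654      9 0.063   0.2160
## s(EKE)       0.8196256      9 0.499   0.0178 *
## s(NPP)       0.0003570      9 0.000   0.8372
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

EDF comparison
                 tp     ts
s(x,y)       5.2245 1.8875
s(Depth)     3.5679 3.6794
s(DistToCAS) 1.0001 0.0001
s(SST)       5.9267 0.3827
s(EKE)       1.7631 0.8196
s(NPP)       2.3931 0.0004

Removing terms?
1. EDF
Terms with EDF<1 may not be useful (can we remove?)
2. non-significant p-value
Decide on a significance level and use that as a rule
(In some sense leaving "shrunk" terms in is more "consistent" in terms of variance estimation, but can be computationally annoying)

Comparing models
Usually have >1 option
How can we pick?
Even if we have 1 model, is it any good?
(This can be subtle, more in model checking tomorrow!)

Akaike's "An Information Criterion"
As for many other models, we can get an AIC from our model
Comparison of AIC fine but:
can't compare Tweedie (continuous) and negative binomial (discrete) distributions!
(within distribution is fine)
    AIC(dsm_all)
    ## [1] 1238.288
    AIC(dsm_ts_all)
    ## [1] 1225.822

Selecting between response distributions

Goodness of fit
Q-Q plots
Closer to the line is better
But what does "close" mean?

Using reference bands
What is down to random variation?
Where does the model actually fail?
Resampling the response, generate bands
    qq.gam(dsm_all, asp=1, main="Tweedie",
           cex=5, rep=100)

Which response type?

Count model count~...
Response is count per segment
Effort is effective effort

Estimated abundance abundance.est~...
Response is estimated abundance per segment
Effort is area of each segment

When to use each approach?
Practical choice
2 detection function covariate "levels"
"Observer"/"observation" -- change within segment
"Segment" -- change between segments
"Count model" only lets us use segment-level covariates
"Estimated abundance" lets us use either

Sperm whale response example (either)
Detection covariate: Beaufort
Changes at segment level
count or abundance.est

Sperm whale response example (abundance.est)
Detection covariate: group size (size)
Changes at observation level
abundance.est only

Recap
Adding smooths
Path dependence
Removing smooths
p-values
shrinkage
Comparing models
Comparing response distributions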
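The deck contrasts a model using the default thin plate basis (dsm_all) with one using the shrinkage basis bs="ts" (dsm_ts_all), and compares them by EDF and by AIC. The sketch below repeats that kind of comparison with mgcv directly, on simulated data from gamSim() rather than the sperm whale data, so the numbers will not match the slides. The model and object names are illustrative.

```r
library(mgcv)

set.seed(3)
# Simulated data (assumption: gamSim example 1; x3 has no true effect there)
dat <- gamSim(1, n = 400, dist = "poisson", scale = 0.1, verbose = FALSE)

# Default thin plate basis: the penalty can shrink a term to a line, no further
m_tp <- gam(y ~ s(x0) + s(x1) + s(x2) + s(x3), data = dat,
            family = poisson, method = "REML")

# Thin plate with shrinkage: the whole term can be penalised towards zero
m_ts <- gam(y ~ s(x0, bs = "ts") + s(x1, bs = "ts") +
                s(x2, bs = "ts") + s(x3, bs = "ts"), data = dat,
            family = poisson, method = "REML")

# Side-by-side EDF table, in the spirit of the tp/ts comparison
edf_tab <- cbind(tp = summary(m_tp)$edf, ts = summary(m_ts)$edf)
rownames(edf_tab) <- c("s(x0)", "s(x1)", "s(x2)", "s(x3)")
print(round(edf_tab, 4))

# AIC is comparable here because both models use the same response distribution
aic_tab <- AIC(m_tp, m_ts)
print(aic_tab)
```

Because x3 has no true effect in this simulation, its EDF under bs="ts" would be expected to shrink towards zero, which is the behaviour the slides show for s(DistToCAS) and s(NPP).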
