SLIDE 1

Introduction Applying RR Discussion & conclusion

Reproducible Research in Ecology with R: distribution of threatened mammals in Equatorial Guinea

María V. Jiménez-Franco (mvjimenez@um.es), Chele Martínez-Martí, José F. Calvo, José A. Palazón

Department of Ecology and Hydrology, University of Murcia (Spain)

10 July 2013

María V. Jiménez-Franco et al. Reproducible Research in Ecology with R: . . . Univ. Murcia

SLIDE 2

  • 1. Introduction
  • 2. Applying RR
  • 3. Discussion & conclusion

SLIDE 3

Our problem

Scientific studies:

  • Announce a result
  • Convince readers that the result is correct

Do scientific studies allow readers to repeat and extend the analytical process? This type of situation sounds familiar to many of us. Don’t you agree?

SLIDE 4

RR: a solution?

What is Reproducible Research (RR)?

The ability to repeat the calculations used to analyze the data and obtain the computational results

Why is RR so important?

  • Describe the results and provide a protocol clear enough to allow papers to be repeated and extended successfully
  • Coordinate different researchers
  • Learn a new protocol or analysis (teaching: students and beginners)

SLIDE 5

RR in R

Why is RR so important when we use R?

  • The method involves complex steps:
      • Preprocessing of the data (standardizing them)
      • Building the models to test their efficacy
      • Building figures, graphics and maps with the main results
  • A wide range of software tools and packages is used (often combined in unusual or novel ways)
  • Data sets are often analyzed many times, with modifications to the methods and parameters, until the final results are produced

Reproducible electronic document: a document that compiles the methods (software components and the precise details of their use) with the results in a standardized form

SLIDE 6

The ecological study aims

  • 1. to estimate the probabilities of occupancy (ψ) and detection (p) of large mammals in Equatorial Guinea based on ecological and social covariates, and
  • 2. to map species-specific occurrence probability to identify priority areas for the conservation of large mammals.

SLIDE 7

Study area: Equatorial Guinea

  • Rio Muni: a rectangular-shaped region of 26 000 km²
  • We divided the study area into 225 sample units of 5 × 5 km
  • Sample units were defined within the hunting area (21.6% of the region as the final study area)
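As a quick check, the 21.6% figure is consistent with the other numbers on this slide, assuming it expresses the area of the 225 sample units as a share of the 26 000 km² region (that reading is an assumption, not something the slide states):

```latex
\[
225 \times (5\,\text{km} \times 5\,\text{km}) = 5625\,\text{km}^2,
\qquad
\frac{5625\,\text{km}^2}{26\,000\,\text{km}^2} \approx 0.216 = 21.6\%
\]
```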

SLIDE 8

SLIDE 9

Study species

  • Golden cat (Caracal aurata)
  • Leopard (Panthera pardus)
  • Forest elephant (Loxodonta cyclotis)
  • Forest buffalo (Syncerus caffer)
  • Mandrill (Mandrillus sphinx)
  • Gorilla (Gorilla gorilla)
  • Chimpanzee (Pan troglodytes)

SLIDE 10

Ecological methods: Research Team A

Workflow diagram:

  • Team A: sampling by hunter interviews, yielding presence/absence species data
  • Team B: GIS data from environmental maps (GRASS)
  • Team C: site occupancy models (unmarked)

Team A:

  • Conduct hunter interviews in the 225 sample units
  • Between April 13 and October 16, 2010
  • To record presence/absence data for the study species

SLIDE 11

Ecological methods: Research Team B

  • Landscape characteristics (elevation, ruggedness and forest land use with more than 60% tree cover, for each 5 × 5 km sample unit)
  • Human influence (density of human settlements in each 5 × 5 km sample unit)

SLIDE 12

Ecological methods: Research Team C

  • Establish single-season site occupancy models
  • To estimate species occurrence (e.g., number of occupied sites) as a function of site-level covariates
  • Logit link function of U covariates associated with site i
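To make the idea of a single-season occupancy model concrete, here is a minimal Python sketch of the usual likelihood for one site, with constant occupancy ψ and detection p. It is only an illustration: the authors fitted their models in R with the unmarked package, and the function name, the detection histories and the parameter values below are hypothetical.

```python
import math


def site_likelihood(history, psi, p):
    """Likelihood of one site's detection history (list of 0/1 values)
    under a single-season occupancy model with constant psi and p."""
    detections = sum(history)
    misses = len(history) - detections
    # Probability of observing this history if the site is occupied.
    occupied = psi * (p ** detections) * ((1 - p) ** misses)
    if detections > 0:
        # At least one detection: the site must be occupied.
        return occupied
    # No detections: occupied but never detected, or not occupied at all.
    return occupied + (1 - psi)


# Hypothetical detection histories (1 = species reported, 0 = not reported).
histories = [[1, 0, 1], [0, 0, 0], [0, 1, 0]]

# Log-likelihood of the whole data set for one candidate (psi, p) pair.
log_lik = sum(math.log(site_likelihood(h, psi=0.6, p=0.5)) for h in histories)
```

Maximizing this log-likelihood over ψ and p (or over the coefficients of their covariates) is what occupancy-model software does internally; the sketch only evaluates it at one point.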

SLIDE 13

Ecological methods: Research Team C

Model: $\operatorname{logit}(\psi_i) = \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + \dots + \beta_U x_{iU}$

where $\beta_0$ is the intercept (constant term) and $\beta_1, \dots, \beta_U$ are the regression coefficients of the U covariates.

R package unmarked (Fiske and Chandler, 2011)
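The logit model above can be turned into an occupancy probability by applying the inverse logit to the linear predictor. The Python sketch below shows this for a single site. It is an illustration only, not the unmarked implementation; the function names, coefficients and covariate values are hypothetical.

```python
import math


def inv_logit(eta):
    """Inverse of the logit link: maps a real number to a probability."""
    return 1.0 / (1.0 + math.exp(-eta))


def occupancy_probability(beta0, betas, x_i):
    """psi_i for site i given the intercept beta0, the coefficients
    beta_1..beta_U and the site covariates x_i1..x_iU."""
    if len(betas) != len(x_i):
        raise ValueError("need one coefficient per covariate")
    # Linear predictor: beta0 + beta_1 * x_i1 + ... + beta_U * x_iU
    eta = beta0 + sum(b * x for b, x in zip(betas, x_i))
    return inv_logit(eta)


# Hypothetical coefficients and standardized covariates for one site.
psi_i = occupancy_probability(beta0=0.0, betas=[1.0, -0.5], x_i=[0.4, 0.8])
```

With these made-up values the two covariate effects cancel, so the linear predictor is 0 and ψ for the site is 0.5.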

SLIDE 14

Needs

  • 1. Feedback of information among the research teams was needed:
      • We refitted the models to include some new species
      • We exchanged some covariates (forest area instead of river area)

SLIDE 15

Needs

  • 2. Share information:
      • To estimate the average probability of occupancy for each species over the whole study area
      • To draw the occurrence maps for each species based on site occupancy models with the covariates

SLIDE 16

Our study problems

We applied the model averaging technique to the best occupancy models obtained, using the maps with the covariate information.

$\psi = \psi_{M_1} w_{M_1} + \psi_{M_2} w_{M_2} + \dots + \psi_{M_n} w_{M_n}$

where $w_{M_i}$ is the weight of model $M_i$.

R package raster (Hijmans and Van Etten, 2012)
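The weighted sum above can be sketched cell by cell over the prediction maps. The Python example below is illustrative only: the authors worked with the R package raster, and here each map is a plain list of cell values, the weights are assumed to be already computed and to sum to 1, and all names and numbers are hypothetical.

```python
def model_average(psi_by_model, weights):
    """Model-averaged occupancy per map cell.

    psi_by_model: one list of per-cell psi predictions for each model.
    weights: one weight per model, assumed to sum to 1.
    """
    if len(psi_by_model) != len(weights):
        raise ValueError("need one weight per model")
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("model weights must sum to 1")
    n_cells = len(psi_by_model[0])
    # psi = psi_M1 * w_M1 + psi_M2 * w_M2 + ... + psi_Mn * w_Mn, per cell
    return [
        sum(w * psi[cell] for psi, w in zip(psi_by_model, weights))
        for cell in range(n_cells)
    ]


# Hypothetical predictions of two models over three map cells.
psi_maps = [[0.8, 0.2, 0.5], [0.6, 0.4, 0.5]]
weights = [0.75, 0.25]
psi_avg = model_average(psi_maps, weights)
```

Averaging predictions rather than picking one best model keeps the uncertainty about model choice in the final map, with better-supported models contributing more.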

SLIDE 17

Our study solution

We need to document the analytical process in order to finish the study properly

SLIDE 18

Our study solution

Workflow diagram, now framed as Reproducible Research:

  • Team A: sampling by hunter interviews, yielding presence/absence species data
  • Team B: GIS data from environmental maps (GRASS)
  • Team C: site occupancy models (unmarked)
  • Added steps: covariate maps, occupancy probability model maps and the model averaging technique (raster)

SLIDE 19

Our study solution

We applied RR using the Markdown language and the R package knitr to:

  • Run and include the R code for the statistical analyses and spatial data
  • Include explanations of the analyses

SLIDE 20

Our reproducible research electronic document

  • This document describes the theory and the procedures used
  • It was written in the Markdown language using the R package knitr

Our document index:

  • Abstract
  • Introduction
  • Study area
  • Data and information (data and ad hoc functions)
  • Values of probability of occupancy (for each species)

SLIDE 21

Functions for simple writing: example

Data and information

  • How to calculate the probability of occupancy (ψ) through model averaging
  • Maps with the covariates for the models
  • Functions:
      • Function modAve
      • Function proceso
      • Function mplot

SLIDE 22

Simple writing: example

Values of probability of occupancy (ψ)

  • Golden cat (GC)
  • Leopard (L)
  • Elephant (E)
  • Buffalo (B)
  • Gorilla (G)
  • Chimpanzee (CH)
  • Mandrill (M)

SLIDE 23

Simple reading: example

SLIDE 24

Principles

  • A good beginning makes a good end
  • Put your best foot forward

SLIDE 25

Work options

The LaTeX language with Sweave (Leisch, 2002) is a useful tool for producing automatic documents.

Advantages of Markdown:

  • Allows you to write using an easy-to-read, easy-to-write plain text format, then convert it to structurally valid XHTML (or HTML)
  • Useful for beginners: e.g., this presentation
  • Can be used with different programs: bash (GRASS), R, awk, python, perl, ...

SLIDE 26

RR is important to

  • Coordinate different research teams.
  • Homogenize the analytical process so that it is easy to use in future applications of this and similar studies.
  • Obtain a reproducible electronic document for a better understanding of the analyses.

SLIDE 27

The reproducible electronic document

  • Can be read and understood easily after a long period of time by the same authors.
  • Can be reused or modified for other similar studies. Therefore, it is useful for other researchers.
  • Is a useful tool that facilitates learning and working with R procedures.
  • This process of compiling the methods in a document could be applied not only by ecologists and researchers in other scientific areas but also by beginners and by students in their degree and master's courses.
  • Producing the first document could take some time.
  • Markdown is a very suitable and straightforward way to make this document.

SLIDE 28

Acknowledgements

  • Panthera and Conservation International for funding and supporting the field work in Equatorial Guinea.
  • A. Mang for assistance in field work and local hunters for collaborating with the interviews.
  • A. Royle for support in developing the site occupancy models.
  • M. V. Jiménez-Franco is supported by an FPU grant from the Spanish Ministerio de Educación y Ciencia (reference AP2009-2073).

SLIDE 29

References

Fiske, I., Chandler, R. (2011). Unmarked: An R package for fitting hierarchical models of wildlife occurrence and abundance. J. Stat. Softw. 43(10), 1–23.

Hijmans, R.J., Van Etten, J. (2012). Geographic analysis and modeling with raster data. URL http://cran.r-project.org/web/packages/raster/raster.pdf

Martínez-Martí, C. (2011). The leopard (Panthera pardus) and the golden cat (Caracal aurata) in Equatorial Guinea: A national assessment of status, distribution and threat. Annual report submitted to Panthera/Conservation International.

Leisch, F. (2002). Sweave: Dynamic generation of statistical reports using literate data analysis. In W. Härdle and B. Rönz (eds.), Compstat 2002 - Proceedings in Computational Statistics, pages 575-580. Physica Verlag, Heidelberg. ISBN 3-7908-1517-9.

R Core Team (2013). R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. ISBN 3-900051-07-0, URL http://www.R-project.org/

Xie, Y. (2013). knitr: A general-purpose package for dynamic report generation in R. R package version 1.1, URL http://yihui.name/knitr

SLIDE 30

Thanks! Any questions?
