Taking R Mainstream in Production Systems, Misha Lisovich

SLIDE 1

Taking R Mainstream

in Production Systems

Misha Lisovich misha@honestbuildings.com

SLIDE 2

The Question

Q: Should I use R in production?
A: Yes! (In a couple of years)

SLIDE 3

The Process

  1. Productize
      • Compelling data products
      • Innovation pipeline
  2. Ruggedize
      • Toolchain: RStudio, devtools, GitHub, Travis CI, Docker
      • Strong testing
      • Production-ready architecture
  3. Assimilate
      • Command line tools
      • Make it into HTTP APIs
      • Make it into Docker containers
SLIDE 4

Step 1: Productize

Internal Products:

  • Ad-hoc Analyses
  • Internal Dashboards
  • Automated reports
  • Rapid Prototyping

External Products:

  • End-user data products
  • Backend services
SLIDE 5
  1. Dashboards

  • Business Intelligence
  • Internal Tools
  • Data & Job Monitoring

SLIDE 6
  2. Automated Reports

.Rmd -> HTML
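The .Rmd -> HTML step can be scripted so that a scheduled job produces the report. The following is a minimal sketch, not the speaker's actual setup: `report_output_path()` and `render_report()` are hypothetical helpers, the `reports` directory and dated file naming are assumptions, and rendering requires the rmarkdown package (and pandoc) to be installed.

```r
# Sketch of an automated report job: render an .Rmd file to HTML.
# Helper names, the "reports" directory, and the dated file naming are
# assumptions made for illustration.

# Build a dated output path, e.g. "reports/weekly-2015-06-01.html".
report_output_path <- function(input, date = Sys.Date(), out_dir = "reports") {
  stem <- tools::file_path_sans_ext(basename(input))
  file.path(out_dir, sprintf("%s-%s.html", stem, format(date, "%Y-%m-%d")))
}

# Render the report and return the path of the HTML file.
render_report <- function(input, date = Sys.Date(), out_dir = "reports") {
  if (!requireNamespace("rmarkdown", quietly = TRUE)) {
    stop("the rmarkdown package is required to render reports")
  }
  out <- report_output_path(input, date, out_dir)
  dir.create(out_dir, showWarnings = FALSE, recursive = TRUE)
  rmarkdown::render(
    input,
    output_format = "html_document",
    output_file = basename(out),
    output_dir = out_dir,
    quiet = TRUE
  )
  out
}
```

A scheduler (cron, for example) could then call `Rscript -e 'source("report.R"); render_report("weekly.Rmd")'`, where both file names are placeholders.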

SLIDE 7
  3. Rapid Prototyping
SLIDE 8
  4. Backend Services

  • Batch Data Processing (ETL)
  • R APIs
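As an illustration of an R API, here is a minimal sketch of one endpoint. It assumes the comment-annotation style of the plumber package (believed to be the successor to rapier, which the talk lists later); the endpoint path, the function name `summarise_values`, and the file name `api.R` are all hypothetical. The `#*` lines are ordinary R comments, so the handler can be tested as a plain function without any server running.

```r
# api.R: hypothetical endpoint file in plumber's comment-annotation style.
# The "#*" annotations are plain comments to R itself.

#* Summarise a comma-separated list of numbers
#* @param values Comma-separated numbers, e.g. "1,2,3"
#* @get /summary
summarise_values <- function(values = "") {
  # Query parameters arrive as strings, so parse them explicitly.
  parts <- strsplit(values, ",", fixed = TRUE)[[1]]
  x <- suppressWarnings(as.numeric(parts))
  x <- x[!is.na(x)]  # drop anything that is not a number
  if (length(x) == 0) {
    return(list(n = 0L, mean = NA_real_))
  }
  list(n = length(x), mean = mean(x))
}

# To serve it (assuming plumber is installed and this file is saved as api.R):
#   plumber::plumb("api.R")$run(port = 8000)
```

Keeping the handler a plain function means the same code can back a batch job, a test, and the HTTP endpoint.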

SLIDE 9
  5. End-user Products
SLIDE 10

Step 2: Ruggedize

  1. Create reproducible architecture
  2. Set up strong testing & CI
  3. Separate Production and Dev
  4. Set up monitoring & reporting
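To make "strong testing" concrete, here is a minimal sketch using only base R. The function `clean_prices()` and its data are hypothetical; in a package the same checks would more typically live under `tests/` using a framework such as testthat. Because `stopifnot()` raises an error on failure, running the file with `Rscript` exits with a non-zero status, which is what a CI service needs in order to mark the build as failed.

```r
# A small ETL-style helper plus base-R checks (hypothetical example).

# Drop rows with missing or negative prices.
clean_prices <- function(df) {
  stopifnot(is.data.frame(df), "price" %in% names(df))
  df <- df[!is.na(df$price) & df$price >= 0, , drop = FALSE]
  rownames(df) <- NULL
  df
}

run_tests <- function() {
  raw <- data.frame(id = 1:4, price = c(10, NA, -5, 20))
  cleaned <- clean_prices(raw)
  stopifnot(
    nrow(cleaned) == 2,
    identical(cleaned$id, c(1L, 4L)),
    all(cleaned$price >= 0)
  )
  # Bad input should fail loudly rather than pass through silently.
  err <- tryCatch(clean_prices(data.frame(x = 1)), error = function(e) e)
  stopifnot(inherits(err, "error"))
  invisible(TRUE)
}

run_tests()
```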
SLIDE 11

Case Study: HB Architecture

  • RStudio
  • Containerized Architecture
  • Continuous Integration
  • Multiple Environments
  • Notifications/Monitoring
SLIDE 12

Data Architecture

    elasticsearch:
      image: elasticsearch
    shiny-server:
      image: shiny
      ports:
        - "443:443"
      links:
        - elasticsearch
    etl:
      image: etl
      volumes:
        - .:/data
    etl-data:
      image: etl-data

[Diagram: Docker Compose containers (ETL, Shiny Server, Elastic, ETL data, rAPI), with SQL, S3, Web, and RStudio Server]

SLIDE 13

Environments

Production: ETL, Shiny Server, Elastic, data volume, SQL, S3
  www.dataproduct.com
  internal-dashboards.com

Staging: ETL, Shiny Server, Elastic, data volume, SQL, S3
  staging-www.dataproduct.com
  staging-internal-dashboards.com
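One way to keep production and staging separate inside R code is to select settings from an environment variable set per container. This is a sketch under assumptions: the variable name `APP_ENV`, the function `get_config()`, and the field names are hypothetical; only the host names come from the slide.

```r
# Choose settings per environment from an environment variable.
# APP_ENV, get_config(), and the field names are hypothetical; the host
# names are the ones shown on the slide.
get_config <- function(env = Sys.getenv("APP_ENV", unset = "staging")) {
  configs <- list(
    production = list(
      site_url = "www.dataproduct.com",
      dashboard_url = "internal-dashboards.com"
    ),
    staging = list(
      site_url = "staging-www.dataproduct.com",
      dashboard_url = "staging-internal-dashboards.com"
    )
  )
  # Fail fast on a typo instead of silently using the wrong environment.
  if (!env %in% names(configs)) {
    stop(sprintf("unknown environment: '%s'", env))
  }
  configs[[env]]
}
```

Defaulting to staging when the variable is unset is a deliberate choice in this sketch: a misconfigured container then cannot touch production by accident.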

SLIDE 14

Continuous Integration

GitHub -> Travis CI

  • commit
  • Success! -> latest-stable tag
  • Staging: pull latest-stable
  • Production: pull latest-stable

SLIDE 15

Docker Registry/Rolling Back

  • Changes deployed to prod: save versioned image (ETL, data volume) to the Docker Registry
  • Danger! Need to roll back: load older image (ETL, data volume) from the Docker Registry

SLIDE 16

Step 3: Assimilate!

(i.e., be kind to your devs)

SLIDE 17

Assimilate (cont'd)

  • HTTP APIs
      • OpenCPU, rapier
  • Docker containers
      • Rocker
  • Command line tools
      • Rscript, littler, docopt
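As a sketch of the command line route, the script below wraps an R function so that other developers can call it like any other Unix tool. It uses only base R and `Rscript`; the file name `mean_cli.R` and the functions `parse_numbers()` and `main()` are hypothetical. A package such as docopt could replace the hand-written argument handling, which is kept manual here to avoid extra dependencies.

```r
#!/usr/bin/env Rscript
# mean_cli.R: hypothetical command line tool; prints the mean of its arguments.

# Convert the raw argument strings to numbers, rejecting bad input.
parse_numbers <- function(args) {
  x <- suppressWarnings(as.numeric(args))
  if (length(x) == 0 || anyNA(x)) {
    stop("usage: mean_cli.R NUMBER [NUMBER ...]")
  }
  x
}

main <- function(args = commandArgs(trailingOnly = TRUE)) {
  x <- parse_numbers(args)
  result <- mean(x)
  cat(result, "\n")
  invisible(result)
}

# Run only in a non-interactive session that was given arguments,
# e.g. `Rscript mean_cli.R 1 2 3`; sourcing the file just defines the functions.
if (!interactive() && length(commandArgs(trailingOnly = TRUE)) > 0) {
  main()
}
```

Because an uncaught `stop()` makes `Rscript` exit with a non-zero status, shell scripts and other services can detect bad input without parsing the output.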
SLIDE 18

Thank you!

misha@honestbuildings.com