Mixing R with other languages, John D. Cook, PhD, Singular Value Consulting



slide-1
SLIDE 1

Mixing R with other languages

JOHN D. COOK, PHD SINGULAR VALUE CONSULTING

slide-2
SLIDE 2

Why R?

 • Libraries, libraries, libraries
 • De facto standard for statistical research
 • Nice language, as far as statistical languages go
 • “Quirky, flawed, and an enormous success.”

slide-3
SLIDE 3

Why mix languages?

 • Improve performance of R code
     ◦ Execution speed (e.g. loops)
     ◦ Memory management
 • Raid R’s libraries

slide-4
SLIDE 4

How to optimize R

 • Vectorize
 • Rewrite the slow parts in another language
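A minimal sketch of the "vectorize" option, using a sum of squares as a stand-in workload. The function names sum_sq_loop and sum_sq_vec are illustrative, not from the talk, and the timings depend on the machine and R version, so none are quoted here.

```r
# Explicit loop: one interpreted iteration per element.
sum_sq_loop <- function(x) {
  total <- 0
  for (xi in x) {
    total <- total + xi^2
  }
  total
}

# Vectorized: the element-wise work happens inside R's compiled internals.
sum_sq_vec <- function(x) sum(x^2)

x <- as.numeric(1:1000)

# Both versions must agree before comparing speed.
stopifnot(isTRUE(all.equal(sum_sq_loop(x), sum_sq_vec(x))))

# Repeat each call so the difference is measurable.
print(system.time(for (i in 1:200) sum_sq_loop(x)))
print(system.time(for (i in 1:200) sum_sq_vec(x)))
```

If the vectorized form is still too slow, or the computation cannot be expressed without a loop, rewriting that piece in another language is the remaining option.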

slide-5
SLIDE 5

A few R quirks

 • Everything is a vector
 • Everything can be NULL or NA
 • Vectors are unit-offset (indexing starts at 1)
 • Zero index is legal but strange
 • Negative indices remove elements
 • Matrices are filled by column by default
 • $ acts like dot does in other languages; dot itself is not special
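The quirks above can be checked in a few lines of base R. This is a sketch for illustration; the variable names are arbitrary, and each stopifnot() line asserts the behavior described in its comment.

```r
# A "scalar" is just a vector of length 1.
x <- 42
stopifnot(is.vector(x), length(x) == 1)

v <- c(10, 20, 30)
stopifnot(v[1] == 10)                     # indexing starts at 1
stopifnot(identical(v[0], numeric(0)))    # zero index: legal, returns an empty vector
stopifnot(identical(v[-1], c(20, 30)))    # negative index drops element 1

# NULL has length 0; NA is a missing value that propagates through arithmetic.
stopifnot(length(NULL) == 0)
stopifnot(is.na(mean(c(1, NA, 3))))
stopifnot(mean(c(1, NA, 3), na.rm = TRUE) == 2)

# Matrices are filled column by column unless byrow = TRUE is given.
m <- matrix(1:6, nrow = 2)
stopifnot(identical(m[1, ], c(1L, 3L, 5L)))
stopifnot(identical(m[, 1], c(1L, 2L)))

# $ selects a member; a dot is just another character in a name.
person.record <- list(first.name = "Ada")
stopifnot(person.record$first.name == "Ada")
```

These are the behaviors most likely to surprise code in another language that exchanges data with R, for example a zero-based caller or one with no notion of NA.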

slide-6
SLIDE 6

C package interface

 • Must manage low-level details of R object model and memory
 • Requires Rtools on Windows
 • Lots of macros like REALSXP, PROTECT, and UNPROTECT
 • Use C++ (Rcpp) instead

“I do not recommend using C for writing new high-performance code. Instead write C++ with Rcpp.” – Hadley Wickham
slide-7
SLIDE 7

Rcpp

 • The most widely used extension method for R
 • Call C, C++, or Fortran from R
 • Companion project RInside to call R from C++
 • Extensive support even for advanced C++
 • Create R packages or inline code
 • http://rcpp.org
 • Dirk Eddelbuettel’s book

slide-8
SLIDE 8

Simple Rcpp example

library(Rcpp)
cppFunction('int add(int x, int y, int z) {
  int sum = x + y + z;
  return sum;
}')
add(1, 2, 3)

slide-9
SLIDE 9

.NET

 • RDCOM: http://sunsite.univie.ac.at/rcom/
 • F# type provider for R: http://bluemountaincapital.github.io/FSharpRProvider/
 • R.NET: https://rdotnet.codeplex.com/

slide-10
SLIDE 10

SQL Server 2016

execute sp_execute_external_script
    @language = N'R'
  , @script = N' OutputDataSet <- data.frame(c("hello"), " ", c("world"));'
  , @input_data_1 = N' '
WITH RESULT SETS (([col1] varchar(20), [col2] char(1), [col3] varchar(20)));

slide-11
SLIDE 11

Haskell

 • HaskellR from Tweag.io: http://tweag.github.io/HaskellR/
 • Use quasi-quoting to write inline R: [r| … |]
 • Interactive REPL with H wrapper around GHCi
 • Works with Jupyter notebooks

slide-12
SLIDE 12

Emacs org-mode

 • Crufty but powerful, like all things Emacs
 • Ships with support for many languages
 • Works reliably cross-platform
 • Good for exploration / prototyping
 • Literate programming

slide-13
SLIDE 13
Org-babel languages

Supported: ABC, Asymptote, Awk, C, C++, Calc, Clojure, Comint, Coq, CSS, D, Ditaa, Dot, Ebnf, Elisp, Forth, Fortran, Gnuplot, Haskell, Io, J, Java, Javascript, LaTeX, Ledger, Lilypond, Lisp, Make, Matlab, Maxima, Mscgen, OCaml, Octave, Org, Perl, Picolisp, PlantUML, Processing, Python, R, Ruby, Sass, Scala, Scheme, Screen, Sed, Shell, Shen, SQL, SQLite, Stan

Other: Axiom, Elixir, Eukleides, Fomus, Google translate, Groovy, HTML, http request, iPython, Julia, Kotlin, LFE, Mathematica, Mathomatic, MongoDB, Neo4j, OZ, Prolog, Rec, SML, Stata, Tcl, Typescript

slide-14
SLIDE 14

Structure of an org-mode file

Text, images, LaTeX equations, etc.
#+begin_src R
…
#+end_src
text etc. …
#+begin_src python
…
#+end_src

slide-15
SLIDE 15

Language interop

#+name: sin_r
#+begin_src R :var x=0
sin(x)
#+end_src

#+name: cos_p
#+begin_src python :var x=1
import math
return math.cos(x)
#+end_src

#+name: sum_sq
#+begin_src perl :var a=3 :var b=4
$a*$a + $b*$b
#+end_src

#+call: sum_sq(sin_r(1), cos_p(1))

#+results:
: 1

slide-16
SLIDE 16

Jupyter notebooks

 • Started out as IPython notebooks
 • Name comes from Julia + Python + R
 • Multiple languages supported (separately, one per notebook)
 • Less transparent than org-babel
     ◦ For better: images, formatting, etc.
     ◦ For worse: hard to version and diff

slide-17
SLIDE 17

Some languages with Jupyter kernels

Bash           F#           Julia        Prolog
C              Forth        Matlab       Python
C++            Go           Maxima       Ruby
C#             Haskell      OCaml        SAS
Clojure        Hy           Octave       SageMath
Coffeescript   J            PHP          Scala
Common Lisp    Java         Perl(6)      Tcl
Erlang         Javascript   PowerShell   Xonsh

slide-18
SLIDE 18

Beaker notebooks

 • A fork of IPython, the predecessor to Jupyter
 • http://beakernotebook.com/
 • Cells can be written in different languages
 • Set an attribute on the beaker object in one language, access the attribute from another language
 • R data.frame <-> Python pandas.DataFrame

slide-19
SLIDE 19

Beaker example

beaker.foo = "Hello world"    # Python cell
x <- beaker::get('foo')       # R cell
beaker::set('answer', 42)     # R cell
z = beaker.answer[0]          # Python cell

slide-20
SLIDE 20

Languages supported in Beaker notebooks

C++       Java         Python(3)
Clojure   JavaScript   R
F#        Julia        Ruby
Groovy    Lua/Torch    Scala/Spark
HTML      Node         SQL

slide-21
SLIDE 21

R Markdown

 • Similar to Jupyter, Beaker
 • http://rmarkdown.rstudio.com
 • Can mix languages in a single document
 • Exchange data between languages via data frames
 • Many publication export formats

slide-22
SLIDE 22

Languages supported in R Markdown

Bash         R
CSS          Rcpp
JavaScript   SQL
Python       Stan

slide-23
SLIDE 23

R Markdown example

Text (markdown)…

```{r}
x <- "hello from R"
print(x)
```

Text …

```{python}
x = " ".join(["Hello", "from", "Python"])
print(x)
```

slide-24
SLIDE 24

Summary

 • Two reasons to mix: make R more efficient, or borrow its libraries.
 • R differs from other languages: NULL/NA, vectors, unit offset, etc.
 • Most of these approaches do not simply install and “just work.”
 • Org-babel works as documented, but maybe not as expected.
 • Most general/powerful approach: language <-> Rcpp <-> R

slide-25
SLIDE 25
slide-26
SLIDE 26
slide-27
SLIDE 27

Contact