What we wish people knew more about when working with R (Peter Dalgaard)

SLIDE 1

Faculty of Health Sciences

What we wish people knew more about when working with R

Peter Dalgaard

Dept. of Biostatistics, University of Copenhagen

SLIDE 2

Background

◮ R has entered the mainstream, and a great many research projects in statistics now involve R programming or the writing of R packages
◮ Young researchers will typically need to be taught about relatively advanced aspects of R
◮ Consider planning, say, an advanced course on R programming
◮ Much will be pretty straightforward
◮ Not necessarily easy, but you know that you need to take the students from A to B along a path with certain twists and turns and stumbling stones

2 / 19

SLIDE 7

The blank stare

◮ At some points, however, you find yourself facing a wall of ignorance
◮ There are things students just don’t know the first thing about
◮ Say, you want to show how to speed up a slow piece of R code
◮ So you explain that they should rewrite parts of the code in C, compile it, and link it dynamically
◮ What is C?
◮ What is a compiler?
◮ What is linking?

3 / 19

SLIDE 14

Generic problem

◮ In order to explain Z, I must first tell them about Y, but that won’t make sense to them because they never heard of X, etc.
◮ This is getting worse! A generic trend in computing is that more and more functionality gets hidden away.
◮ In some senses, this may be a good trend, making computers accessible to more people
◮ However, from a scientific point of view, it makes it harder to understand what is going on inside a computer
◮ (Car analogy: Making cars simpler and safer to operate does not make better car engineers)

4 / 19

SLIDE 19

How do we know what we know?

◮ Is education deteriorating?
◮ Not really. If we look back, people who were into statistical computing were often not formally educated.
◮ Some people had switched from Computer Science to Statistics
◮ Others came out of the "Commodore 64" generation (typically teenagers from the 80s and 90s)
◮ At about the time R took off, there was the IT explosion and the whole Unix/Linux/Open Source culture around the turn of the millennium
◮ We are now moving from a relatively tight-knit subculture to a position in the mainstream, and this requires new thinking

5 / 19

SLIDE 25

Example: Parse trees

exp(−x^2/2)

exp
└─ /
   ├─ −
   │  └─ ^
   │     ├─ x
   │     └─ 2
   └─ 2

◮ In math, people know operator precedence intuitively
◮ However, they may not always realize that there is a well-defined process (parsing) leading from one representation to the other
◮ Or that, in R, the result is represented as an object which forms the basis of the later evaluation

6 / 19
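
The parse tree on this slide can be inspected directly in R. The following is a minimal sketch, not part of the original talk: `show_tree` is a hypothetical helper written for illustration, and the choice of `x = 1` for the evaluation is arbitrary.

```r
# Quote the expression instead of evaluating it; the result is a "call" object.
e <- quote(exp(-x^2 / 2))

# Hypothetical helper: print the tree one node per line, indented by depth.
show_tree <- function(node, depth = 0) {
  indent <- strrep("  ", depth)
  if (is.call(node)) {
    # The first element of a call is the function, the rest are its arguments.
    cat(indent, deparse(node[[1]]), "\n", sep = "")
    for (arg in as.list(node)[-1]) show_tree(arg, depth + 1)
  } else {
    cat(indent, deparse(node), "\n", sep = "")
  }
  invisible(NULL)
}

show_tree(e)

# The same object is what gets evaluated later, here with x bound to 1.
value <- eval(e, list(x = 1))
```

The nesting printed by `show_tree(e)` follows the tree above: `exp` at the root, then `/`, with the unary minus applied to `x^2` rather than to `x`.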

SLIDE 29

How did I know about parsing?

◮ Mixture of many sources
◮ Back pages of “Pascal User Manual and Report”: recursive descent parser
◮ PL/0 parser in Wirth: “Algorithms + Data Structures = Programs”. This was not actually in the curriculum, but I rubbed shoulders with 3rd yr CS students
◮ Exposure to Genstat, BMDP (ca. 1980)
◮ Aho & Ullman’s “Dragon book” taught me about LALR(1) grammars
◮ HP-UX series 300 computer on a project with some eye doctors. This contained YACC – “Yet Another Compiler-Compiler”

7 / 19

SLIDE 35

A catalogue of ignorance

◮ Parsing
◮ Interfacing to C
◮ Floating point issues
◮ Computational linear algebra
◮ Finer points in computer languages
◮ Obvious pitfall: Trying to explain in a 40 minute talk what I claim requires a significant chunk of a largish course
◮ Pitfall no. 2: The grumpy old man...
◮ Pitfall no. 3: Displaying my own ignorance

8 / 19

SLIDE 43

Parsing

◮ Internal structure of expressions, code
◮ Needed in plotmath, model formulas
◮ Names and syntactical names
◮ Tokenizer, lexical analysis, (regular expressions)
◮ Properties of computer syntax: One-step lookahead, R’s newline anomaly

9 / 19
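
Several of these points can be tried out at the R prompt. The sketch below is an illustration added here, not taken from the talk. The slide does not spell out which behaviour "newline anomaly" refers to, so the assumption is that it means the rule that a newline ends an expression as soon as the expression is syntactically complete.

```r
# Tokenizer view: getParseData() lists the tokens the lexer produced.
p <- parse(text = "y <- exp(-x^2/2)", keep.source = TRUE)
tokens <- utils::getParseData(p)
print(tokens[tokens$terminal, c("token", "text")])

# Syntactic vs. non-syntactic names: the latter need backticks.
`my var` <- 1
syntactic <- make.names("my var")   # "my.var"

# A newline ends an expression as soon as the expression is complete.
two_exprs <- parse(text = "1\n+ 2")   # parsed as `1`, then `+2`
one_expr  <- parse(text = "1 +\n2")   # incomplete at the newline, so it continues

# At top level, `else` on a new line arrives after the `if` is already complete.
top_level <- tryCatch(
  parse(text = "if (TRUE) 1\nelse 2"),
  error = function(e) "syntax error"
)
# Inside braces the parser keeps reading, so the same text is accepted.
in_braces <- parse(text = "{\nif (TRUE) 1\nelse 2\n}")
```

Here `two_exprs` has length 2 while `one_expr` has length 1, and `top_level` ends up holding the string "syntax error".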

SLIDE 48

Floating-point issues

◮ Limits of accuracy, decimals not representable in binary
◮ (FAQ 7.31...)
◮ Deeper issue: knowledge of bit-level storage and hardware
◮ IEEE standards
◮ FP exceptions
◮ Loss of fine control caused by optimizers reordering code

10 / 19
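
The first two bullets are the ones students tend to meet first. A minimal sketch of the FAQ 7.31 situation, added here for illustration (the particular numbers 0.1, 0.2 and 0.3 are a common textbook choice, not taken from the talk):

```r
# 0.1, 0.2 and 0.3 have no exact binary representation.
a <- 0.1 + 0.2
exact_equal <- (a == 0.3)                  # FALSE
near_equal  <- isTRUE(all.equal(a, 0.3))   # TRUE: comparison with a tolerance

# Printing more digits than the default shows the stored value.
print(sprintf("%.17g", a))                 # "0.30000000000000004"

# Spacing of doubles just above 1.
eps <- .Machine$double.eps                 # 2^-52

# IEEE special values arise from ordinary arithmetic.
specials <- c(1 / 0, -1 / 0, 0 / 0)        # Inf, -Inf, NaN
```

Comparing with `all.equal()` rather than `==` is the usual recommendation when the values come out of floating-point arithmetic.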

SLIDE 54

C, Fortran

◮ Structure of compiled languages
◮ Modular programs, linking, libraries
◮ The C preprocessor
◮ Calling conventions

11 / 19

SLIDE 58

Interfaces to C and Fortran

◮ Access macros
◮ Some level of knowledge about the evaluator and internal storage of code
◮ Classical LISP implementation CAR/CDR/CONS
◮ Garbage collection and PROTECT
◮ The “tree” of objects that do not need protection

12 / 19

SLIDE 63

Algorithms and numerics

◮ Error sensitivity, e.g. SVD vs $(X'X)^{-1}$
◮ Computational complexity
◮ Memory consumption
◮ BLAS issues, CPU architecture

13 / 19
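
The first bullet can be made concrete with a small least-squares problem. This is a sketch under stated assumptions: the design matrix (a cubic polynomial on 20 points in [1, 2]), the coefficients `beta_true` and the helper `cond2` are all made up for illustration. The point it shows is that forming X'X roughly squares the condition number, whereas the SVD works on X itself.

```r
# Hypothetical design matrix: a cubic polynomial on closely spaced points.
x <- seq(1, 2, length.out = 20)
X <- cbind(1, x, x^2, x^3)
beta_true <- c(1, -2, 0.5, 3)
y <- drop(X %*% beta_true)        # noise-free, so the exact answer is known

# 2-norm condition number computed from the singular values.
cond2 <- function(A) {
  d <- svd(A)$d
  max(d) / min(d)
}
cond_X   <- cond2(X)
cond_XtX <- cond2(crossprod(X))   # approximately cond_X^2

# Normal equations with an explicit inverse of X'X.
beta_ne <- drop(solve(crossprod(X)) %*% crossprod(X, y))

# SVD solution: X = U D V', so beta = V D^{-1} U'y.
s <- svd(X)
beta_svd <- drop(s$v %*% (crossprod(s$u, y) / s$d))

# Largest absolute error of each approach against the known coefficients.
err_ne  <- max(abs(beta_ne - beta_true))
err_svd <- max(abs(beta_svd - beta_true))
print(c(cond_X = cond_X, cond_XtX = cond_XtX, err_ne = err_ne, err_svd = err_svd))
```

In everyday use, `lm()` and `qr.solve()` avoid the explicit inverse through a QR decomposition; the SVD is used here only because it is the method the slide names.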

SLIDE 67

Markup languages

◮ Need it for Rd format files
◮ HTML, LaTeX, XML
◮ General idea that text is a computable quantity
◮ ... and that higher-level structure is beneficial

14 / 19
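
The idea that marked-up text is something a program can compute on can be shown with the Rd format itself. The sketch below is an added illustration: the three-line Rd page is made up, and only `tools::parse_Rd()` from R's bundled tools package is relied upon.

```r
# A tiny, made-up Rd page written to a temporary file.
rd_file <- tempfile(fileext = ".Rd")
writeLines(c(
  "\\name{demo}",
  "\\title{A demo page}",
  "\\description{Some \\emph{marked-up} text.}"
), rd_file)

# parse_Rd() turns the text into a nested list with a tag on every element.
rd <- tools::parse_Rd(rd_file)
tags <- vapply(rd, function(el) attr(el, "Rd_tag"), character(1))
print(tags)

# Because the structure is explicit, pulling out one field is a list lookup.
title <- paste(unlist(rd[[which(tags == "\\title")]]), collapse = "")
```

The same parsed object is what R's own converters to HTML, LaTeX and plain text start from, which is one practical benefit of the higher-level structure.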

SLIDE 71

Programming language taxonomy

◮ (“Lots of quaintly named little languages”)
◮ Compiled vs. interpreted languages
◮ Late and early binding
◮ OOP concepts
◮ Lazy evaluation
◮ A better theoretical overview should help explain why R sometimes behaves “strangely”

15 / 19
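
Late binding and OOP meet in R's S3 dispatch. The following is a minimal sketch added for illustration; the generic `area`, the shape classes and `total_area` are hypothetical names, not from the talk.

```r
# Generic: the method is chosen when the call happens, based on class(shape).
area <- function(shape) UseMethod("area")
area.circle <- function(shape) pi * shape$r^2
area.square <- function(shape) shape$side^2

circle <- structure(list(r = 1), class = "circle")
square <- structure(list(side = 2), class = "square")
areas <- c(area(circle), area(square))

# A caller written before any "rect" class exists.
total_area <- function(shapes) {
  total <- 0
  for (s in shapes) total <- total + area(s)
  total
}

# Late binding: a method defined afterwards is picked up without touching
# either the generic or the caller.
area.rect <- function(shape) shape$w * shape$h
rect <- structure(list(w = 2, h = 3), class = "rect")
total <- total_area(list(circle, square, rect))
```

Nothing ties `area.rect` to `area` except its name and the time of lookup, which is convenient but also one reason R can appear to behave "strangely" when a name is redefined.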

SLIDE 77

R behaving badly

x <- 8
ll <- BinomialLikelihood(x, 20)
x <- 2
curve(ll)
x <- 15
curve(ll)

With an unfortunate coding of BinomialLikelihood, this gives the curve for BinomialLikelihood(2, 20) twice!

16 / 19
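
The slide does not show how BinomialLikelihood was coded. One coding that would reproduce the behaviour, assuming the cause is lazy evaluation of function arguments, is sketched below. Both definitions are hypothetical, and the sketch evaluates the likelihood at p = 0.5 instead of calling `curve()` so that it needs no graphics device.

```r
# Hypothetical "unfortunate" coding: x and n are promises that are only
# evaluated the first time the returned function is called.
BinomialLikelihood <- function(x, n) {
  function(p) dbinom(x, n, p)
}

# One possible fix: force the arguments while the factory is still running.
BinomialLikelihoodForced <- function(x, n) {
  force(x)
  force(n)
  function(p) dbinom(x, n, p)
}

x <- 8
ll_lazy   <- BinomialLikelihood(x, 20)
ll_forced <- BinomialLikelihoodForced(x, 20)

x <- 2
lazy_first <- ll_lazy(0.5)      # promise evaluated now, so it sees x = 2
x <- 15
lazy_second <- ll_lazy(0.5)     # promise already evaluated: still x = 2

forced_value <- ll_forced(0.5)  # uses x = 8, as intended
```

With the lazy version, both calls return the likelihood for x = 2, matching the slide's "BinomialLikelihood(2, 20) twice"; the forced version captures x = 8 at the time the function is created.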

SLIDE 79

Toolchains

◮ A group of problems relates to lack of knowledge about basic programs in the OS (or in Rtools)
◮ Compiler, linker, libraries
◮ (And how to install them when they are not there)
◮ Makefiles
◮ Scripts (Perl, shell)

17 / 19

SLIDE 84

So what to do about it?

◮ We cannot reasonably stuff a major part of theoretical computer science into a stat/maths curriculum
◮ Project-based studying lets students satisfy their own needs, but it has the same issue as teaching: The sudden need for a large amount of knowledge in a short time
◮ It may well be the case that we need to rethink topics as part of a somewhat longer story, e.g. text processing, then lexical analysis, then parsing, then CAR et al.
◮ However, some topics, e.g. C programming, are quite clearly delineated and there is probably no way around teaching them as an independent (sub-)course

18 / 19

SLIDE 88

Summary

◮ R came out of a “historical coincidence” where a number of people turned out to have both similar and complementary abilities, in areas that were not actually being taught in any systematic fashion
◮ The challenge at this point in time is to formalize and systematize these abilities in a way that can be taught at a general level
◮ Doing so is essential for the continued development of R and statistical computing in general

19 / 19
