Gov 51: Summarizing Bivariate Relationships: Cross-tabs, Scatterplots, and Correlation

slide-1
SLIDE 1

Gov 51: Summarizing Bivariate Relationships: Cross-tabs, Scatterplots, and Correlation

Matthew Blackwell

Harvard University

1 / 18

slide-2
SLIDE 2

Effect of assassination attempts

leaders <- read.csv("data/leaders.csv")
head(leaders[, 1:7])
##   year     country       leadername age politybefore
## 1 1929 Afghanistan Habibullah Ghazi  39           -6
## 2 1933 Afghanistan       Nadir Shah  53           -6
## 3 1934 Afghanistan      Hashim Khan  50           -6
## 4 1924     Albania             Zogu  29            0
## 5 1931     Albania             Zogu  36           -9
## 6 1968     Algeria      Boumedienne  41           -9
##   polityafter interwarbefore
## 1       -6.00              0
## 2       -7.33              0
## 3       -8.00              0
## 4       -9.00              0
## 5       -9.00              0
## 6       -9.00              0

2 / 18

slide-3
SLIDE 3

Contingency tables

  • With two categorical variables, we can create contingency tables.
  • Also known as cross-tabs
  • Rows are the values of one variable, columns the other.

table(Before = leaders$civilwarbefore, After = leaders$civilwarafter)
##       After
## Before   0   1
##      0 177  19
##      1  27  27

  • A quick summary of how the two variables “go together.”
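
To make the row/column layout concrete, here is a minimal sketch on made-up data. The vectors `before` and `after` are hypothetical stand-ins, not columns of the leaders data; only `table()` and `addmargins()` from base R are used.

```r
# Hypothetical binary indicators for eight observations (not the leaders data).
before <- c(0, 0, 1, 0, 1, 1, 0, 0)
after  <- c(0, 1, 1, 0, 0, 1, 0, 0)

# Rows are the values of `before`, columns are the values of `after`.
tab <- table(Before = before, After = after)
tab

# addmargins() appends row and column totals to the cross-tab.
addmargins(tab)

# A single cell: observations with before == 0 and after == 1.
tab["0", "1"]
```

Indexing the table by the category labels (as character strings) is one way to pull out a single count without relying on row or column positions.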

3 / 18

slide-8
SLIDE 8

Cross-tabs with proportions

  • Use prop.table() for proportions:

prop.table(table(Before = leaders$civilwarbefore, After = leaders$civilwarafter))
##       After
## Before     0     1
##      0 0.708 0.076
##      1 0.108 0.108

  • We can also ask R to calculate proportions within each row:

prop.table(table(Before = leaders$civilwarbefore, After = leaders$civilwarafter), margin = 1)
##       After
## Before      0      1
##      0 0.9031 0.0969
##      1 0.5000 0.5000
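
The difference between overall, row, and column proportions can be checked without the data file. The sketch below rebuilds the cross-tab from the four counts reported on the slide (177, 19, 27, 27); the object names `tab`, `overall`, `by_row`, and `by_col` are illustrative choices, not part of the lecture code.

```r
# Counts taken from the leaders cross-tab; matrix() fills column by column.
tab <- as.table(matrix(c(177, 27, 19, 27), nrow = 2,
                       dimnames = list(Before = c("0", "1"),
                                       After  = c("0", "1"))))

overall <- prop.table(tab)              # all four cells sum to 1
by_row  <- prop.table(tab, margin = 1)  # each row sums to 1
by_col  <- prop.table(tab, margin = 2)  # each column sums to 1

round(overall, 3)
round(by_row, 4)

# Sanity checks on what each margin normalizes.
rowSums(by_row)
colSums(by_col)
```

Row proportions answer "given the status before, how likely is each status after?", which is usually the more useful comparison here.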

4 / 18

slide-14
SLIDE 14

Scatterplot

  • Direct graphical comparison of two continuous variables.

[Figure: scatterplot titled “Democracy Before and After Assassination Attempts”; x-axis: Democracy Level (Before); y-axis: Democracy Level (After); axis ticks run from -10 to 10.]

5 / 18

slide-16
SLIDE 16

How to create a scatterplot

  • Each point on the scatterplot is one observation, $(x_i, y_i)$
  • Use the plot() function

plot(x = leaders$politybefore, y = leaders$polityafter,
     xlab = "Democracy Level (Before)",
     ylab = "Democracy Level (After)",
     main = "Democracy Before and After Assassination Attempts")
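
For readers without the data file, the same call pattern can be tried on simulated values. This is only a sketch: `before` and `after` are made-up scores on a -10 to 10 scale, and the dashed 45-degree line is an addition that is not on the slide.

```r
# Hypothetical before/after scores on a -10..10 scale (not the leaders data).
set.seed(51)
before <- round(runif(50, min = -10, max = 10))
after  <- pmax(-10, pmin(10, before + round(rnorm(50, sd = 3))))

# Write to a temporary PDF so the sketch also runs non-interactively.
out_file <- tempfile(fileext = ".pdf")
pdf(out_file)
plot(x = before, y = after,
     xlab = "Score (Before)", ylab = "Score (After)",
     main = "Simulated before/after scores")
abline(a = 0, b = 1, lty = 2)  # points on this line did not change
dev.off()
```

In an interactive session the `pdf()` and `dev.off()` lines can be dropped so the plot appears in the graphics window.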

6 / 18

slide-19
SLIDE 19

Scatterplot

leaders[1, c("politybefore", "polityafter")]
##   politybefore polityafter
## 1           -6          -6

[Figure: “Democracy Before and After Assassination Attempts”; x-axis: Democracy Level (Before); y-axis: Democracy Level (After).]

7 / 18

slide-23
SLIDE 23

Scatterplot

leaders[2, c("politybefore", "polityafter")]
##   politybefore polityafter
## 2           -6       -7.33

[Figure: “Democracy Before and After Assassination Attempts”; x-axis: Democracy Level (Before); y-axis: Democracy Level (After).]

9 / 18

slide-24
SLIDE 24

Scatterplot

leaders[3, c("politybefore", "polityafter")]
##   politybefore polityafter
## 3           -6          -8

[Figure: “Democracy Before and After Assassination Attempts”; x-axis: Democracy Level (Before); y-axis: Democracy Level (After).]

10 / 18

slide-26
SLIDE 26

How big is big?

  • Would be nice to have a standard summary of how similar variables are.
  • Problem: variables on different scales!
  • Need a way to put any variable on common units.
  • z-score to the rescue!

$$\text{z-score of } x_i = \frac{x_i - \text{mean of } x}{\text{standard deviation of } x}$$

  • Crucial property: z-scores don’t depend on units (for any rescaling with $a > 0$)

$$\text{z-score of } (a x_i + b) = \text{z-score of } x_i$$
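
The unit-invariance property is easy to verify numerically. The sketch below defines a small helper, `z_score()` (a hypothetical name, not a base R function), and applies it to made-up temperatures in Celsius and Fahrenheit, a rescaling with a = 9/5 and b = 32.

```r
# z-score: subtract the mean, then divide by the standard deviation.
z_score <- function(x) {
  (x - mean(x, na.rm = TRUE)) / sd(x, na.rm = TRUE)
}

# Hypothetical temperatures, first in Celsius, then converted to Fahrenheit.
celsius    <- c(-5, 0, 12, 20, 31)
fahrenheit <- celsius * 9 / 5 + 32   # a = 9/5, b = 32

z_c <- z_score(celsius)
z_f <- z_score(fahrenheit)

# The two sets of z-scores agree up to floating-point error.
all.equal(z_c, z_f)

# Base R's scale() computes the same quantity (as a one-column matrix).
all.equal(z_c, as.numeric(scale(celsius)))
```

After standardizing, every variable has mean 0 and standard deviation 1, which is what makes z-scores comparable across variables.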

12 / 18

slide-31
SLIDE 31

Correlation

  • How do variables move together on average?
  • When $x_i$ is big, what is $y_i$ likely to be?
  • Positive correlation: when $x_i$ is big, $y_i$ is also big
  • Negative correlation: when $x_i$ is big, $y_i$ is small
  • High magnitude of correlation: data cluster tightly around a line.
  • The technical definition of the correlation coefficient:

$$\frac{1}{n - 1} \sum_{i=1}^{n} \left[(\text{z-score for } x_i) \times (\text{z-score for } y_i)\right]$$
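
The definition can be computed by hand and compared with R's built-in `cor()`. This is a sketch on made-up numbers; `z_score()` is a hypothetical helper, and `x` and `y` are not taken from the leaders data.

```r
z_score <- function(v) (v - mean(v)) / sd(v)

# Hypothetical paired measurements (not from the leaders data).
x <- c(2, 4, 5, 7, 9, 10)
y <- c(1, 3, 4, 8, 7, 12)
n <- length(x)

# Sum the products of z-scores and divide by n - 1, as in the definition.
r_manual <- sum(z_score(x) * z_score(y)) / (n - 1)

# The built-in function should give the same number.
r_builtin <- cor(x, y)
all.equal(r_manual, r_builtin)
```

The match relies on `sd()` also dividing by n - 1, so the hand calculation and `cor()` use the same convention.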

13 / 18

slide-37
SLIDE 37

Correlation intuition

[Figure: scatterplot of y against x (axis ticks from -4 to 4), with reference lines at mean(X) and mean(Y).]

14 / 18

slide-38
SLIDE 38

Correlation intuition

[Figure: scatterplot of y against x (axis ticks from -4 to 4), split into four quadrants by reference lines at mean(X) and mean(Y): upper right (X > mean(X), Y > mean(Y)), upper left (X < mean(X), Y > mean(Y)), lower left (X < mean(X), Y < mean(Y)), lower right (X > mean(X), Y < mean(Y)).]

  • Large values of $X$ tend to occur with large values of $Y$:
  • (z-score for $x_i$) × (z-score for $y_i$) = (pos. num.) × (pos. num.) = +
  • Small values of $X$ tend to occur with small values of $Y$:
  • (z-score for $x_i$) × (z-score for $y_i$) = (neg. num.) × (neg. num.) = +
  • If these dominate ⇝ positive correlation.

15 / 18

slide-43
SLIDE 43

Correlation intuition

[Figure: scatterplot of y against x (axis ticks from -4 to 4), split into four quadrants by reference lines at mean(X) and mean(Y): upper right (X > mean(X), Y > mean(Y)), upper left (X < mean(X), Y > mean(Y)), lower left (X < mean(X), Y < mean(Y)), lower right (X > mean(X), Y < mean(Y)).]

  • Large values of $X$ tend to occur with small values of $Y$:
  • (z-score for $x_i$) × (z-score for $y_i$) = (pos. num.) × (neg. num.) = −
  • Small values of $X$ tend to occur with large values of $Y$:
  • (z-score for $x_i$) × (z-score for $y_i$) = (neg. num.) × (pos. num.) = −
  • If these dominate ⇝ negative correlation.
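
The quadrant argument can be illustrated by adding up the z-score products quadrant by quadrant. The sketch below uses simulated data with a built-in positive relationship; the variable names, the seed, and the helper `z_score()` are all illustrative assumptions.

```r
z_score <- function(v) (v - mean(v)) / sd(v)

# Simulated data with a positive relationship (hypothetical, for illustration).
set.seed(51)
x <- rnorm(200)
y <- 0.8 * x + rnorm(200, sd = 0.6)

products <- z_score(x) * z_score(y)

# Label each point by its quadrant relative to the two means.
quadrant <- ifelse(x > mean(x),
                   ifelse(y > mean(y), "upper right", "lower right"),
                   ifelse(y > mean(y), "upper left", "lower left"))

# Total of the z-score products contributed by each quadrant.
contrib <- tapply(products, quadrant, sum)
contrib

# The contributions add up to (n - 1) times the correlation.
all.equal(sum(contrib) / (length(x) - 1), cor(x, y))
```

Here the upper-right and lower-left quadrants contribute positive totals and the other two contribute negative totals; the sign of the correlation depends on which pair dominates.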

16 / 18

slide-48
SLIDE 48

Properties of the correlation coefficient

  • Correlation measures linear association.
  • Interpretation:
  • Correlation is between -1 and 1
  • Correlation of 0 means no linear association.
  • Positive correlations ⇝ positive associations.
  • Negative correlations ⇝ negative associations.
  • Closer to -1 or 1 means stronger association.
  • Order doesn’t matter: cor(x,y) = cor(y,x)
  • Not affected by changes of scale:
  • cor(x,y) = cor(ax+b, cy+d) when a and c have the same sign
  • Celsius vs. Fahrenheit; dollars vs. pesos; cm vs. in.
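
These properties can be checked directly with `cor()`. The data below are made up (temperatures and sales figures), and the exchange rate of 20 is an arbitrary assumption used only to change units.

```r
# Hypothetical data: daily high temperature (Celsius) and sales (dollars).
temp_c  <- c(14, 18, 21, 25, 29, 33)
sales_d <- c(120, 150, 155, 210, 260, 300)

r <- cor(temp_c, sales_d)

# Order doesn't matter.
all.equal(r, cor(sales_d, temp_c))

# Changing units leaves the correlation unchanged
# (a = 9/5, b = 32 for temperature; c = 20, d = 0 for currency).
temp_f  <- temp_c * 9 / 5 + 32
sales_p <- sales_d * 20          # assumed exchange rate, illustration only
all.equal(r, cor(temp_f, sales_p))

# Caveat: multiplying one variable by a negative number flips the sign.
all.equal(-r, cor(-temp_c, sales_d))
```

The last check shows why the scale-invariance property needs the two multipliers to share a sign: reversing the direction of one variable reverses the direction of the association.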

17 / 18

slide-59
SLIDE 59

Correlation in R

  • Use the cor() function
  • Missing values: set use = "pairwise" ⇝ available case analysis

cor(leaders$politybefore, leaders$polityafter, use = "pairwise")
## [1] 0.828

  • Very high correlation!
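
To see what the `use` argument changes, here is a small sketch with made-up vectors containing missing values. It spells the option out in full as "pairwise.complete.obs"; `cor()` also accepts abbreviations such as "pairwise".

```r
# Hypothetical paired data with some missing values.
x <- c(1, 2, NA, 4, 5, 6)
y <- c(2, 1, 4, NA, 5, 7)

# Default behaviour: any NA in the inputs makes the result NA.
cor(x, y)

# Available-case analysis: use only rows where both values are observed.
r_pairwise <- cor(x, y, use = "pairwise.complete.obs")

# The same calculation done by hand on the complete pairs.
ok <- !is.na(x) & !is.na(y)
r_by_hand <- cor(x[ok], y[ok])

all.equal(r_pairwise, r_by_hand)
```

With only two variables, pairwise deletion and dropping every incomplete row give the same answer; the two approaches can differ when `cor()` is applied to a data frame with several columns.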

18 / 18
