Hypotheses with two variates: Paired data

R.W. Oldford
Common hypotheses

Recall some common circumstances and hypotheses:

◮ Hypotheses about the distribution of a single random variable Y for which we have sample values y1, y2, . . . , yn. Common hypotheses include:
  ◮ H0 : µ = E(Y) = µ0 for some specified value µ0, or
  ◮ H0 : Y ∼ FY(y) for some (possibly only partially specified) distribution FY(y).

◮ Hypotheses about the similarity of the distributions of two random variables X and Y for which we have a sample from each, namely x1, x2, . . . , xm and y1, y2, . . . , yn. Common hypotheses include:
  ◮ H0 : µY = µX; the distributions match on some key features (e.g. they share a specified distribution family, like N(µ, σ²), parameterized by these features).
  ◮ H0 : FY(y) = FX(x) without specifying either F().

◮ Hypotheses about the joint distribution of X and Y for which we have sample pairs of values (x1, y1), (x2, y2), . . . , (xn, yn). Common hypotheses include:
  ◮ H0 : ρX,Y = 0; that is, X and Y are uncorrelated.
  ◮ H0 : FX,Y(x, y) = FX(x) × FY(y); that is, X and Y are independent (written X ⊥⊥ Y).
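As a concrete instance of the first case, a test of H0 : µ = µ0 can be carried out with R's built-in one-sample t-test. A minimal illustrative sketch, with simulated data and an arbitrary µ0 = 10 (both are assumptions, not part of these slides):

set.seed(314)
y <- rnorm(25, mean = 11, sd = 3)  # a sample y1, ..., yn
t.test(y, mu = 10)$p.value         # tests H0 : mu = E(Y) = 10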
Common hypotheses - (X, Y ) pairs
Suppose we have a bivariate sample of paired data: (x1, y1), (x2, y2), . . . , (xn, yn). We model these as each being independent realizations of some pair of random variables, (X, Y ), having some joint distribution FX,Y (x, y). For example, consider the following dataset on counts of horseshoe crabs found at various U.S. beaches in 2011 and 2012.
horseshoe <- read.csv(path_concat(dataDirectory, "horseshoeCrabs.csv"))
knitr::kable(head(horseshoe))

Beach           Crabs   Year
Bennetts Pier   35282   2011
Big Stone      359350   2011
Broadkill       45705   2011
Cape Henlopen   49005   2011
Fortescue       68978   2011
Fowler           8700   2011
The pairs are (xi, yi) where xi is the 2011 horseshoe crab count at beach i, and yi is the 2012 count at the same beach i.
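Since the pairing is by beach, it is worth verifying that the rows really do line up before extracting the pairs. A small sketch, assuming the long layout shown above (one row per beach-year) with beaches listed in the same order within each year:

b2011 <- horseshoe$Beach[horseshoe$Year == 2011]
b2012 <- horseshoe$Beach[horseshoe$Year == 2012]
stopifnot(all(b2011 == b2012))  # pairs (xi, yi) are matched by beach i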
Horseshoe crab counts - some hypotheses.
Interest lies in understanding the relationship between the counts in the two successive years. We might be interested in a number of questions. For example,
◮ Are the counts correlated?
◮ Are the counts independent year to year?
◮ Can we predict the counts of one year from the counts of the previous year?
◮ Do the counts have the same distribution from one year to the next?

The horseshoe crab counts are unusual pairs in the sense that both x and y measure the same thing, namely horseshoe crab counts on the same beaches, just for different years. Determining whether X and Y have the same distribution (or distributional features) is rarely going to be of interest otherwise.
Paired data - the same distribution?
If the hypothesis of interest is H0 : FX(x) = FY(y), then we might proceed as before. Doing so, only the marginal distributions are being compared; no use, or even recognition, of the pairing is being made.

This is not generally what we want with paired data. Instead, we imagine the pairing has resulted in two different measurements of the same quantity on the same unit, but under different circumstances. For example, the two measurements might be the values of some response (e.g. systolic blood pressure, pulse, speed, yield of grain, etc.) taken on the same unit before and after some treatment or other change. Because of the inherent differences between units, this matching of measurements on the same unit should make detection of any effect of treatment on response easier by removing the unit-to-unit variability in the comparison.

The pairing (xi, yi) on the same unit i suggests that differences between the xi s and the yi s might be better studied through the difference within each pair, namely zi = xi − yi for i = 1, . . . , n. Under the null hypothesis of no difference, we might expect the distribution of Z to be symmetric about 0.
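One classical way to exploit this symmetry, not pursued on these slides but a natural companion to the tests that follow, is a sign-flip randomization test: under H0 each zi is as likely to be positive as negative, so randomly flipping signs generates new datasets consistent with the null. A minimal sketch (the function name and B are arbitrary choices here):

signFlipTest <- function(z, B = 10000) {
  d_obs <- abs(mean(z))
  # new null datasets: each z_i keeps its magnitude but gets a random sign
  d_sim <- replicate(B, abs(mean(z * sample(c(-1, 1), length(z), replace = TRUE))))
  mean(d_sim >= d_obs)  # proportion of null datasets at least as discrepant
}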
Horseshoe crab counts - the same distribution?
For the horseshoe crab counts, the units are the beaches. The thinking is that comparing the individual distributions will not be as powerful as looking at the differences. For example, just looking at the histograms of counts and of their paired differences suggests that this might indeed be the case.
[Figure: three histograms of frequency versus crab count — the 2011 counts, the 2012 counts, and the paired differences 2011 − 2012.]
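The plotting code is not shown on the slides; a minimal sketch that would produce comparable histograms from the horseshoe data:

x <- horseshoe$Crabs[horseshoe$Year == 2011]
y <- horseshoe$Crabs[horseshoe$Year == 2012]
z <- x - y
op <- par(mfrow = c(1, 3))
hist(x, xlab = "Crab count", main = "2011")
hist(y, xlab = "Crab count", main = "2012")
hist(z, xlab = "Differences in crab count", main = "2011 - 2012")
par(op)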
Horseshoe crab counts - paired t-test
The advantage of pairing can be seen in testing the hypothesis H0 : E(X) = E(Y) via a t-test.

First unpaired:

t.test(Crabs ~ Year, data = horseshoe)$p.value

## [1] 0.1124666

Then paired:

t.test(Crabs ~ Year, data = horseshoe, paired = TRUE)$p.value

## [1] 0.04528869

These were based on assuming normals (independent samples in the first case, not so in the second). The histograms for 2011 and 2012 did not look that symmetric, let alone Gaussian, although the differences looked a little better.
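Note that the paired t-test is identical to a one-sample t-test applied to the differences zi = xi − yi, which makes the connection to the previous slide explicit:

z <- horseshoe$Crabs[horseshoe$Year == 2011] - horseshoe$Crabs[horseshoe$Year == 2012]
t.test(z)$p.value  # same p-value as the paired test above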
Horseshoe crab counts - paired t-test
We could instead have transformed the counts, say to the square root scale, effectively testing a different hypothesis, namely H0 : E(√X) = E(√Y), via the t-tests.

First unpaired:

t.test(sqrt(Crabs) ~ Year, data = horseshoe)$p.value

## [1] 0.1467886

Then paired:

t.test(sqrt(Crabs) ~ Year, data = horseshoe, paired = TRUE)$p.value

## [1] 0.02448857

Normality seems easier to justify on the square root scale (independent samples in the first case, not so in the second). Normal or not, in both cases the pairing makes it easier to detect a difference.
Horseshoe crab counts - paired Wilcoxon signed rank test
Alternatively, we might choose a test based simply on the signed ranks of the |zi|s. A test based on this is the Wilcoxon signed rank test. Basically it replaces the zi s by wi = sign(zi) × rank(|zi|) and uses |w̄| as a discrepancy. The Wilcoxon signed rank test is often recommended when the distribution of the differences zi is not likely to be normal. This too can be carried out easily in R.

wilcox.test(Crabs ~ Year, data = horseshoe, paired = TRUE)$p.value

## [1] 0.01874471

Note that here too, had we first transformed to a square root scale, a different result would be obtained.

wilcox.test(sqrt(Crabs) ~ Year, data = horseshoe, paired = TRUE)$p.value

## [1] 0.01246649

All of these test slightly different hypotheses, and all of the paired tests indicate evidence against the hypothesis that the corresponding expectations are identical.
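The signed ranks are easy to compute directly, which makes the construction concrete; the statistic V reported by wilcox.test() is the sum of the positive signed ranks (assuming no ties or zero differences):

z <- horseshoe$Crabs[horseshoe$Year == 2011] - horseshoe$Crabs[horseshoe$Year == 2012]
w <- sign(z) * rank(abs(z))  # the signed ranks w_i
sum(w[w > 0])                # the V statistic of wilcox.test(z)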
Horseshoe crab counts - pairing and simulation
Since the pairing brings us back to a univariate distribution, we might use the zeroed-mean sample quantile function of the zi s to generate new data (from a distribution of the same shape but with zero mean). The discrepancy measure is simply d = |z̄|:

discrepancyAbsAve <- function(data) {abs(mean(data))}

And the test would be

x <- horseshoe$Crabs[horseshoe$Year == 2011]
y <- horseshoe$Crabs[horseshoe$Year == 2012]
z <- x - y
numericalTest(z, discrepancyFn = discrepancyAbsAve, generateFn = generateFromMeanShiftData)

## [1] 0.0055

Again indicating (even stronger) evidence against the hypothesis H0 : E(Z) = 0.

Note: This numerical test has not as yet been subjected to sufficient scrutiny for it to be generally recommended. Certainly, its value will improve with increasing n. Nevertheless, at least for exploratory work by a knowledgeable analyst, it may be helpful. Cf. bootstrapping and permuting paired t-test type statistics (Statistics and Computing, 2014).
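numericalTest() and generateFromMeanShiftData() are helpers from earlier in the course; their definitions do not appear on these slides. A rough sketch of the kind of thing they do, with the internal details (resampling scheme, number B of simulated datasets) being guesses rather than the course's actual code:

# Hypothetical sketch only -- not the course's actual definitions
generateFromMeanShiftData <- function(data) {
  # draw from the sample quantile function of the data shifted to mean zero
  sample(data - mean(data), length(data), replace = TRUE)
}
numericalTest <- function(data, discrepancyFn, generateFn, B = 2000) {
  d_obs <- discrepancyFn(data)
  d_sim <- replicate(B, discrepancyFn(generateFn(data)))
  mean(d_sim >= d_obs)  # simulated p-value
}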
More typically paired data
Unlike the horseshoe crab counts, (x, y) pairs are more typically measurements on two completely different variates, for example (wt, mpg) from the mtcars dataset, and so on. For such data, the more interesting questions include:

◮ Are the variates correlated? A hypothesis might be H0 : ρX,Y = 0.
◮ Are the variates independently distributed?
◮ Can we predict the value of one variate given values of the other?
◮ Can we provide a simple summary of the dependency relation between the two variates?

Of these, the last two really require some modelling of the data to answer. The first also requires some modelling: the problem is that it isn't clear how to generate data under the null hypothesis that ρX,Y = 0 without imposing some fairly detailed structure on FX,Y(x, y).

For example, if FX,Y(x, y) is a bivariate normal distribution, then ρX,Y = 0 if and only if FX,Y(x, y) = FX(x) × FY(y); that is, X and Y are independent (X ⊥⊥ Y). Note that X ⊥⊥ Y ⟹ ρX,Y = 0, but not vice versa; the bivariate normal is the exception, not the rule.
Paired data - testing for independence
The one question which can be addressed without imposing too much additional structure on FX,Y(x, y) is that of independence: H0 : X ⊥⊥ Y. That is because, under H0, any x might be paired with any y. This suggests that to generate data under H0 we could just randomly permute the values of one of them, say the yi s, and then pair these up with the xi s in their original order. A function which would do exactly this is

mixCoords <- function(data) {
  # Note that data needs to have named x and y components
  x <- data$x
  y <- data$y
  n <- length(x)
  stopifnot(n == length(y))
  new_y <- sample(y, n, replace = FALSE)
  list(x = x, y = new_y)
}

This can then be used with whatever discrepancy measure might be proposed.
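For instance, with toy values (illustrative only), each call re-pairs the same y values with the x values in a new random order:

set.seed(123)
mixCoords(list(x = 1:5, y = c(10, 20, 30, 40, 50)))
# x is unchanged; y comes back as a random permutation of 10, ..., 50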
Testing for independence - continuous data
For example, we could test whether mileage (mpg) and car weight (wt) are independently distributed using mtcars.

[Figure: scatterplot of mpg versus wt from mtcars.]

The scatterplot in itself suggests a strong dependence: as car weight increases, the mileage goes down. A simple summary of that dependence is the least-squares fitted line.
Testing for independence - continuous data
Since X ⊥⊥ Y ⟹ ρX,Y = 0, we might even use the sample correlation as a discrepancy measure. The sample correlation is

ρ̂X,Y = Σⁿᵢ₌₁ (xi − x̄)(yi − ȳ) / √( (Σⁿᵢ₌₁ (xi − x̄)²) × (Σⁿᵢ₌₁ (yi − ȳ)²) ).

If X ⊥⊥ Y then ρ̂X,Y should be near zero (since ρX,Y must be). The larger |ρ̂X,Y| is, the greater the evidence against the hypothesis.

correlationDiscrepancy <- function(data) {
  # Note that data needs to have named x and y components
  x <- data$x
  y <- data$y
  abs(cor(x, y))
}

Correlation chases a linear relationship. Not surprisingly, then, we might also consider fitting a straight line y = α + βx + r to the data pairs and using |β̂| as the discrepancy.

slopeDiscrepancy <- function(data) {
  # Note that data needs to have named x and y components
  fit <- lm(y ~ x, data = data)
  abs(fit$coef[2])
}
Testing for independence - continuous data
Using these discrepancies, and generating data under the assumption of independence, a numerical test of the independence of mileage (mpg) and car weight (wt) using mtcars is straightforward.

Using correlation as the discrepancy:

numericalTest(list(x = mtcars$wt, y = mtcars$mpg),
              discrepancyFn = correlationDiscrepancy,
              generateFn = mixCoords)

## [1] 0

Using the estimated slope as the discrepancy:

numericalTest(list(x = mtcars$wt, y = mtcars$mpg),
              discrepancyFn = slopeDiscrepancy,
              generateFn = mixCoords)

## [1] 0

Both give extraordinarily strong evidence against independence!
Testing for independence - problems with correlation
Of course, this test is based on using discrepancy measures that essentially test for a straight-line relationship as the departure from H0 : X ⊥⊥ Y. This may not be the best test. For example, the following pairs of variates all have the same correlation, the same slope estimate, the same R², . . . (Anscombe, 1973) but show very different departures from independence.

[Figure: Anscombe's quartet — four scatterplots of y versus x, each with r = 0.82.]
Testing for independence - problems with correlation
Moreover, it is easy to construct situations where ρX,Y = 0 but where it is obvious that X and Y are not independent!

[Figure: two scatterplots, each with r = 0, yet each showing a clear pattern relating x and y.]

No correlation, but clearly a relationship between X and Y!
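A quick way to see this for yourself (not necessarily the data in the plots above): if x is symmetric about 0, then x and x² have exactly zero sample correlation even though y is a deterministic function of x.

x <- seq(-40, 40, by = 1)
y <- x^2    # y is completely determined by x ...
cor(x, y)   # ... yet the sample correlation is exactly 0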
Testing for independence - choice of discrepancy measure
It is important to understand what kind of departure from the null hypothesis is actually being quantified by the chosen discrepancy measure. With a hypothesis like H0 : X ⊥⊥ Y, there are infinitely many ways in which the independence might fail. Expecting a single numerical discrepancy measure to capture them all seems a bit naive; we need some idea of the sort of departure that might matter most to us.

Correlation and straight lines capture a sort of simple dependence of the mean of one variate on the other. More generally, if X ⊥⊥ Y, then

µ(x) = E(Y | X = x) = E(Y).

That is, the average of the y values should be the same no matter what their x values are. The straight-line model (and correlation) simply restrict consideration to the possibility that µ(x) = α + βx, which will not depend on the value of x iff β = 0, hence the discrepancy measure |β̂|. (ρ̂X,Y sort of simultaneously assesses whether E(X | Y = y) = E(X), treating X and Y symmetrically.)
Testing for independence - a loess discrepancy measure
If departure from independence as measured by the mean dependency µ(x) = E(Y | X = x) is of some import, then we might want to choose a more flexible model than a simple straight line. We might, for example, choose a smooth model for µ(x) like that given by loess(). One such discrepancy measure would be

loessDiscrepancy <- function(data) {
  fit <- loess(y ~ x, data = data,        # default span
               family = "symmetric")      # robust family
  sd(fit$fitted) / sd(data$y)             # proportion of sd of the ys explained by mu
}

The larger this is, the better the smooth µ̂(x) fits the ys.

Note:
◮ we use a ratio of standard deviations instead of the usual variances
◮ we could have used square roots of sums of squares instead
◮ when using mixCoords() the sd(y) never changes, so we could just use the fitted values alone
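Evaluated directly on the data (rather than inside numericalTest()), the observed discrepancy is just a single number. For example, for mileage versus weight:

loessDiscrepancy(list(x = mtcars$wt, y = mtcars$mpg))  # the observed discrepancy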
Loess discrepancy measure - examples
Mileage versus weight (mtcars):
[Figure: scatterplot of mpg versus wt with the least-squares line and the loess smooth overlaid.]

numericalTest(list(x = mtcars$wt, y = mtcars$mpg),
              discrepancyFn = loessDiscrepancy,
              generateFn = mixCoords)

## [1] 0.001
Loess discrepancy measure - examples
Horseshoe crabs
[Figure: scatterplot of horseshoe crab counts, 2012 versus 2011, with the least-squares line and the loess smooth overlaid.]

y2011 <- horseshoe$Year == 2011
y2012 <- horseshoe$Year == 2012
numericalTest(list(x = horseshoe$Crabs[y2011],
                   y = horseshoe$Crabs[y2012]),
              discrepancyFn = loessDiscrepancy,
              generateFn = mixCoords)

## [1] 0.0055
Loess discrepancy measure - examples
Horseshoe crabs - square root scale
[Figure: scatterplot of the square roots of the horseshoe crab counts, 2012 versus 2011, with the least-squares line and the loess smooth overlaid.]

y2011 <- horseshoe$Year == 2011
y2012 <- horseshoe$Year == 2012
numericalTest(list(x = sqrt(horseshoe$Crabs[y2011]),
                   y = sqrt(horseshoe$Crabs[y2012])),
              discrepancyFn = loessDiscrepancy,
              generateFn = mixCoords)

## [1] 5e-04
Loess discrepancy measure - examples
Facebook data
[Figure: scatterplot of log10(like + 1) versus log10(Impressions + 1) for the Facebook data, with the least-squares line and the loess smooth overlaid.]

numericalTest(list(x = log10(facebook$Impressions + 1),
                   y = log10(facebook$like + 1)),
              discrepancyFn = loessDiscrepancy,
              generateFn = mixCoords)

## [1] NA
Graphical discrepancy measures
How about a simple scatterplot?

showScatter <- function(data, subjectNo) {
  plot(y ~ x, data = data,
       main = paste(subjectNo), cex.main = 4,
       xlab = "", ylab = "",
       pch = 19, cex = 2,
       col = adjustcolor("steelblue", 0.75),
       xaxt = "n", yaxt = "n"  # turn off axes
  )
}

The line-up test would then be

data <- list(x = mtcars$wt, y = mtcars$mpg)
lineup(data,
       generateSubject = mixCoords,
       showSubject = showScatter,
       layout = c(4, 5))
Graphical discrepancy measures - simple scatterplot
[Figure: a line-up of 20 scatterplots, numbered 1 to 20, exactly one of which shows the real (wt, mpg) data.]

True Location: log(1.25823822102331e+107, base=21) - 68 = 13
Graphical discrepancy measures - two dimensional density estimates
When there are a lot of points the pattern might be obscured by overstriking: high-density regions might appear the same as some lower-density regions. As with one-dimensional data, we could construct a kernel density estimate

f̂(x, y) = (1/n) Σⁿᵢ₌₁ Kh(xi, yi; x, y)

where, for example, Kh(· · ·) is a bivariate kernel function like a bivariate normal density:

Kh(xi, yi; x, y) = (1 / (2π hx hy)) exp( −(1/2) [ ((xi − x)/hx)² + ((yi − y)/hy)² ] ).

This function is radially symmetric about the point (x, y). Values of hx and hy are usually chosen separately to match the scale in x and in y, respectively. (These will appear to be radially symmetric in the plot.)

A two-dimensional kernel density estimate analogous to a one-dimensional density estimate is available in the R package MASS.
Graphical discrepancy measures - kde2d(...)
The function kde2d(...) from the MASS package constructs the density estimate which now must be plotted as contours of constant density.
library(MASS)
x <- mtcars$wt
y <- mtcars$mpg
# Two dimensional kernel density estimate
den <- kde2d(x = x, y = y, n = 100)
zlim <- range(den$z)
plot(x, y, pch = 19, xlab = "wt", ylab = "mpg",
     col = adjustcolor("steelblue", 0.5))
contour(den$x, den$y, den$z, col = "grey10",
        levels = pretty(zlim, 10), lwd = 1, add = TRUE)
[Figure: scatterplot of mpg versus wt with contours of constant estimated density (levels from about 0.005 to 0.04) overlaid.]