Logistic Regression: MLE vs. OLS1 in Excel2013 29 Aug 2016 V0B V0B - - PDF document

logistic regression mle vs ols1 in excel2013 29 aug 2016
SMART_READER_LITE
LIVE PREVIEW

Logistic Regression: MLE vs. OLS1 in Excel2013 29 Aug 2016 V0B V0B - - PDF document

Logistic Regression: MLE vs. OLS1 in Excel2013 29 Aug 2016 V0B V0B V0B Schield MLE vs. OLS1-Based Logistic Excel 2013 1 Schield MLE vs. OLS1-Based Logistic Excel 2013 2 Logistic Regression: Background & Goals MLE vs. OLS1 in Excel 2013


slide-1
SLIDE 1

Logistic Regression: MLE vs. OLS1 in Excel2013 29 Aug 2016 V0B 2016-Schield-Logistic-MLE-OLS1-Excel2013-Slides.pdf 1

Schield MLE vs. OLS1-Based Logistic Excel 2013

V0B 1

by Milo Schield Member: International Statistical Institute US Rep: International Statistical Literacy Project Director, W. M. Keck Statistical Literacy Project

Slides and data at: www.StatLit.org/

pdf/2016-Schield-Logistic-MLE-OLS1-Excel2013-slides.pdf pdf/2016-Schield-Logistic-MLE-OLS1-Excel2013-demo.pdf Excel/2016-Schield-Logistic-MLE-OLS1-Excel2013.xlsx

Logistic Regression: MLE vs. OLS1 in Excel 2013

Schield MLE vs. OLS1-Based Logistic Excel 2013

V0B 2

Background & Goals

Modeling a binary outcome requires a logistic model. Doing logistic regression properly requires MLE. Doing MLE in Excel is not easy. See Schield (2015) Schield created two logistic OLS models: OLS1+OLS3 OLS1: Model Ln[Odds(Pnudge)]. See Schield (2014a). OLS2: Model Ln[Odds(Pgroup)]. See Schield (2016c) OLS3: Use OLS to estimate logistic parameters. See Schield (2014b) These slides compare MLE with OLS1 logistic regression

Schield MLE vs. OLS1-Based Logistic Excel 2013

V0B 3

Model Gender by Height (OLS) Must use a logistic function

This linear trend-line goes outside the valid range

Schield MLE vs. OLS1-Based Logistic Excel 2013

V0B 4

Model using a Logistic Function

Range of Odds(p): Zero to infinity Range of Ln[Odds(p)]: Minus infinity to infinity Logistic model: Ln[Odds(p)] = a + b*X

0.2 0.4 0.6 0.8 1 61 63 65 67 69 71 73 75 Height (inches)

Logistic P(male|Height)

Pulse data Schield V0A

Schield MLE vs. OLS1-Based Logistic Excel 2013

V0B

Source: Pulse dataset

5

|Ln[Odds(p)]| = infinity if p = 0 or 1 Pnudge: =If(p=0, 0.001, 0.999)

OLS1 Model : Ln[Odds(Pnudge)] = Constant + b1*X1 + b2*X2 using Ordinary Least Squares.

Schield MLE vs. OLS1-Based Logistic Excel 2013

V0B 6

1a: Logistic P(male|Height) MLE

0.25 0.5 0.75 1 60 62 64 66 68 70 72 74 76 Probability (Male) Height (inches)

Gender by Height

Pulse.xls MLE Logistic Schield Excel 2013 Women Men MLE Logistic regression P(Y|X) = 1 / {1 + Exp[‐(a + bX)]} a = ‐53.32 ; b = 0.7905. Xo = ‐a/b = 67.4529 Slope(X=Xo) = b/4 = 0.1976 X = Xo if P(Y) = 0.5

slide-2
SLIDE 2

Logistic Regression: MLE vs. OLS1 in Excel2013 29 Aug 2016 V0B 2016-Schield-Logistic-MLE-OLS1-Excel2013-Slides.pdf 2

Schield MLE vs. OLS1-Based Logistic Excel 2013

V0B

.

7

1b: Logistic P(male|Height) OLS1

0.2 0.4 0.6 0.8 1 61 63 65 67 69 71 73 75

Height (inches)

Logistic P(male|Height): OLS1

OLS1A (dash): Ln[Odds(Male1)] = ‐88.80 + 1.316*Height Pulse data Schield V0B

Schield MLE vs. OLS1-Based Logistic Excel 2013

V0B 8

1c: Logistic P(male|Height) MLE vs. OLS1

0.2 0.4 0.6 0.8 1 61 63 65 67 69 71 73 75

Height (inches)

Logistic P(male|Height): MLE vs. OLS1

OLS1A (dash): Ln[Odds(Male1)] = ‐88.80 + 1.316*Height MLE1A (solid): Ln[Odds(Male)] = ‐55.32 + 0.7905*Height Pulse data Schield V0B

Schield MLE vs. OLS1-Based Logistic Excel 2013

V0B

.

9

2a: Logistic P(male|Weight) MLE

0.2 0.4 0.6 0.8 1 95 115 135 155 175 195

Weight (pounds)

Logistic P(male|Weight): MLE

MLE1B (solid): Ln[Odds(Male)] = ‐21.48 + 0.1577*Weight Pulse data Schield V0A

Schield MLE vs. OLS1-Based Logistic Excel 2013

V0B

.

10

2b: Logistic P(male|Weight) OLS1

0.2 0.4 0.6 0.8 1 95 115 135 155 175 195

Weight (pounds)

Logistic P(male|Weight): OLS1

OLS1B (dash): Ln[Odds(Male)] = ‐27.57 + 0.2013*Weight Pulse data Schield V0A

Schield MLE vs. OLS1-Based Logistic Excel 2013

V0B

.

11

2c: Logistic P(male|Weight) MLE vs. OLS1

0.2 0.4 0.6 0.8 1 95 115 135 155 175 195

Weight (pounds)

Logistic P(male|Weight): MLE vs. OLS1

OLS1B (dash): Ln[Odds(Male)] = ‐27.57 + 0.2013*Weight MLE1B (solid): Ln[Odds(Male)] = ‐21.48 + 0.1577*Weight Pulse data Schield V0A

Schield MLE vs. OLS1-Based Logistic Excel 2013

V0B

.

12

3a: Logistic P(male|Ht, Wt=Ave) MLE

0.2 0.4 0.6 0.8 1 61 63 65 67 69 71 73 75

Height (inches)

Logistic P(male|Ht, Wt=Ave): MLE

MLE1C (solid): Ln[Odds(Male)] = ‐41.40 + 0.3817*Height + 0.1146*AveWt Pulse data Schield V0A

slide-3
SLIDE 3

Logistic Regression: MLE vs. OLS1 in Excel2013 29 Aug 2016 V0B 2016-Schield-Logistic-MLE-OLS1-Excel2013-Slides.pdf 3

Schield MLE vs. OLS1-Based Logistic Excel 2013

V0B

.

13

3b: Logistic P(male|Ht, Wt=Ave) OLS1

0.2 0.4 0.6 0.8 1 61 63 65 67 69 71 73 75

Height (inches)

Logistic P(male|Ht, Wt=Ave): OLS1

OLS1C (dash): Ln[Odds(Male)] = ‐66.374 + 0.7586*Height + 0.1095*AveWt Pulse data Schield V0A

Schield MLE vs. OLS1-Based Logistic Excel 2013

V0B

.

14

3c: Logistic P(male|Ht, Wt=Ave) MLE vs. OLS1

0.2 0.4 0.6 0.8 1 61 63 65 67 69 71 73 75

Height (inches)

Logistic P(male|Ht, Wt=Ave): MLE vs. OLS1

OLS1B (dash): Ln[Odds(Male)] = ‐66.374 + 0.7586*Height + 0.1095*AveWt MLE1B (solid): Ln[Odds(Male)] = ‐41.40 + 0.3817*Height + 0.1146*AveWt Pulse data Schield V0A

Schield MLE vs. OLS1-Based Logistic Excel 2013

V0B

  • 1. Can use OLS with a logistic function to

illustrate logistic regression since logistic- OLS is better than linear OLS.

  • 2. OLS with ‘nudge’, a single predictor and

logistic (OLS1A) is quite close to the MLE.

  • 3. OLS with ‘nudge’ and logistic worsens as

number of predictors increases (OLS1C) .

  • 4. If higher accuracy is needed, use MLE

(Excel or other) or consult a statistician.

15

Conclusions

Schield MLE vs. OLS1-Based Logistic Excel 2013

V0B

Schield, Milo (2015). Copy at www.statlit.org/pdf/ 2015-Schield-Logistic-MLE1A-Excel2013-Slides.pdf 2015-Schield-Logistic-MLE1B-Excel2013-Slides.pdf 2015-Schield-Logistic-MLE1C-Excel2013-Slides.pdf Schield, Milo (2014a). Copy at www.statlit.org/pdf/ 2014-Schield-Logistic-OLS1A-Excel2013-Demo.pdf 2014-Schield-Logistic-OLS1B-Excel2013-Demo.pdf 2014-Schield-Logistic-OLS1C-Excel2013-Demo.pdf Schield, Milo (2014b). Copy at www.statlit.org/pdf/ 2014-Schield-Logistic-MLE-OLS3-Excel2013-Demo.pdf

16

References

slide-4
SLIDE 4

Schield MLE vs. OLS1-Based Logistic Excel 2013

V0B 1

by Milo Schield Member: International Statistical Institute US Rep: International Statistical Literacy Project Director, W. M. Keck Statistical Literacy Project

Slides and data at: www.StatLit.org/

pdf/2016-Schield-Logistic-MLE-OLS1-Excel2013-slides.pdf pdf/2016-Schield-Logistic-MLE-OLS1-Excel2013-demo.pdf Excel/2016-Schield-Logistic-MLE-OLS1-Excel2013.xlsx

Logistic Regression: MLE vs. OLS1 in Excel 2013

slide-5
SLIDE 5

Schield MLE vs. OLS1-Based Logistic Excel 2013

V0B 2

Background & Goals

Modeling a binary outcome requires a logistic model. Doing logistic regression properly requires MLE. Doing MLE in Excel is not easy. See Schield (2015) Schield created two logistic OLS models: OLS1+OLS3 OLS1: Model Ln[Odds(Pnudge)]. See Schield (2014a). OLS2: Model Ln[Odds(Pgroup)]. See Schield (2016c) OLS3: Use OLS to estimate logistic parameters. See Schield (2014b) These slides compare MLE with OLS1 logistic regression

slide-6
SLIDE 6

Schield MLE vs. OLS1-Based Logistic Excel 2013

V0B 3

Model Gender by Height (OLS) Must use a logistic function

This linear trend-line goes outside the valid range

slide-7
SLIDE 7

Schield MLE vs. OLS1-Based Logistic Excel 2013

V0B 4

Model using a Logistic Function

Range of Odds(p): Zero to infinity Range of Ln[Odds(p)]: Minus infinity to infinity Logistic model: Ln[Odds(p)] = a + b*X

0.2 0.4 0.6 0.8 1 61 63 65 67 69 71 73 75

Height (inches)

Logistic P(male|Height)

Pulse data Schield V0A

slide-8
SLIDE 8

Schield MLE vs. OLS1-Based Logistic Excel 2013

V0B

Source: Pulse dataset

5

|Ln[Odds(p)]| = infinity if p = 0 or 1 Pnudge: =If(p=0, 0.001, 0.999)

OLS1 Model : Ln[Odds(Pnudge)] = Constant + b1*X1 + b2*X2 using Ordinary Least Squares.

slide-9
SLIDE 9

Schield MLE vs. OLS1-Based Logistic Excel 2013

V0B 6

1a: Logistic P(male|Height) MLE

0.25 0.5 0.75 1 60 62 64 66 68 70 72 74 76

Probability (Male) Height (inches)

Gender by Height

Pulse.xls MLE Logistic Schield Excel 2013 Women Men MLE Logistic regression P(Y|X) = 1 / {1 + Exp[-(a + bX)]} a = -53.32 ; b = 0.7905. Xo = -a/b = 67.4529 Slope(X=Xo) = b/4 = 0.1976 X = Xo if P(Y) = 0.5

slide-10
SLIDE 10

Schield MLE vs. OLS1-Based Logistic Excel 2013

V0B

.

7

1b: Logistic P(male|Height) OLS1

0.2 0.4 0.6 0.8 1 61 63 65 67 69 71 73 75

Height (inches)

Logistic P(male|Height): OLS1

OLS1A (dash): Ln[Odds(Male1)] = -88.80 + 1.316*Height Pulse data Schield V0B

slide-11
SLIDE 11

Schield MLE vs. OLS1-Based Logistic Excel 2013

V0B 8

1c: Logistic P(male|Height) MLE vs. OLS1

0.2 0.4 0.6 0.8 1 61 63 65 67 69 71 73 75

Height (inches)

Logistic P(male|Height): MLE vs. OLS1

OLS1A (dash): Ln[Odds(Male1)] = -88.80 + 1.316*Height MLE1A (solid): Ln[Odds(Male)] = -55.32 + 0.7905*Height Pulse data Schield V0B

slide-12
SLIDE 12

Schield MLE vs. OLS1-Based Logistic Excel 2013

V0B

.

9

2a: Logistic P(male|Weight) MLE

0.2 0.4 0.6 0.8 1 95 115 135 155 175 195

Weight (pounds)

Logistic P(male|Weight): MLE

MLE1B (solid): Ln[Odds(Male)] = -21.48 + 0.1577*Weight Pulse data Schield V0A

slide-13
SLIDE 13

Schield MLE vs. OLS1-Based Logistic Excel 2013

V0B

.

10

2b: Logistic P(male|Weight) OLS1

0.2 0.4 0.6 0.8 1 95 115 135 155 175 195

Weight (pounds)

Logistic P(male|Weight): OLS1

OLS1B (dash): Ln[Odds(Male)] = -27.57 + 0.2013*Weight Pulse data Schield V0A

slide-14
SLIDE 14

Schield MLE vs. OLS1-Based Logistic Excel 2013

V0B

.

11

2c: Logistic P(male|Weight) MLE vs. OLS1

0.2 0.4 0.6 0.8 1 95 115 135 155 175 195

Weight (pounds)

Logistic P(male|Weight): MLE vs. OLS1

OLS1B (dash): Ln[Odds(Male)] = -27.57 + 0.2013*Weight MLE1B (solid): Ln[Odds(Male)] = -21.48 + 0.1577*Weight Pulse data Schield V0A

slide-15
SLIDE 15

Schield MLE vs. OLS1-Based Logistic Excel 2013

V0B

.

12

3a: Logistic P(male|Ht, Wt=Ave) MLE

0.2 0.4 0.6 0.8 1 61 63 65 67 69 71 73 75

Height (inches)

Logistic P(male|Ht, Wt=Ave): MLE

MLE1C (solid): Ln[Odds(Male)] = -41.40 + 0.3817*Height + 0.1146*AveWt Pulse data Schield V0A

slide-16
SLIDE 16

Schield MLE vs. OLS1-Based Logistic Excel 2013

V0B

.

13

3b: Logistic P(male|Ht, Wt=Ave) OLS1

0.2 0.4 0.6 0.8 1 61 63 65 67 69 71 73 75

Height (inches)

Logistic P(male|Ht, Wt=Ave): OLS1

OLS1C (dash): Ln[Odds(Male)] = -66.374 + 0.7586*Height + 0.1095*AveWt Pulse data Schield V0A

slide-17
SLIDE 17

Schield MLE vs. OLS1-Based Logistic Excel 2013

V0B

.

14

3c: Logistic P(male|Ht, Wt=Ave) MLE vs. OLS1

0.2 0.4 0.6 0.8 1 61 63 65 67 69 71 73 75

Height (inches)

Logistic P(male|Ht, Wt=Ave): MLE vs. OLS1

OLS1B (dash): Ln[Odds(Male)] = -66.374 + 0.7586*Height + 0.1095*AveWt MLE1B (solid): Ln[Odds(Male)] = -41.40 + 0.3817*Height + 0.1146*AveWt Pulse data Schield V0A

slide-18
SLIDE 18

Schield MLE vs. OLS1-Based Logistic Excel 2013

V0B

  • 1. Can use OLS with a logistic function to

illustrate logistic regression since logistic- OLS is better than linear OLS.

  • 2. OLS with ‘nudge’, a single predictor and

logistic (OLS1A) is quite close to the MLE.

  • 3. OLS with ‘nudge’ and logistic worsens as

number of predictors increases (OLS1C) .

  • 4. If higher accuracy is needed, use MLE

(Excel or other) or consult a statistician.

15

Conclusions

slide-19
SLIDE 19

Schield MLE vs. OLS1-Based Logistic Excel 2013

V0B

Schield, Milo (2015). Copy at www.statlit.org/pdf/ 2015-Schield-Logistic-MLE1A-Excel2013-Slides.pdf 2015-Schield-Logistic-MLE1B-Excel2013-Slides.pdf 2015-Schield-Logistic-MLE1C-Excel2013-Slides.pdf Schield, Milo (2014a). Copy at www.statlit.org/pdf/ 2014-Schield-Logistic-OLS1A-Excel2013-Demo.pdf 2014-Schield-Logistic-OLS1B-Excel2013-Demo.pdf 2014-Schield-Logistic-OLS1C-Excel2013-Demo.pdf Schield, Milo (2014b). Copy at www.statlit.org/pdf/ 2014-Schield-Logistic-MLE-OLS3-Excel2013-Demo.pdf

16

References