SLIDE 1

FDR and Online FDR

Adel Javanmard and Andrea Montanari

USC and Stanford

December 11, 2015

Andrea Montanari (Stanford) FDR and Online FDR December 11, 2015 1 / 34

SLIDE 2

Outline

1. Large-scale Hypothesis Testing
2. Controlling FDR
3. Controlling Online FDR
4. Conclusion

SLIDE 3

Large-scale Hypothesis Testing

SLIDE 4

Assume

◮ I am the CTO of a big web company
◮ $\approx 1000$ data scientists
◮ $\approx 1000$ ‘brilliant ideas’ per day
    ◮ Users are more likely to click on the first search result
    ◮ Users are more likely to click on top-right ads
    ◮ Users are more engaged with page layout A

◮ How to avoid wasting company resources?

Compute ‘significance level’ from data!

SLIDE 10

Example

Idea: Users click more on the first search result than on the second

Null $H_0$: Users are equally likely to click on the first and on the second result

Data:

◮ $n$ events
◮ $n_1$ clicks on the first result
◮ $n_2 = n - n_1$ clicks on the second result

Idea:

$$ H_0 \;\Rightarrow\; z \equiv \frac{n_1 - n_2}{\sqrt{n}} \approx N(0, 1) $$

◮ If $z \gg 1$, then declare it significant

SLIDE 14

Formally

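$$ z \equiv \frac{n_1 - n_2}{\sqrt{n}} \approx N(0, 1) $$

p-value ($G \sim N(0, 1)$):

$$ p \equiv P(G \ge z) = \int_z^\infty \frac{e^{-x^2/2}}{\sqrt{2\pi}} \, dx $$

◮ Null: $p \sim \mathrm{Uniform}([0, 1])$ (Definition)
◮ Small $p$: significant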
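The two formulas above can be sketched in a few lines of Python. This is only an illustration: the click counts are hypothetical, the function names are made up for this sketch, and the Gaussian tail is evaluated through the identity $P(G \ge z) = \tfrac{1}{2}\,\mathrm{erfc}(z/\sqrt{2})$.

```python
import math

def z_statistic(n1: int, n2: int) -> float:
    """z = (n1 - n2) / sqrt(n), with n = n1 + n2 total click events."""
    n = n1 + n2
    if n == 0:
        raise ValueError("need at least one click event")
    return (n1 - n2) / math.sqrt(n)

def one_sided_p_value(z: float) -> float:
    """p = P(G >= z) for G ~ N(0, 1), via the complementary error function."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))

# Hypothetical counts: 560 clicks on the first result, 440 on the second.
z = z_statistic(560, 440)
p = one_sided_p_value(z)
print(f"z = {z:.3f}, p = {p:.2e}")
```

With equal counts the statistic is $z = 0$ and the p-value is exactly $1/2$, which is a quick sanity check on the tail formula.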

SLIDE 16

Company policy

Bring your idea up only if $p \le \alpha$

[$\alpha = 0.05$, Fisher’s rule of thumb]

SLIDE 18

Problem

◮ $M \approx 1000$ hypotheses per day
◮ $M\alpha \approx 1000 \cdot 0.05 = 50$ pass the test by chance alone, even if no idea is real
◮ Still too much waste

New company policy (Bonferroni): Bring up your idea only if $p \le \alpha_M = \alpha/M$
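The arithmetic behind the two policies can be checked with a small simulation. The sketch below assumes the worst case in which every hypothesis is a true null, so every p-value is uniform on $[0, 1]$; the number of simulated days, the seed, and the function names are arbitrary choices for this illustration.

```python
import random

def count_passing(p_values, threshold):
    """Number of p-values at or below the threshold."""
    return sum(1 for p in p_values if p <= threshold)

def average_false_positives(M, alpha, days, seed=0):
    """Average per-day count of null p-values passing each policy."""
    rng = random.Random(seed)
    naive_total = 0
    bonferroni_total = 0
    for _ in range(days):
        # Every hypothesis is a true null, so every p-value is Uniform([0, 1]).
        p_values = [rng.random() for _ in range(M)]
        naive_total += count_passing(p_values, alpha)
        bonferroni_total += count_passing(p_values, alpha / M)
    return naive_total / days, bonferroni_total / days

naive_avg, bonferroni_avg = average_false_positives(M=1000, alpha=0.05, days=200)
print(f"p <= alpha:     about {naive_avg:.1f} false positives per day")
print(f"p <= alpha / M: about {bonferroni_avg:.2f} false positives per day")
```

The first average should land near $M\alpha = 50$, and the second near $\alpha = 0.05$, since the Bonferroni threshold admits each null with probability $\alpha/M$.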

SLIDE 21

Problem with Bonferroni

Bring up your idea only if $p \le \alpha_M = \alpha/M$

◮ More data scientists $\Rightarrow$ Less sensitive
◮ $\alpha$ false positives per day $\Rightarrow$ Does not scale with $M$

SLIDE 23

What do we want to achieve?

SLIDE 24

FDR (Benjamini, Hochberg, 1995)

◮ $M$ hypotheses
◮ $D \equiv$ Total number of discoveries (positives)
◮ $FD \equiv$ Number of false discoveries

$$ \mathrm{FDR} = E\left\{ \frac{FD}{\max(D, 1)} \right\} $$

Interpretation: $\mathrm{FDR} \le 0.1 \Rightarrow$ At most 10% of the discoveries are false, in expectation.
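The quantity inside the expectation is straightforward to compute once the ground truth is known. A minimal sketch follows, with hypothetical toy data and made-up function names; the expectation is approximated by averaging over repeated experiments.

```python
def false_discovery_proportion(is_null, rejected):
    """FD / max(D, 1) for one experiment.

    is_null[i]  -- True when hypothesis i is a true null (ground truth)
    rejected[i] -- True when the test declared hypothesis i a discovery
    """
    if len(is_null) != len(rejected):
        raise ValueError("inputs must have the same length")
    discoveries = sum(1 for r in rejected if r)
    false_discoveries = sum(1 for null, r in zip(is_null, rejected) if null and r)
    return false_discoveries / max(discoveries, 1)

def estimate_fdr(experiments):
    """Average the proportion over repeated experiments to approximate E{...}."""
    proportions = [false_discovery_proportion(n, r) for n, r in experiments]
    return sum(proportions) / len(proportions)

# Hypothetical toy data: two experiments with five hypotheses each.
experiments = [
    # three discoveries, one of them a true null
    ([True, True, False, False, True], [True, False, True, True, False]),
    # no discoveries at all, so the proportion is 0 / max(0, 1) = 0
    ([True, True, True, False, False], [False, False, False, False, False]),
]
print(estimate_fdr(experiments))
```

The $\max(D, 1)$ in the denominator is what makes the second experiment well defined: with no discoveries the proportion is taken to be zero.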

SLIDE 26

Controlling FDR

SLIDE 27

Setting

Null hypotheses: $H_{0,1}, H_{0,2}, \dots, H_{0,M}$

p-values: $p_1, p_2, \dots, p_M$

Ground truth: $\theta_1, \theta_2, \dots, \theta_M$   [$H_{0,i}: \theta_i = 0$]

Test output ($p = (p_i)_{1 \le i \le M}$): $T_1(p), T_2(p), \dots, T_M(p) \in \{0, 1\}$

$$ \theta_i = 0 \;\Rightarrow\; p_i \sim \mathrm{Uniform}([0, 1]) $$

SLIDE 29

Benjamini-Hochberg procedure

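◮ Order the p-values

$$ p_{(1)} \le p_{(2)} \le \dots \le p_{(M)} $$

◮ Set threshold

$$ I = \max\left\{ i \in [M] : p_{(i)} \le \frac{i\alpha}{M} \right\} $$

◮ Reject at level $p_{(I)}$:

$$ T_\ell(p) = \begin{cases} 1 & \text{if } p_\ell \le p_{(I)}, \\ 0 & \text{otherwise.} \end{cases} $$

Theorem (Benjamini, Hochberg, 1995)

If the p-values are independent, and BH is used, then $\mathrm{FDR} \le \alpha$.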
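The three steps of the procedure translate almost line by line into code. The sketch below is one possible rendering, not a reference implementation: the p-values are hypothetical, and the convention of rejecting nothing when no index satisfies the threshold condition is an assumption added here, since the slide does not spell out that case.

```python
def benjamini_hochberg(p_values, alpha):
    """Return a list of 0/1 decisions T_l following the three steps above."""
    M = len(p_values)
    ordered = sorted(p_values)  # p_(1) <= p_(2) <= ... <= p_(M)
    # I = largest i (1-based) with p_(i) <= i * alpha / M; 0 if there is none.
    I = 0
    for i, p in enumerate(ordered, start=1):
        if p <= i * alpha / M:
            I = i
    if I == 0:
        # Assumption: with no qualifying index, nothing is rejected.
        return [0] * M
    level = ordered[I - 1]  # p_(I)
    return [1 if p <= level else 0 for p in p_values]

# Hypothetical p-values for M = 8 hypotheses.
p_values = [0.001, 0.8, 0.012, 0.04, 0.3, 0.02, 0.6, 0.03]
decisions = benjamini_hochberg(p_values, alpha=0.1)
print(decisions)  # [1, 0, 1, 1, 0, 1, 0, 1]
```

Note that $I$ is the largest qualifying index, not the first failure: for the p-values `[0.04, 0.045, 0.9]` at $\alpha = 0.1$, the smallest p-value misses its own threshold $\alpha/3$, yet both small p-values are rejected because the second one meets $2\alpha/3$.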

SLIDE 31

Interpretation

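◮ $M_0$ true nulls, $M_1 = M - M_0$ true non-nulls
◮ Reject $H_{0,i}$ if $p_i \le q$

$$ FD \approx M_0 q, \qquad D = J(q) \equiv \max\{ i : p_{(i)} < q \} $$

$$ \mathrm{FDR} \approx \mathrm{FDR}(q) \equiv \frac{M_0 q}{J(q)} \le \frac{M q}{J(q)} $$

For BH, $p_{(I)} \le I\alpha/M$ by the choice of $I$, so

$$ \mathrm{FDR}(p_{(I)}) \le \frac{M p_{(I)}}{I} \le \alpha $$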

SLIDE 34

Controlling Online FDR

SLIDE 35

Back to our company

BH policy: Collect $M$ p-values every day, and run BH

Problems:

◮ Centralized
◮ Controls end-of-day FDR, not end-of-year FDR

$\to$ Online FDR control

SLIDE 38

Setting

Null hypotheses: $H_{0,1}, H_{0,2}, \dots, H_{0,M}$

Sequence of p-values, one at each time: $p_1, p_2, p_3, \dots$

Ground truth: $\theta_1, \theta_2, \theta_3, \dots$   [$H_{0,i}: \theta_i = 0$]

Test output ($p_1^t = (p_1, \dots, p_t)$):

$$ T_1(p_1^1),\; T_2(p_1^2),\; T_3(p_1^3),\; \dots \in \{0, 1\} $$

[Foster, Stine, 2007]

SLIDE 39

Setting

Null hypotheses: $H_{0,1}, H_{0,2}, \dots, H_{0,M}$

Sequence of p-values, one at each time: $p_1, p_2, p_3, \dots$

Ground truth: $\theta_1, \theta_2, \theta_3, \dots$   [$H_{0,i}: \theta_i = 0$]

Test output ($p_1^t = (p_1, \dots, p_t)$):

$$ T_1(p_1),\; T_2(p_2, T_1),\; T_3(p_3, T_1, T_2),\; \dots \in \{0, 1\} $$

[Foster, Stine, 2007]

SLIDE 40

What do we want to control?

◮ $FD(n) \equiv$ False discoveries up to time $n$
◮ $D(n) \equiv$ Total number of discoveries up to time $n$

$$ \mathrm{FDR}(n) \equiv E\left\{ \frac{FD(n)}{\max(D(n), 1)} \right\} $$

Want $\mathrm{FDR}(n) \le \alpha$ for all $n$, $\theta$

SLIDE 42

Trivial approach (Bonferroni)

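◮ Choose $\beta_i \in [0, 1]$, $\sum_{i=1}^{\infty} \beta_i \le \alpha$
◮ Set

$$ T_i = \begin{cases} 1 & \text{if } p_i \le \beta_i, \\ 0 & \text{otherwise.} \end{cases} $$

Indeed

$$ \mathrm{FDR}(n) \le E\{FD(n)\} \le \sum_{i:\, \theta_i = 0} P(p_i \le \beta_i) = \sum_{i:\, \theta_i = 0} \beta_i \le \alpha $$

Very conservative!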
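A minimal sketch of this rule, assuming the particular level sequence $\beta_i = 6\alpha/(\pi^2 i^2)$. That choice is only one convenient example (it sums to exactly $\alpha$ because $\sum_i 1/i^2 = \pi^2/6$); the slide leaves the sequence open, and the stream of p-values is hypothetical.

```python
import math

def beta_sequence(alpha):
    """Yield beta_i = 6 * alpha / (pi^2 * i^2), which sums to alpha over i >= 1."""
    i = 1
    while True:
        yield 6.0 * alpha / (math.pi ** 2 * i ** 2)
        i += 1

def online_bonferroni(p_stream, alpha):
    """Decide each hypothesis on arrival: T_i = 1 if p_i <= beta_i, else 0."""
    decisions = []
    for p, beta in zip(p_stream, beta_sequence(alpha)):
        decisions.append(1 if p <= beta else 0)
    return decisions

# Hypothetical stream of p-values, seen one at a time.
stream = [0.0004, 0.2, 0.001, 0.5, 0.0001]
print(online_bonferroni(stream, alpha=0.05))  # [1, 0, 1, 0, 1]
```

The levels shrink quickly: by the fifth hypothesis the threshold is already about $0.0012$, which is the sense in which the rule is very conservative.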

SLIDE 45

A simple rule

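LORD (Levels based On Recent Discovery)

◮ Choose $\beta_i \in [0, 1]$, $\sum_{i=1}^{\infty} \beta_i \le \alpha$
◮ $\tau_i \equiv$ Time of the last discovery before $i$
◮ Set

$$ T_i = \begin{cases} 1 & \text{if } p_i \le \beta_{i - \tau_i}, \\ 0 & \text{otherwise.} \end{cases} $$

Each discovery resets everything.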
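The reset behaviour is easiest to see in code. The sketch below follows the rule as stated on the slide, under two assumptions that the slide does not fix: the level sequence is taken to be $\beta_k = 6\alpha/(\pi^2 k^2)$, and $\tau_i = 0$ until the first discovery. The stream of p-values is hypothetical.

```python
import math

def beta(k, alpha):
    """Assumed level sequence: beta_k = 6 * alpha / (pi^2 * k^2), summing to alpha."""
    return 6.0 * alpha / (math.pi ** 2 * k ** 2)

def lord(p_stream, alpha):
    """Apply T_i = 1 if p_i <= beta_{i - tau_i}, where tau_i is the last discovery time."""
    decisions = []
    tau = 0  # assumption: tau_i = 0 until the first discovery
    for i, p in enumerate(p_stream, start=1):
        level = beta(i - tau, alpha)
        if p <= level:
            decisions.append(1)
            tau = i  # a discovery resets the level sequence to beta_1
        else:
            decisions.append(0)
    return decisions

# Hypothetical stream of p-values, seen one at a time.
stream = [0.2, 0.005, 0.02, 0.5, 0.006]
print(lord(stream, alpha=0.05))  # [0, 1, 1, 0, 1]
```

In this example the third and fifth hypotheses are rejected only because of the resets: with fixed levels $\beta_3 \approx 0.0034$ and $\beta_5 \approx 0.0012$ and no reset, neither $0.02$ nor $0.006$ would pass.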

SLIDE 47

A theorem

Theorem (Javanmard, Montanari, 2015)

If the null p-values are independent, then LORD satisfies

$$ \sup_n \mathrm{FDR}(n) \le \alpha . $$

SLIDE 48

Remarks

◮ Foster, Stine 2007:
    ◮ Introduced model
    ◮ Introduced alpha investing rules
    ◮ Proved they control mFDR (see next)
◮ Last theorem applies to generalized alpha investing
◮ LORD uses very little information on the past!

SLIDE 51

FDR vs mFDR

$$ \mathrm{mFDR}_\eta(n) = \frac{E_\theta\{FD(n)\}}{E_\theta\{D(n)\} + \eta} $$

mFDR control $\not\Rightarrow$ FDR control
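A toy calculation shows how far apart the two quantities can be. The scenario below is entirely hypothetical and is not taken from the talk: a rule that, with equal probability, either makes a single discovery that is false or makes 99 discoveries that are all true.

```python
def fdr_and_mfdr(outcomes, eta):
    """outcomes: list of (probability, false_discoveries, discoveries) triples."""
    # FDR averages the ratio; mFDR takes the ratio of the averages.
    fdr = sum(prob * fd / max(d, 1) for prob, fd, d in outcomes)
    expected_fd = sum(prob * fd for prob, fd, _ in outcomes)
    expected_d = sum(prob * d for prob, _, d in outcomes)
    return fdr, expected_fd / (expected_d + eta)

# Hypothetical rule with two equally likely outcomes:
#   one discovery, and it is false; or 99 discoveries, all true.
outcomes = [(0.5, 1, 1), (0.5, 0, 99)]
fdr, mfdr = fdr_and_mfdr(outcomes, eta=1.0)
print(f"FDR = {fdr:.3f}, mFDR = {mfdr:.4f}")  # FDR = 0.500, mFDR = 0.0098
```

Here $\mathrm{mFDR}_1 = 0.5 / 51$ is below one percent while the FDR is one half, because the ratio of expectations is dominated by the outcome with many discoveries.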

SLIDE 52

Example

[Figure: FDR and mFDR as functions of the threshold $t$]

Data

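◮ $Z_1, \dots, Z_{n_0} \sim_{iid} N(0, 1)$, $(Z_{n_0+1}, \dots, Z_n) \sim N(\theta_* \mathbf{1}, \rho \mathbf{1}\mathbf{1}^T + \rho I)$
◮ $n = 3000$, $n_0 = 2700$, $\theta_* = 2$, $\rho = 0.9$

Rule

$$ T_i = \begin{cases} 1 & \text{if } |Z_i| \ge t, \\ 0 & \text{otherwise.} \end{cases} $$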

SLIDE 53

Statistical power?

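Two-groups model

$$ \theta_i \sim_{iid} \mathrm{Bernoulli}(\pi), \qquad P_{\theta_i}(p_i \le x) = \begin{cases} F(x) = x & \text{if } \theta_i = 0, \\ G(x) & \text{otherwise.} \end{cases} $$

‘Discoveries should keep coming’

◮ A good rule should have $D(n) = \Theta(n)$.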
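To make the model concrete, here is a small sampler. The null branch follows the definition directly; the alternative distribution $G$ is not specified on the slide, so the sketch assumes one particular choice for illustration: the one-sided p-value of a Gaussian observation with mean `mu` and unit variance. The parameter values and the seed are arbitrary.

```python
import math
import random

def sample_two_groups(n, pi, mu, seed=0):
    """Draw (theta_i, p_i) pairs: theta_i ~ Bernoulli(pi), p_i ~ F or G."""
    rng = random.Random(seed)
    thetas, p_values = [], []
    for _ in range(n):
        theta = 1 if rng.random() < pi else 0
        if theta == 0:
            p = rng.random()  # null: F(x) = x, i.e. Uniform([0, 1])
        else:
            # Assumed alternative G: one-sided p-value of Z ~ N(mu, 1).
            z = rng.gauss(mu, 1.0)
            p = 0.5 * math.erfc(z / math.sqrt(2.0))
        thetas.append(theta)
        p_values.append(p)
    return thetas, p_values

thetas, p_values = sample_two_groups(n=20000, pi=0.1, mu=3.0)
null_p = [p for t, p in zip(thetas, p_values) if t == 0]
alt_p = [p for t, p in zip(thetas, p_values) if t == 1]
print(f"fraction non-null: {len(alt_p) / len(thetas):.3f}")
print(f"mean null p-value: {sum(null_p) / len(null_p):.3f}")
print(f"share of non-null p-values below 0.05: "
      f"{sum(p <= 0.05 for p in alt_p) / len(alt_p):.3f}")
```

Because a constant fraction $\pi$ of the hypotheses is non-null, a rule with reasonable power on each of them would make a number of discoveries that grows linearly in $n$, which is the requirement $D(n) = \Theta(n)$.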

SLIDE 54

Two experiments

[Figure: two panels, horizontal axis $n$, one curve per $\pi \in \{0.01, 0.02, 0.04, 0.08, 0.16, 0.32\}$]

◮ Left: $\theta_i \sim_{iid} (1 - \pi)\delta_0 + \pi N(0, \sigma^2)$, $Z_i \sim N(0, \theta_i)$
◮ Right: $\theta_i$ re-ordered, decreasing $|\theta_i|$

SLIDE 55

A theorem

Theorem (Javanmard, Montanari, 2015)

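Assume the two-groups model and that LORD is used. Then, almost surely

$$ \lim_{n \to \infty} \frac{1}{n} D(n) \ge \mathcal{A}(G, \beta), \qquad \mathcal{A}(G, \beta) \equiv \left( \sum_{k=1}^{\infty} e^{-\sum_{\ell=1}^{k} G(\beta_\ell)} \right)^{-1} $$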

◮ $\mathcal{A}(G, \beta) > 0$ strictly if $G(\beta_\ell) > (1 + \varepsilon)/\ell$ for all $\ell$ large enough.
◮ Sufficient: $G(x) \approx G_0 x^{1+\delta}$ as $x \to 0$.

SLIDE 56

Comparison under the Gaussian two-groups model

[Figure, left panel: FDR versus $\pi$ for BH, LOND, LORD, Bonferroni, and alpha-investing]

[Figure, right panel: relative power to BH versus $\pi$ for LOND, LORD, Bonferroni, and alpha-investing]

$TD(n) =$ True discoveries

$$ \mathrm{RelativePower}(n) \equiv E\left\{ \frac{TD(n)}{\max(TD^{BH}(n), 1)} \right\} $$

◮ $n = 1000$, $\sigma^2 = 2 \log n$

SLIDE 57

Conclusion

SLIDE 58

What if I am not CTO of a big-data company?

SLIDE 59

Take the “company” as a metaphor for science

SLIDE 60

Conclusion

◮ FDR control is fundamental for reasoning about data
◮ Online FDR is likely more realistic

Thanks!
