The Detection of Defective Members of Large Populations
November 21, 2019
This is me
PhD Student at Stanford, ex-engineer
Outline
The paper
What makes the paper work?
How its ideas can be reused
Group Testing
The setting is World War II…
[Figure: a row of people, one of them sick; a pooled test on a group that contains the sick person reads "Sick :(".]
Don't need individual tests
[Figure: one pooled test reads "Sick :(" and another reads "Ok".] We know this person is sick
Need to carefully design tests
[Figure: two different people would produce the same outcomes, "Sick :(" and "Ok".] We can't distinguish these two
Group Testing Problem
We have n items, at most s of which are "sick."
Definition: A test returns whether a subset of items includes any sick items or not.
Problem: Construct a set of tests which can identify a worst-case set of at most s sick items.
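To make the definition concrete, here is a minimal sketch of a pooled test. The names `run_test` and `run_design` and the 8-item example are illustrative assumptions, not part of the talk. It also shows why the pools need care: two items that sit in exactly the same pools give identical outcomes.

```python
def run_test(pool, sick):
    """A test is positive ("Sick") iff the pool contains at least one sick item."""
    return any(item in sick for item in pool)

def run_design(tests, sick):
    """Run every pooled test and return the list of outcomes."""
    return [run_test(pool, sick) for pool in tests]

# Hypothetical example: 8 items, two pooled tests.
tests = [{0, 1, 2, 3}, {4, 5, 6, 7}]
outcomes_a = run_design(tests, sick={2})
outcomes_b = run_design(tests, sick={3})

# Items 2 and 3 are in exactly the same pools, so the outcomes cannot tell them apart.
indistinguishable = outcomes_a == outcomes_b
```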
A better design
If every column is unique, we win
[Figure: 7 people and 3 pooled tests laid out as a 0/1 matrix, one column per person. With one sick person, the test outcomes (for example Ok, Sick, Ok) match that person's column.]
What we just saw
If there is one sick person, we can find them non-adaptively with log n tests!
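One way to get unique columns with about log n tests is to use the binary digits of each person's index. This is a sketch under that assumption; the function names are hypothetical, and people are numbered from 1 so that an all-"Ok" outcome means nobody is sick.

```python
def binary_design(n):
    """Test j pools every person whose index (1..n) has bit j set."""
    num_tests = n.bit_length()  # enough bits to write every index up to n
    return [{i for i in range(1, n + 1) if (i >> j) & 1} for j in range(num_tests)]

def decode_single(outcomes):
    """Read the positive tests as the binary digits of the sick person's index."""
    return sum(1 << j for j, positive in enumerate(outcomes) if positive)

n = 7
tests = binary_design(n)          # 3 tests for 7 people
sick_person = 5
outcomes = [sick_person in pool for pool in tests]
found = decode_single(outcomes)   # recovers the index of the sick person
```

The tests are fixed in advance and never depend on earlier outcomes, which is what "non-adaptively" refers to.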
Dorfman's Construction
This seems hard, so let's just do something totally random
Will show this works with decent probability and $O(s^2 \log n)$ tests
Why are $s^2 \log n$ tests cool?
[Figure: plot of the number of tests, vertical axis ticks from 20 to 100.] Way fewer tests!
Dorfman's construction
[Figure: a row of people; each test is built by going through them one by one.] Include with probability 1/s
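The random design can be sketched in a few lines: every person joins every test independently with probability 1/s. The function name, the seed, and the example sizes are assumptions made for illustration.

```python
import random

def random_design(n, s, num_tests, seed=0):
    """Build num_tests pools; each of the n people joins each pool with probability 1/s."""
    rng = random.Random(seed)  # fixed seed only so the sketch is repeatable
    p = 1.0 / s
    return [{i for i in range(n) if rng.random() < p} for _ in range(num_tests)]

# Hypothetical sizes: 1000 people, at most 5 sick, 50 tests.
tests = random_design(n=1000, s=5, num_tests=50)
avg_pool_size = sum(len(pool) for pool in tests) / len(tests)  # about n/s on average
```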
First idea: finding healthy people
[Figure: three tests over a row of people, one of whom is sick, with outcomes Ok, Sick, Ok.]
These tests pass
These people cannot be sick!
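The rule "anyone in a passing test cannot be sick" translates directly into a decoder. This is a minimal sketch; the function name and the three hand-picked pools are illustrative assumptions.

```python
def clear_healthy(n, tests, outcomes):
    """Return the people who were never in an "Ok" test; everyone else is proven healthy."""
    cleared = set()
    for pool, positive in zip(tests, outcomes):
        if not positive:       # the test passed, so nobody in this pool is sick
            cleared |= pool
    return set(range(n)) - cleared

# Hypothetical example: 7 people, person 3 is sick, outcomes are Ok, Sick, Ok.
tests = [{0, 1, 2}, {2, 3, 4}, {4, 5, 6}]
sick = {3}
outcomes = [bool(pool & sick) for pool in tests]
suspects = clear_healthy(7, tests, outcomes)
```

The decoder never clears a sick person, because a pool containing a sick person cannot pass. It finds exactly the sick set when every healthy person appears in at least one passing test.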
First idea: finding healthy people
For each set of sick people, need to be able to prove each other person is healthy
[Figure: the sick people should not be in the test; the healthy person we want to clear should be in the test.]
What is the probability this happens?
Each sick person is in the test w/ p. 1/s, so
$P(\text{none in test}) = (1 - 1/s)^s \approx e^{-s/s} \approx 1/3$
Idea: not too many sick people, so pretty good probability of missing 'em all
Need this person in test
$P(\text{healthy person in test}) = 1/s$
What is the probability the test works?
$P(\text{none in test and healthy person in test}) \approx 1/(3s)$
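A quick numeric check of this estimate, assuming exactly s sick people and independent inclusion with probability 1/s. The function name is hypothetical. The 1/3 on the slide is a rounding of 1/e ≈ 0.37, which $(1 - 1/s)^s$ approaches from below as s grows.

```python
import math

def p_test_works(s):
    """P(all s sick people miss the test) * P(the chosen healthy person is in it)."""
    p_none_sick = (1 - 1 / s) ** s
    p_healthy_in = 1 / s
    return p_none_sick * p_healthy_in

for s in (2, 10, 100):
    exact = p_test_works(s)
    slide_estimate = 1 / (3 * s)
    print(f"s={s}: exact={exact:.5f}, 1/(3s)={slide_estimate:.5f}")
```

For small s the exact value sits somewhat below 1/(3s), and for larger s slightly above it, so the slide's figure is best read as an order-of-magnitude estimate.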
Repeating tests
[Figure: the same healthy person and the same sick set, tested again and again.] Each test works with probability $1/(3s)$
What is the probability no test works?
$P(\text{no test works}) = (1 - 1/(3s))^T \approx e^{-T/(3s)} \approx n^{-2s}$ for $T = 6 s^2 \log n$
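The choice of T can be checked numerically. This sketch takes the per-test success probability 1/(3s) from the slides at face value, uses the natural logarithm, and rounds T up to an integer; the function names and the example n and s are assumptions.

```python
import math

def num_tests(n, s):
    """T = 6 s^2 log n, rounded up (natural log)."""
    return math.ceil(6 * s ** 2 * math.log(n))

def p_no_test_works(n, s):
    """Probability that all T independent tests fail, each working w.p. 1/(3s)."""
    T = num_tests(n, s)
    return (1 - 1 / (3 * s)) ** T

n, s = 1000, 3
T = num_tests(n, s)
failure = p_no_test_works(n, s)
target = float(n) ** (-2 * s)   # the n^{-2s} from the slide
```

Because $1 - x \le e^{-x}$, the failure probability is at most $e^{-T/(3s)} \le n^{-2s}$, so the computed `failure` should not exceed `target`.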
Union bound
We just saw $P(\text{no test works for this healthy person and this sick set}) \approx n^{-2s}$
But we haven't dealt with the other healthy people
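The union bound adds up this failure probability over every combination that has to be handled. The sketch below counts combinations as (sick set of size exactly s, one healthy person outside it), which is an assumption about how the counting is done; the function name is hypothetical, and the per-pair failure probability $n^{-2s}$ is taken from the slide.

```python
import math

def union_bound(n, s):
    """Upper bound on P(some healthy person cannot be cleared for some sick set)."""
    num_pairs = math.comb(n, s) * (n - s)   # choices of sick set and healthy person
    per_pair_failure = float(n) ** (-2 * s)
    return num_pairs * per_pair_failure

n, s = 1000, 3
bound = union_bound(n, s)
crude = float(n) ** (1 - s)   # since num_pairs <= n^(s+1)
```

Since there are at most $n^{s+1}$ such pairs, the total is at most $n^{1-s}$, which is small whenever s is at least 2.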
Dorfman's Construction
It works! With good probability! And very few tests!!
Why did this work?
Key components
1. Not too many things to find
Coin weighting, Compressed Sensing, Traitor Tracing, Streaming Algorithms, Johnson-Lindenstrauss, Mastermind, Network Tomography, Finding wifi users, IP Traceback, Error correcting codes, Multicast, Message Authentication
Joint work with Mary Wootters
A network
A network, failing
Finding failures
Tomography problem
We have a graph G=(V,E) with n edges, at most s of which are sick.
Definition: A graph-constrained test returns whether any edges in a connected subset of edges are sick or not.
Problem: Construct a set of graph-constrained tests which can identify any set of at most s sick edges.
This seems tricky
Which is sick?
Theorem [Harvey et al. 2007]: For the line graph on n nodes, about n/2 tests are required.
Proof: Each neighboring pair of edges must be separated by some test. Each test is a path and can only separate two pairs. There are about n pairs.
Key components?
1. Not too many things to find
Our informal result
If a graph is sufficiently well connected, we can find any set of s sick edges using $O(s^2 \log n)$ tests
Same as group testing
Algorithm
For 1 … $2 s^2 \log n$:
p ~ 1/s
as tests
Example: K4 (6 edges)
Example: K10 (45 edges)
Coin weighting, Compressed Sensing, Traitor Tracing, Streaming Algorithms, Johnson-Lindenstrauss, Mastermind, Network Tomography, Finding wifi users, IP Traceback, Error correcting codes, Multicast, Message Authentication
Key components
1. Not too many things to find
Thanks!