  1. Average Individual Fairness Aaron Roth Based on Joint Work with: Michael Kearns and Saeed Sharifi-Malvajerdi

  2-4. [Figure slides: GPA vs. SAT Score scatterplots for Population 1 and Population 2, building up the running admissions example.]

  5. Why was the classifier “unfair”? Question: Who was harmed? Possible answer: the qualified applicants mistakenly rejected. False negative rate: the rate at which this harm is done. Fairness: equal false negative rates across groups? [Chouldechova], [Hardt, Price, Srebro], [Kleinberg, Mullainathan, Raghavan]. Statistical fairness definitions: 1. Partition the world into groups (often according to a “protected attribute”). 2. Pick your favorite statistic of a classifier. 3. Ask that the statistic be (approximately) equalized across groups.
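To make the statistical recipe concrete, here is a minimal sketch of step 2's statistic, the per-group false negative rate. The labels, predictions, and group ids below are made-up toy data:

```python
import numpy as np

# Hypothetical toy data: true labels y, predictions yhat, one group id each.
y     = np.array([1, 1, 1, 1, 0, 0, 1, 1, 0, 1])
yhat  = np.array([1, 0, 1, 0, 0, 0, 1, 0, 0, 1])
group = np.array([0, 0, 0, 0, 0, 1, 1, 1, 1, 1])

def false_negative_rate(y, yhat, mask):
    """FN rate: fraction of truly positive group members predicted negative."""
    pos = mask & (y == 1)
    return (yhat[pos] == 0).mean()

# Step 3 asks whether this statistic is (approximately) equal across groups.
fnr = {g: false_negative_rate(y, yhat, group == g) for g in (0, 1)}
```

Here the two groups come out unequal (1/2 vs. 1/3), which is exactly what the statistical definitions flag.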

  6. But… • A classifier equalizes false negative rates. What does it promise you? • The “rate” in “false negative rate” assumes you are a uniformly random member of your population. • If you have reason to believe otherwise, it promises you nothing…

  7. For example • Protected subgroups: “Men”, “Women”, “Blue”, “Green”. Labels are independent of attributes. • The following allocation equalizes false negative rates across all four groups. [Figure: a 2×2 grid of Blue/Green by Male/Female applicants.]
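The 2×2 picture can be reproduced numerically. The sketch below constructs a hypothetical allocation in which every one of the four protected groups has a false negative rate of exactly 1/2, yet every blue man is rejected:

```python
import numpy as np

# Four equal-size cells: (blue, male), (blue, female), (green, male), (green, female).
# Everyone is truly qualified (y = 1); labels are independent of attributes.
sex   = np.array(["M"] * 2 + ["F"] * 2 + ["M"] * 2 + ["F"] * 2)
color = np.array(["B"] * 4 + ["G"] * 4)
y     = np.ones(8, dtype=int)

# Reject exactly the blue men and the green women.
yhat = np.where(((color == "B") & (sex == "M")) | ((color == "G") & (sex == "F")), 0, 1)

def fnr(mask):
    pos = mask & (y == 1)
    return (yhat[pos] == 0).mean()

# Each of the four protected groups has a false negative rate of exactly 1/2 ...
for mask in (sex == "M", sex == "F", color == "B", color == "G"):
    assert fnr(mask) == 0.5
# ... yet the blue-men subgroup has a false negative rate of 1.
assert fnr((color == "B") & (sex == "M")) == 1.0
```

The statistical guarantee holds at the level of the four marginal groups while promising nothing to the intersection.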

  8. Sometimes individuals are subject to more than one classification task…

  9. The Idea • Postulate a distribution over problems and individuals. • Ask for a mapping between problems and classifiers that equalizes false negative rates across every pair of individuals. • Redefine “rate”: averaged over the problem distribution. An individual definition of fairness.
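A sketch of the redefined rate, on synthetic data (the label and prediction matrices below are made up; in the formal setup the average is over the problem distribution, approximated here by an empirical sample of problems):

```python
import numpy as np

rng = np.random.default_rng(0)
n_people, n_problems = 5, 200

# Made-up empirical sample: labels[j, i] is person i's true label on problem j,
# preds[j, i] is the deployed classifier's prediction on problem j.
labels = rng.integers(0, 2, size=(n_problems, n_people))
preds  = rng.integers(0, 2, size=(n_problems, n_people))

def individual_fn_rate(i):
    """Person i's false-negative rate, averaged over the sampled problems."""
    pos = labels[:, i] == 1
    return (preds[pos, i] == 0).mean()

rates  = np.array([individual_fn_rate(i) for i in range(n_people)])
spread = rates.max() - rates.min()   # the quantity the definition asks to bound
```

Equalizing this per-person average is what makes the definition individual rather than group-based.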

  10. A Formalization • An unknown distribution P over individuals x_i ∈ X • An unknown distribution Q over problems f_j : X → {0,1}, f_j ∈ F • A hypothesis class H ⊆ {0,1}^X (note: the f_j’s are not necessarily in H) • Task: find a mapping from problems to hypotheses ψ ∈ (ΔH)^F • A new “problem” will be represented as a new labelling of the training set. • Finding the hypothesis corresponding to a new problem shouldn’t require re-solving old problems. (Allows online decision making.)

  11. What to Hope For (Computationally) • Machine learning is already computationally hard [KSS92, KS08, FGKP09, FGPW14, …], even for simple classes like halfspaces. • So we shouldn’t hope for an algorithm with worst-case guarantees… • But we might hope for an efficient reduction to unconstrained (weighted) learning problems: “oracle-efficient algorithms”. • This design methodology often results in practical algorithms.
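As a toy illustration of what the reduction assumes, here is a stand-in weighted learning oracle: an exhaustive weighted ERM over one-dimensional threshold classifiers (a hypothetical class chosen for simplicity, not the halfspaces named above):

```python
import numpy as np

def weighted_erm_oracle(weights, x, y):
    """Stand-in learning oracle: return the threshold classifier on a 1-D
    feature minimizing weighted classification error, by exhaustive search
    over the n + 1 distinct thresholds."""
    order = np.argsort(x)
    xs, ys, ws = x[order], y[order], weights[order]
    best_err, best_t = np.inf, None
    for t in np.concatenate(([xs[0] - 1.0], xs)):   # candidate thresholds
        pred = (xs > t).astype(int)
        err = (ws * (pred != ys)).sum()
        if err < best_err:
            best_err, best_t = err, t
    return best_t, best_err
```

The point of oracle efficiency is that the fair algorithm only ever calls such an unconstrained weighted learner as a black box.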

  12. Computing the Optimal Empirical Solution. Initialize λ_i^1 = 1/n for each i ∈ {1, …, n}. For t = 1 to T = O(log n / ε²): • Learner best responds: for each problem j, solve the weighted learning problem h_j^t = A(S_j^t) for S_j^t = {(λ_i^t / n, x_i, f_j(x_i))}_{i=1}^n. • Auditor updates weights: multiply λ_i^t by exp(η · (err(x_i, h^t, Q̂) − γ^t)), where γ^t is the average of err(x_i, h^t, Q̂) over individuals i, and renormalize to get the updated weights λ_i^{t+1}. Output the weights λ_i^t for each individual i and round t.
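The dynamics can be sketched in code. This is a simplified toy version, not the paper's algorithm: a threshold-classifier oracle stands in for A, the auditor is a one-sided multiplicative-weights player, the data is made up, and the constants η and T are illustrative:

```python
import numpy as np

def oracle(weights, x, labels):
    """Stand-in weighted learning oracle: best threshold classifier."""
    candidates = np.concatenate(([x.min() - 1.0], x))
    errs = [(weights * ((x > t).astype(int) != labels)).sum() for t in candidates]
    return candidates[int(np.argmin(errs))]

rng = np.random.default_rng(1)
n, m, T, eta = 8, 3, 50, 0.5
x = rng.normal(size=n)               # one feature per individual (made up)
F = rng.integers(0, 2, size=(m, n))  # m problems: each labels all n individuals

lam = np.full(n, 1.0 / n)            # auditor's weights over individuals
history = []
for t in range(T):
    # Learner best responds: one weighted learning problem per task.
    thresholds = [oracle(lam / n, x, F[j]) for j in range(m)]
    preds = np.array([(x > th).astype(int) for th in thresholds])
    # Each individual's false-negative rate, averaged over the m problems.
    pos = F == 1
    fn = (pos & (preds == 0)).sum(axis=0) / np.maximum(pos.sum(axis=0), 1)
    # Auditor upweights individuals treated worse than average, renormalizes.
    lam = lam * np.exp(eta * (fn - fn.mean()))
    lam = lam / lam.sum()
    history.append((lam.copy(), thresholds))
```

The auditor's weights force the learner to pay extra attention, in every problem, to the individuals currently suffering the highest average false negative rate.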

  13. Defining ψ • Parameterized by the sequence of dual variables λ^{1:T} = (λ^t)_{t=1}^T. ψ_{λ^{1:T}}(f): For t = 1 to T: • Solve the learning problem h^t = A(S^t) for S^t = {(λ_i^t / n, x_i, f(x_i))}_{i=1}^n. Output p_f ∈ ΔH, where p_f is uniform over {h^t}_{t=1}^T. (Consistent with the ERM solution.)
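A sketch of ψ under the same toy assumptions (threshold-classifier oracle standing in for A, made-up weight history): given a new labelling of the training individuals, replay the stored dual weights, retraining once per round, and output the uniform mixture:

```python
import numpy as np

def oracle(weights, x, labels):
    """Stand-in weighted learning oracle over threshold classifiers."""
    candidates = np.concatenate(([x.min() - 1.0], x))
    errs = [(weights * ((x > t).astype(int) != labels)).sum() for t in candidates]
    return candidates[int(np.argmin(errs))]

def psi(weight_history, x, f_new):
    """Map a new problem (a fresh labelling f_new of the training individuals)
    to a list of T hypotheses, one per stored round of dual weights; the output
    classifier is the uniform mixture over them. No old problem is re-solved."""
    return [oracle(lam, x, f_new) for lam in weight_history]

# Hypothetical stored run: three rounds of auditor weights over 4 people.
x = np.array([0.0, 1.0, 2.0, 3.0])
weight_history = [np.full(4, 0.25),
                  np.array([0.4, 0.3, 0.2, 0.1]),
                  np.full(4, 0.25)]
f_new = np.array([0, 0, 1, 1])            # labels a brand-new problem assigns
mixture = psi(weight_history, x, f_new)   # one threshold per round
```

Because ψ only needs the stored weights, new problems can be handled online without touching earlier ones.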

  14. Computing the Optimal Empirical Solution. Theorem: After O(m · log n / ε²) calls to the learning oracle, the algorithm returns a solution p ∈ (ΔH)^m that achieves empirical error at most OPT(α; P̂, Q̂) + ε and satisfies, for every i, i′ ∈ {1, …, n}: |FN(x_i, p, Q̂) − FN(x_{i′}, p, Q̂)| ≤ α + ε.

  15. Generalization: Two Directions. [Figure: a table of individuals x_1, …, x_n (a sample S from P) against problems f_1, …, f_m (a sample S′ from Q); generalization is needed over both new individuals and new problems.]

  16. Generalization. Theorem: Assuming 1) m ≥ poly(log n, 1/ε, log(1/δ)), and 2) n ≥ poly(m, VCDIM(H), 1/ε, 1/β, log(1/δ)), the algorithm returns a solution ψ that with probability 1 − δ achieves error at most OPT(α; P, Q) + ε, and is such that with probability 1 − β over x, x′ ∼ P: |FN(x, ψ, Q) − FN(x′, ψ, Q)| ≤ α + ε.

  17. Does it work? • It is important to experimentally verify “oracle-efficient” algorithms, since it is possible to abuse the model, e.g. by using the learning oracle as an arbitrary NP oracle. • A brief “sanity check” experiment: • Dataset: Communities and Crime. • The first 50 features are designated as “problems” (i.e., labels to predict). • The remaining features are treated as features for learning.
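The setup can be sketched as follows, on synthetic data standing in for the Communities and Crime table (the column counts and the median binarization of the real-valued columns are assumptions for illustration):

```python
import numpy as np

# Synthetic stand-in for the Communities and Crime table.
rng = np.random.default_rng(0)
data = rng.normal(size=(200, 80))    # 200 communities, 80 numeric columns

# First 50 columns become 50 prediction "problems": binarize each at its median.
problems = (data[:, :50] > np.median(data[:, :50], axis=0)).astype(int)
features = data[:, 50:]              # the remaining columns are the features
```

Each of the 50 binary columns then plays the role of one classification task f_j over the same set of individuals.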

  18. Takeaways • We should think carefully about what definitions of “fairness” really promise to individuals. • Making promises to individuals is sometimes possible, even without making heroic assumptions. • Once we fix a definition, there is often an interesting algorithm design problem. • Once we have an algorithm, we have the tools to explore the inevitable tradeoffs.

  19. Thanks! Paper: “Average Individual Fairness: Algorithms, Generalization and Experiments”, Michael Kearns, Aaron Roth, Saeed Sharifi-Malvajerdi. Shameless book plug: The Ethical Algorithm, Michael Kearns and Aaron Roth.
