

SLIDE 1

Revealing Algorithmic Rankers

Julia Stoyanovich (Drexel University), Gerome Miklau (UMass Amherst), Ellen P. Goodman (Rutgers Law School)

Schloß Dagstuhl, July 17-22, 2016

SLIDE 2

Algorithmic rankers

Input: a database of items (colleges, cars, individuals, …)

Score-based ranker: compute the score of each item using a known formula, e.g., a monotone aggregation of its attributes, then sort the items on score.

Output: a permutation of the items (complete or top-k).
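Such a ranker is simple to state in code. A minimal sketch in Python; the items, attribute names (faculty, avg_cnt, gre), and weights are invented for illustration:

```python
# Minimal sketch of a score-based ranker (illustrative attribute
# names and weights; not from any real ranking).

def rank(items, weights, k=None):
    """Score each item by a weighted sum of its attributes,
    then sort descending. Returns the full permutation, or top-k."""
    def score(item):
        return sum(weights[attr] * item[attr] for attr in weights)
    ranked = sorted(items, key=score, reverse=True)
    return ranked if k is None else ranked[:k]

items = [
    {"name": "A", "faculty": 50, "avg_cnt": 3.1, "gre": 160},
    {"name": "B", "faculty": 20, "avg_cnt": 4.5, "gre": 165},
]
weights = {"faculty": 0.2, "avg_cnt": 0.3, "gre": 0.5}
print([item["name"] for item in rank(items, weights)])
```

Everything about this ranker is visible in the code, which is exactly the sense in which such transparency is only syntactic.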


Do we have transparency? Only syntactically, not actually!

SLIDE 3

Opacity in algorithmic rankers

Reason 1: scores are absolute, rankings are relative. Is 3 a good score? What about 10? 15?


[Figure: distribution of scores ("Average Count"); axis ranges 0-20 and 1-46]
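One way to make an absolute score interpretable is to place it within the observed score distribution. A minimal sketch, with an invented score list:

```python
# A raw score like 3 or 10 is meaningless on its own; against the
# score distribution it can be read as a percentile (invented data).

def percentile(score, all_scores):
    """Fraction of scores at or below the given score."""
    return 100.0 * sum(s <= score for s in all_scores) / len(all_scores)

scores = [1, 2, 2, 3, 3, 3, 4, 6, 10, 15]
for s in (3, 10, 15):
    print(f"score {s} -> {percentile(s, scores):.0f}th percentile")
```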

SLIDE 4

Opacity in algorithmic rankers

Reason 2: a ranking may be unstable


(a) many tied or nearly-tied items

SLIDE 5

Opacity in algorithmic rankers

Reason 2: a ranking may be unstable


(b) small changes in weights can trigger significant re-shuffling
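This instability can be probed empirically: perturb the weights slightly, re-rank, and count how many pairs of items change relative order (the Kendall tau distance). A sketch with random data and an arbitrary perturbation:

```python
# Sketch: perturb the weights slightly and count how many item pairs
# swap order, a simple stability probe (random data for illustration).
import itertools, random

random.seed(0)
items = [[random.random() for _ in range(3)] for _ in range(100)]

def ranking(weights):
    scores = [sum(w * x for w, x in zip(weights, item)) for item in items]
    return sorted(range(len(items)), key=lambda i: -scores[i])

def kendall_distance(r1, r2):
    """Number of item pairs ordered differently by the two rankings."""
    pos1 = {item: p for p, item in enumerate(r1)}
    pos2 = {item: p for p, item in enumerate(r2)}
    return sum((pos1[a] < pos1[b]) != (pos2[a] < pos2[b])
               for a, b in itertools.combinations(r1, 2))

base = ranking([0.2, 0.3, 0.5])
perturbed = ranking([0.2, 0.35, 0.45])  # a small change in two weights
print(kendall_distance(base, perturbed), "of",
      100 * 99 // 2, "pairs re-ordered")
```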

SLIDE 6

Opacity in algorithmic rankers

Reason 3: the weight of a scoring attribute does not fully determine its influence on the outcome.


Given a score function:

score = 0.2 * faculty + 0.3 * avg_cnt + 0.5 * gre
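Here gre carries the largest weight, yet if faculty varies over a much wider range than the other attributes, it can still dominate the outcome. A sketch with made-up attribute ranges:

```python
# Sketch: the nominal weight of an attribute need not match its actual
# influence, because attributes differ in scale and spread (made-up data).
import random, statistics

random.seed(1)
n = 200
data = {
    "faculty": [random.uniform(10, 500) for _ in range(n)],   # wide spread
    "avg_cnt": [random.uniform(1, 5) for _ in range(n)],      # narrow
    "gre":     [random.uniform(150, 170) for _ in range(n)],  # narrow
}
weights = {"faculty": 0.2, "avg_cnt": 0.3, "gre": 0.5}

# Spread of each weighted term: a rough proxy for influence on the score.
for attr, w in weights.items():
    spread = statistics.stdev(w * x for x in data[attr])
    print(f"{attr}: weight {w}, stdev of contribution {spread:.2f}")
```

On this data the faculty term has by far the largest spread, so it decides most pairwise comparisons despite its nominal weight of 0.2.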

SLIDE 7

Rankings are not benign!


"Rankings are not benign. They enshrine very particular ideologies, and, at a time when American higher education is facing a crisis of accessibility and affordability, we have adopted a de facto standard of college quality that is uninterested in both of those factors. And why? Because a group of magazine analysts in an office building in Washington, D.C., decided twenty years ago to value selectivity over efficacy, to use proxies that scarcely relate to what they're meant to be proxies for, and to pretend that they can compare a large, diverse, low-cost land-grant university in rural Pennsylvania with a small, expensive, private Jewish university on two campuses in Manhattan."

SLIDE 8

Harms of opacity

1. Due process / fairness. The subjects of the ranking cannot have confidence that their ranking is meaningful or correct, or that they have been treated like similarly situated subjects (procedural regularity).

2. Hidden normative commitments. What factors does the vendor encode in the scoring / ranking process (syntactically)? What are the actual effects of the scoring / ranking process? Is it stable? How was it validated?


SLIDE 9

Harms of opacity

3. Interpretability. Especially where ranking algorithms are performing a public function, political legitimacy requires that the public be able to interpret algorithmic outcomes in a meaningful way. Avoid algocracy: the rule by incontestable algorithms.

4. Meta-methodological assessment. Is a ranking / this ranking appropriate here? Can we use a process if it cannot be explained? Probably yes for recommending movies; probably not for college admissions.


SLIDE 10

The possibility of knowing

  • We need transparency!
  • OK, what is transparency anyway?

Zero-knowledge proofs, audits, reverse engineering… but what about explanation?


SLIDE 11

Transparency stakeholders

• Entity being ranked, so they can assess their rank and know how it was produced

• User consuming ranked results, who may or may not themselves be ranked

• Vendor, who may seek greater insight into the process as it is being developed, or could be asked to justify their ranking

• Competitors of the vendor

• Auditors and regulators, so they can assess properties of the ranking


SLIDE 12

A nutritional label!


https://images.heb.com/is/image/HEBGrocery/article/nutrition-facts-label.jpg

SLIDE 13

Ranking facts

A mock "nutritional label" for a ranking:

• Your outcome: rank 75; increase edu to MBA to advance (~50 ranks)
• Top-k: edu: MBA (95%); race: Caucasian (100%)
• Impact: age (80%), edu (20%)
• Ingredients: edu: MBA (10%), BS (85%), PhD (2%), Other (3%); race: Caucasian (70%), Asian (20%), Black (10%)

[Label widgets: sliders comparing "you" against the median: income (25K / 50K / 150K), age (18 / 25 / 40), age median (39 / 40)]
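The panels of such a label are straightforward to compute by contrasting the top-k with the dataset as a whole. A sketch of the "Top-k" vs. "Ingredients" computation, with hypothetical items and attributes:

```python
# Sketch: the "Top-k" and "Ingredients" panels of such a label compare
# attribute-value frequencies in the top-k against the whole dataset
# (hypothetical items and attributes).
from collections import Counter

def composition(items, attr):
    """Percentage breakdown of values of one attribute."""
    counts = Counter(item[attr] for item in items)
    total = len(items)
    return {v: round(100 * c / total) for v, c in counts.items()}

ranked = [  # already sorted by score, best first
    {"edu": "MBA", "race": "Caucasian"},
    {"edu": "MBA", "race": "Caucasian"},
    {"edu": "BS",  "race": "Asian"},
    {"edu": "BS",  "race": "Black"},
]
k = 2
for attr in ("edu", "race"):
    print(attr, "| top-k:", composition(ranked[:k], attr),
          "| overall:", composition(ranked, attr))
```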

SLIDE 14

Transparency questions

• How important is a particular attribute (or set of attributes) to…
  • the overall ranking?
  • an individual's ranking?
  • an individual's inclusion in the top-k?
• Is the ranking discriminatory w.r.t. a protected group of individuals?
• Why is individual A ranked higher than individual B?
• How stable is the ranking, e.g., how sensitive is the output to small changes in the scoring function or in item attributes?
• What-if analysis (what if Alice can change an attribute value…); a sketch follows below
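The what-if question has a direct answer whenever the scoring function is available: change the attribute value, re-score, and compare ranks. A sketch, with a hypothetical scoring function and population:

```python
# Sketch of what-if analysis: re-rank after changing one attribute of
# one item and report the rank change (hypothetical score function).

def score(item):
    return 2 * item["edu_level"] + item["experience"]

def rank_of(name, items):
    ordered = sorted(items, key=score, reverse=True)
    return next(i for i, it in enumerate(ordered) if it["name"] == name) + 1

items = [{"name": f"p{i}", "edu_level": i % 3, "experience": i % 7}
         for i in range(20)]
alice = {"name": "Alice", "edu_level": 1, "experience": 3}
items.append(alice)

before = rank_of("Alice", items)
alice["edu_level"] = 2       # what if Alice completes an MBA?
after = rank_of("Alice", items)
print(f"Alice: rank {before} -> rank {after}")
```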


SLIDE 15

Explaining ranked output

• Input:
  • feature vectors X = (x1 … xn) describing the entities
  • y = scores (or ranks) for each entity
• Output: an explanation of the ranking in terms of the features
• Approach: learn a scoring function f' from X and y, consistent with the observed data, that explains the ranking (a minimal sketch follows)
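A minimal version of this approach, assuming f' is a linear function fit by least squares; the features and scores below are synthetic:

```python
# Sketch: learn an explanatory linear scoring function f' from observed
# features X and scores y via least squares (synthetic data here).
import numpy as np

rng = np.random.default_rng(42)
n = 100
X = rng.uniform(size=(n, 3))            # 3 descriptive features per entity
true_w = np.array([1.0, 0.05, 0.005])   # hidden weights, for illustration
y = X @ true_w + rng.normal(scale=0.01, size=n)  # observed scores

# Fit f'(x) = w . x; the recovered weights are the explanation.
w, *_ = np.linalg.lstsq(X, y, rcond=None)
for name, wi in zip(["feature_1", "feature_2", "feature_3"], w):
    print(f"{name}: weight {wi:.4f}")
```

The fitted weights, read side by side with the feature names, are the explanation of the ranking.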



SLIDE 17

Example: explaining csrankings.org

• Input:
  • X = descriptive attributes from US News and NRC
  • y = scores from csrankings.org
• Compute f'

Features:
• Number of faculty
• Program size quartile
• Student-faculty ratio
• Avg GRE scores
• Admission rate
• 6-year graduation rate
• Total university faculty
• Publication

SLIDE 18

Example: explaining csrankings.org

• Result:
  • X = descriptive attributes from US News and NRC
  • y = scores from csrankings.org
  • Computed f':

Weight      Feature
1.0239      Number of faculty
0.0528      Program size quartile
0.005       Student-faculty ratio
0.0038      Avg GRE scores
0.0018      Admission rate
0.0018      6-year graduation rate
0.000005    Total university faculty


Consequence: csrankings.org ranks largely by number of faculty, favoring large departments over smaller ones.

SLIDE 19

Conclusions

• Rankings are ubiquitous and opaque
• Transparency is crucial
• Syntactic transparency is insufficient; we need interpretability / explanations
• Different explanations for different stakeholders
