
Feb 11, 2024


SLIDE 1

Algorithmic Bias

Machine Learning

• An area of AI that studies how to get computers to learn from experience (e.g. data)
• Identify patterns from a training dataset
• Then generalize from these patterns and apply them to future data (that is different). This future data is called a test dataset
• Supervised learning

Features -> Classifier -> Class Label
Features: traits of a data instance (e.g. keywords in your email) that are informative as to the classification

Class Label: the classification (e.g. Personal or School for email sorting)

Produce the classifier by training on the training data
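
The Features -> Classifier -> Class Label pipeline can be sketched in a few lines. This is only a minimal illustration, not a method from the slides: the toy emails, the function names, and the choice of a small Naive Bayes word-count classifier are all assumptions made for the example.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy training data: (email text, class label).
TRAIN = [
    ("homework due friday for algebra class", "School"),
    ("exam schedule and lecture notes posted", "School"),
    ("class project group meeting after lecture", "School"),
    ("dinner with family on friday night", "Personal"),
    ("happy birthday see you at the party", "Personal"),
    ("family vacation photos from the beach", "Personal"),
]


def extract_features(text):
    """Features: the keywords (lowercased words) found in an email."""
    return text.lower().split()


def train(examples):
    """Produce the classifier from training data: word counts per class label."""
    word_counts = defaultdict(Counter)
    label_counts = Counter()
    for text, label in examples:
        label_counts[label] += 1
        word_counts[label].update(extract_features(text))
    vocab = {w for counts in word_counts.values() for w in counts}
    return word_counts, label_counts, vocab


def classify(model, text):
    """Return the class label with the highest Naive Bayes log-score."""
    word_counts, label_counts, vocab = model
    total = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label in label_counts:
        score = math.log(label_counts[label] / total)
        n_words = sum(word_counts[label].values())
        for word in extract_features(text):
            # Add-one (Laplace) smoothing so unseen words do not zero out a class.
            score += math.log((word_counts[label][word] + 1) / (n_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


model = train(TRAIN)
print(classify(model, "lecture notes for the exam"))   # School
print(classify(model, "birthday party with family"))   # Personal
```

The two emails passed to `classify` play the role of the test dataset: they were not in the training data, so the classifier has to generalize from the patterns it saw.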

Algorithmic Bias

• What is it?

Bias introduced to machine learning due to the training data

Garbage in / Garbage out: machine learning algorithms reflect societal bias when applied to biased data (bias in the form of discrimination, prejudice and unfairness)

Do machine learning algorithms respect protected variables?

These are characteristics that anti-discrimination laws protect in certain situations

E.g. the Fair Housing Act prohibits landlords from discriminating based on 7 protected classes:
• Race

SLIDE 2

• Gender
• Religion
• Disability
• Color
• National Origin
• Familial status

Can’t just ignore features that correspond to these protected variables and say your algorithm is not biased
• Due to confounding factors, e.g. zip code and race are closely correlated in many parts of the US
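
A small sketch of why dropping the protected variable is not enough. Everything here is hypothetical: the two groups, the two zip codes, the 90/10 residential split, and the biased historical approval rates are invented for illustration, and the "classifier" is just a per-zip majority vote.

```python
from collections import defaultdict


def make_history():
    """Hypothetical past decisions: (group, zip_code, approved).

    Assumptions: 90% of group A lives in zip 11111 and 90% of group B
    in zip 22222, and past approvals were biased (70% for A, 30% for B).
    """
    rows = []
    for i in range(100):
        zip_a = "11111" if i < 90 else "22222"
        zip_b = "22222" if i < 90 else "11111"
        rows.append(("A", zip_a, i % 10 < 7))
        rows.append(("B", zip_b, i % 10 < 3))
    return rows


def train_zip_only(rows):
    """Learn 'approve' per zip code by majority vote; the group is never used."""
    counts = defaultdict(lambda: [0, 0])  # zip -> [approved, total]
    for _group, zip_code, approved in rows:
        counts[zip_code][0] += int(approved)
        counts[zip_code][1] += 1
    return {z: approved / total >= 0.5 for z, (approved, total) in counts.items()}


def approval_rate_by_group(model, rows):
    """Share of each group the zip-only model would approve."""
    totals = defaultdict(lambda: [0, 0])  # group -> [approved, total]
    for group, zip_code, _ in rows:
        totals[group][0] += int(model[zip_code])
        totals[group][1] += 1
    return {g: approved / total for g, (approved, total) in totals.items()}


history = make_history()
model = train_zip_only(history)
rates = approval_rate_by_group(model, history)
print(model)  # zip 11111 is approved, zip 22222 is denied
print(rates)  # group A is approved far more often than group B
```

The model never sees the group, yet it reproduces (and here even sharpens) the bias in the historical labels, because zip code stands in for the group.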

What are causes of algorithmic bias?
Biased training data (e.g. biased class labels)
Inclusion of protected variables as features; inclusion of variables correlated with protected variables is also highly problematic

Downstream goals (e.g. business profitability) might conflict with avoiding discrimination

Misunderstanding / misuse of machine learning

• Machine learning applied to the wrong tasks

Domain adaptation: machine learning algorithm trained on data from one distribution but applied to test data from another distribution

Missing / corrupted data
Sampling selection bias
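
The domain adaptation item in the list above can be illustrated with a toy one-feature classifier. This is a sketch under invented assumptions: the numbers are hypothetical, and the test data is simply the training distribution shifted by a constant.

```python
def train_threshold(examples):
    """Learn a threshold halfway between the mean feature value of each class."""
    mean0 = sum(x for x, y in examples if y == 0) / sum(1 for _, y in examples if y == 0)
    mean1 = sum(x for x, y in examples if y == 1) / sum(1 for _, y in examples if y == 1)
    return (mean0 + mean1) / 2


def accuracy(threshold, examples):
    """Fraction of examples where 'x > threshold' matches the true label."""
    correct = sum(1 for x, y in examples if int(x > threshold) == y)
    return correct / len(examples)


# Hypothetical training data drawn from one distribution.
source = [(1, 0), (2, 0), (3, 0), (4, 0), (6, 1), (7, 1), (8, 1), (9, 1)]

# Hypothetical test data from another distribution: same labels, feature shifted by +3.
target = [(x + 3, y) for x, y in source]

threshold = train_threshold(source)
print(threshold)                    # 5.0
print(accuracy(threshold, source))  # 1.0
print(accuracy(threshold, target))  # 0.75
```

Nothing about the classifier changed between the two calls to `accuracy`; only the data did. The errors on the shifted data all fall on one class, which is how a distribution mismatch can turn into systematically worse outcomes for one group.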
How do you fix this problem?
Not sure if you can:

• A lot of these are societal problems

SLIDE 3

• Can you correct the bias without introducing bias of a different sort?

Understand the problem so that you can use the right machine learning algorithm
• Know when NOT to use a particular algorithm

Make systems that are auditable.
Start with lower-impact outcomes early on, especially when an algorithm is involved
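
One possible reading of "auditable" is that every decision is logged together with the group it affected, so that outcomes can be compared across groups afterwards. The sketch below assumes such a log exists; the log format, the function name, and the numbers are hypothetical, and the selection-rate ratio is only one of many measures an audit could report.

```python
from collections import defaultdict


def audit_selection_rates(decisions):
    """Summarize a decision log of (group, selected) pairs.

    Returns the selection rate per group and the ratio of the lowest
    rate to the highest rate (1.0 means all groups are selected equally).
    """
    counts = defaultdict(lambda: [0, 0])  # group -> [selected, total]
    for group, selected in decisions:
        counts[group][0] += int(selected)
        counts[group][1] += 1
    rates = {g: selected / total for g, (selected, total) in counts.items()}
    highest = max(rates.values())
    ratio = min(rates.values()) / highest if highest > 0 else 1.0
    return rates, ratio


# Hypothetical decision log: 8 of 10 selected in group A, 4 of 10 in group B.
DECISION_LOG = (
    [("A", True)] * 8 + [("A", False)] * 2
    + [("B", True)] * 4 + [("B", False)] * 6
)

rates, ratio = audit_selection_rates(DECISION_LOG)
print(rates)  # {'A': 0.8, 'B': 0.4}
print(ratio)  # 0.5
```

A low ratio does not prove discrimination by itself, but it flags where a human should look more closely.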

Difficult problems that make a solution hard
Limited to what data you actually have. What about the data you don’t have?

Definitions of fairness vary greatly – which one do you use?

Lack of social context: can’t transfer a machine