Ioannis Caragiannis University of Patras Joint work with George - - PowerPoint PPT Presentation


Oct 15, 2023


SLIDE 1

Ioannis Caragiannis University of Patras

Joint work with George Krimpas and Alexandros Voudouris

SLIDE 2

• massive: available to a large number of people (16-18 million students)
• online: through the internet/web
• open: no cost for the students
• courses: series of lectures on a subject

SLIDE 3

• www.edx.org
• www.coursera.org
• www.udacity.com
• > 100 employees each
• business model: verified certificates, head-hunting (connecting students to industry), specializations, corporate collaborations

SLIDE 4

• 400+ universities
• 2400+ courses
• 22 out of the top-25 US universities
• 3000+ instructors
• TAs, video assistants
• 13 languages (80% English, 8.5% Spanish, French, Chinese)
• subjects: humanities, computer science, business & management

SLIDE 5

SLIDE 6

• Daphne Koller, Andrew Ng (Coursera founders):
  “… courses in the humanities and social sciences - in which the material is more open to interpretation - have proven more complicated to translate into an online format, especially when it came to the assessment and grading of the students.”

SLIDE 7

• What? Should result in quantitative information
  - successfully completed her class, achieved a 9/10 (A+), ranked in the top 1% of her class of 100,000, etc.
• Why? Information in the verified certificate, important for employers (new revenue source)
• Who? Experts (graders, TAs) are costly
• A common solution: automatic grading (multiple choice questions)

SLIDE 8

• Highly unsatisfactory when evaluating the students’ ability to
  - prove a mathematical statement
  - express their critical thinking over an issue
  - demonstrate their creative writing skills
• In these cases, assessment and grading is a human computation task
• Alternative solution: peer grading
  - outsource the grading task to the students

SLIDE 9

• How does it work?
  - each student grades some of the other students’ assignments (as part of her own assignment)
• Allowing the students to grade using cardinal scores is risky:
  - they are not experienced in assessing their peers’ performance in absolute terms
  - they have strong incentives to assign low scores
• Solution: ordinal peer grading

SLIDE 10

• Cardinal peer grading
  - Piech, Huang, Chen, Do, Ng, & Koller (2013)
  - Kulkarni, Wei, Le, Chia, Papadopoulos, Cheng, Koller, & Klemmer (2013)
  - Walsh (2014)
  - de Alfaro & Shavlovsky (2014)
  - www.crowdgrader.org
• Ordinal peer grading
  - Raman & Joachims (2014)
  - Shah, Bradley, Parekh, Wainwright, & Ramchandran (2014)

SLIDE 11

• n students (exam papers)
• Distributing the exam papers: each student gets k << n exam papers to grade so that each exam paper is given to k students
• Grading: each student ranks the exam papers assigned to her
• Rank aggregation: compute a global ranking from the partial rankings
• Goal: to come up with a global ranking that is “as correct as possible”

SLIDE 12

• Similarities:
  - given a profile of rankings as input, compute a final full ranking
• Differences:
  - each student is simultaneously an alternative and a voter
  - voters do not have to rank all alternatives
  - the alternatives to be ranked are decided externally

SLIDE 13

• (n,k)-bundle graph: k-regular bipartite graph G=(U,V,E) with |U|=|V|=n
• U: exam papers (randomly assigned to nodes)
• V: graders
• Edge (u,v) with u in U and v in V indicates that exam paper u will be given to student v
• Warning! Nodes corresponding to a grader and her exam paper should not be connected
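One simple way to obtain such a graph is sketched below: place the students in a random cyclic order and give each grader the papers of the next k students. This is only an illustrative construction under the assumption that paper u belongs to student u; the function name `cyclic_bundle_graph` and the `seed` parameter are hypothetical and not taken from the slides.

```python
import random


def cyclic_bundle_graph(n, k, seed=None):
    """Return bundles, where bundles[v] lists the k papers given to grader v.

    Assumption: paper u was written by student u, so grader v must not get paper v.
    """
    if not 1 <= k < n:
        raise ValueError("need 1 <= k < n")
    # Random cyclic order of the students (papers are randomly assigned to nodes).
    order = list(range(n))
    random.Random(seed).shuffle(order)
    bundles = [None] * n
    for pos, grader in enumerate(order):
        # Offsets 1..k are never 0 modulo n, so nobody receives her own paper.
        bundles[grader] = [order[(pos + j) % n] for j in range(1, k + 1)]
    return bundles


bundles = cyclic_bundle_graph(n=7, k=3, seed=0)
for grader, bundle in enumerate(bundles):
    print(f"grader {grader} gets papers {bundle}")
```

Every grader receives k distinct papers and every paper reaches exactly k graders, so the result is k-regular on both sides.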

SLIDE 14

• The students participate in the exam and submit their papers
• Scenario I:
  - the instructor announces indicative solutions and grading instructions
  - the students use this info when grading
• Scenario II:
  - no info by the instructor
  - students’ grading performance is similar to their performance in the exam

SLIDE 15

• Basic assumption: there is a ground truth ranking of the exam papers
• Perfect grading: each grader ranks the k exam papers she gets consistently with the ground truth

SLIDE 16

• Quality measure: number of pairs of exam papers that are ordered in the global ranking as in the ground truth
  - ... or, equivalently, the total number of pairs minus the Kendall tau distance
  - (bad) example: a random permutation correctly recovers 50% of the pairwise relations on average
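The quality measure can be sketched in a few lines. The function names `correct_pairs` and `kendall_tau_distance` are illustrative choices, and rankings are assumed to be lists of paper identifiers ordered from best to worst.

```python
import random
from itertools import combinations


def correct_pairs(global_ranking, ground_truth):
    """Count the pairs of papers ordered the same way in both rankings."""
    pos_g = {paper: i for i, paper in enumerate(global_ranking)}
    pos_t = {paper: i for i, paper in enumerate(ground_truth)}
    return sum(
        1
        for a, b in combinations(ground_truth, 2)
        if (pos_g[a] < pos_g[b]) == (pos_t[a] < pos_t[b])
    )


def kendall_tau_distance(ranking_a, ranking_b):
    """Number of pairs on which the two rankings disagree."""
    n = len(ranking_a)
    return n * (n - 1) // 2 - correct_pairs(ranking_a, ranking_b)


# The (bad) baseline: average fraction recovered by random permutations.
rng = random.Random(1)
truth = list(range(20))
total_pairs = len(truth) * (len(truth) - 1) // 2
fractions = []
for _ in range(200):
    guess = truth[:]
    rng.shuffle(guess)
    fractions.append(correct_pairs(guess, truth) / total_pairs)
print(sum(fractions) / len(fractions))  # close to 0.5
```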

SLIDE 17

• Find the minimum-degree (n,k)-bundle graph that guarantees that the whole ground truth is always recovered if perfect grading is used
• k = Θ(n^{1/2})

[Figure: bundle graph connecting graders 1-7 to exam papers 1-7]

SLIDE 18

• Find the minimum-degree (n,k)-bundle graph that guarantees that the whole ground truth is always recovered if perfect grading is used
• Equivalently: find a minimum-degree diameter-3 bipartite graph
• k = Θ(n^{1/2})
• Miller and Siran (2013)

[Figure: bundle graph connecting graders 1-7 to exam papers 1-7]
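A rough sketch of why k has to grow like n^{1/2}. It assumes that full recovery requires every pair of exam papers to share a grader, since two papers that are adjacent in the ground truth can only be ordered by a direct comparison. Each paper meets at most k(k-1) other papers through its k graders, so k(k-1) ≥ n-1 is necessary. The helper names and the 7-grader example built from offsets 1, 2, 4 modulo 7 are illustrative assumptions, not taken from the slides.

```python
from itertools import combinations


def all_pairs_covered(bundles):
    """True if every pair of papers appears together in some grader's bundle."""
    n = len(bundles)  # |U| = |V| = n
    seen = set()
    for bundle in bundles:
        for a, b in combinations(sorted(bundle), 2):
            seen.add((a, b))
    return len(seen) == n * (n - 1) // 2


def min_bundle_size(n):
    """Smallest k with k*(k-1) >= n-1 (a necessary condition, not a sufficient one)."""
    k = 1
    while k * (k - 1) < n - 1:
        k += 1
    return k


# Grader v gets papers v+1, v+2, v+4 (mod 7): every pair of papers shares a grader.
good = [[(v + d) % 7 for d in (1, 2, 4)] for v in range(7)]
# Grader v gets papers v+1, v+2, v+3 (mod 7): papers at cyclic distance 3 never meet.
bad = [[(v + d) % 7 for d in (1, 2, 3)] for v in range(7)]

print(min_bundle_size(7))        # 3
print(all_pairs_covered(good))   # True
print(all_pairs_covered(bad))    # False
```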

SLIDE 19

• Use much simpler bundle graphs
  - e.g., any k-regular bipartite graph for small values of k
  - even by putting together copies of K_{k,k}
  - or a k-regular bipartite graph not containing a 4-cycle
• Aggregation rules
  - plurality, approval
  - Borda
  - Random serial dictatorship
  - Markov-chain-based aggregation rules

SLIDE 20

• Each grader gives k-i+1 points to the exam paper she ranks i-th
• Global ranking is obtained by sorting the exam papers in terms of non-increasing number of total points (Borda score)
• Ties are broken randomly
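A minimal sketch of Borda on partial rankings, assuming the top paper in a bundle of k earns k points and the last earns 1 (that is, k-i+1 points for rank i). The function name `borda_aggregate` and the `seed` parameter are illustrative; each partial ranking is a list of paper identifiers from best to worst.

```python
import random
from collections import defaultdict


def borda_aggregate(partial_rankings, seed=None):
    """Return (global_ranking, scores) from the graders' partial rankings."""
    rng = random.Random(seed)
    scores = defaultdict(int)
    for ranking in partial_rankings:
        k = len(ranking)
        for i, paper in enumerate(ranking, start=1):
            scores[paper] += k - i + 1  # rank 1 earns k points, rank k earns 1
    papers = list(scores)
    # Shuffle first: the sort below is stable, so ties end up in random order.
    rng.shuffle(papers)
    papers.sort(key=lambda p: scores[p], reverse=True)
    return papers, dict(scores)


# Three graders, each ranking two papers consistently with the order 0 > 1 > 2.
ranking, scores = borda_aggregate([[0, 1], [1, 2], [0, 2]], seed=0)
print(ranking)  # [0, 1, 2]
print(scores)   # paper 0 has 4 points, paper 1 has 3, paper 2 has 2
```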

SLIDE 21

• Theorem: When Borda is applied on partial rankings that are consistent with the ground truth, the expected fraction of correctly recovered pairwise relations is at least 1-O(1/k) when the bundle graph is 4-cycle-free and at least 1-O(1/k^{1/2}) in general

SLIDE 22

SLIDE 23

• Students have qualities in [1/2,1]
  - ability to compare correctly two exam papers (probability to find the correct outcome)
• Qualities define the ground truth ranking σ*
• Grading according to a Mallows noise model for generating random rankings
  - each grader of quality p ranks each pair among the k exam papers she gets as in σ* with prob. p and incorrectly with prob. 1-p
  - if the pairwise outcomes do not define a ranking, she repeats
• Caragiannis, Procaccia, & Shah (2013)
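The noisy grading process can be sketched as rejection sampling: draw every pairwise outcome independently, and redraw everything whenever the outcomes are not transitive. The function name `noisy_ranking` and its arguments are illustrative assumptions; the transitivity test uses the fact that k pairwise outcomes define a ranking exactly when the win counts are 0, 1, ..., k-1.

```python
import random
from itertools import combinations


def noisy_ranking(bundle_in_true_order, p, rng):
    """Sample one grader's ranking of her bundle.

    bundle_in_true_order: her papers listed from best to worst in sigma*.
    p: the grader's quality, a probability in [1/2, 1].
    rng: a random.Random instance.
    """
    k = len(bundle_in_true_order)
    while True:
        wins = {paper: 0 for paper in bundle_in_true_order}
        # combinations() keeps list order, so `better` precedes `worse` in sigma*.
        for better, worse in combinations(bundle_in_true_order, 2):
            winner = better if rng.random() < p else worse
            wins[winner] += 1
        # The outcomes define a ranking iff the win counts are 0, 1, ..., k-1.
        if sorted(wins.values()) == list(range(k)):
            return sorted(wins, key=wins.get, reverse=True)
        # Otherwise no ranking is defined and the grader repeats.


rng = random.Random(0)
print(noisy_ranking([3, 1, 4, 0], p=1.0, rng=rng))  # [3, 1, 4, 0]: perfect grading
print(noisy_ranking([3, 1, 4, 0], p=0.7, rng=rng))  # some ranking of the four papers
```

With p close to 1/2 and larger k the loop may need many rounds before the outcomes are transitive, which is acceptable for a sketch but slow for large bundles.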

SLIDE 24

• Comparison of Borda and RSD (random serial dictatorship) in 500 executions (n = 1000, k = 8)

SLIDE 25

• Theory:
  - Is a 1-O(1/k^2) fraction (or better) possible? Upper bounds?
  - Analysis for noisy grading?
  - Impact of incentives?
• Practice:
  - Which is the most realistic noise model for grading?
  - How do the methods considered perform in practice (with real students)?

SLIDE 26
