Opportunity versus Challenge: Exploring Usage of Log-File and Process Data in International Large Scale Assessments

SLIDE 1

ASSESSMENT IN THE AGE OF DATA SCIENCE: THE CASE OF INTERACTIVE ITEMS TESTED IN FRANCE OPPORTUNITY VERSUS CHALLENGE: EXPLORING USAGE OF LOG-FILE AND PROCESS DATA IN INTERNATIONAL LARGE SCALE ASSESSMENTS

Opportunity versus Challenge: Exploring Usage of Log-File and Process Data in International Large Scale Assessments
Assessment in the age of Data Science: the case of interactive items tested in France
Reinaldo Dos Santos, DEPP, Ministry of Education, France

SLIDE 2

ASSESSMENTS IN FRANCE

Rome, 6-7 June 2019

SLIDE 3

ASSESSMENTS IN FRANCE

DIGITAL TRANSITION

■ Middle and high schools: online platform (TAO)

■ Elementary schools: offline app on tablets

SLIDE 4

OPPORTUNITIES

INNOVATIVE ASSESSMENTS

■ Examples of interactive items: chatbot, physics, maths, coding

SLIDE 5

DIGITAL ITEMS IN MATHS: EVALUATE « WITH » AND « THROUGH »

■ Evaluate « through » digital assessments: the student is placed in a digital environment for the assessment.

■ Computer, tablet, …

■ Evaluate « with » digital assessments: the student is given the opportunity to use the technology in order to solve problems.

■ Calculators, dynamic geometry, interactive items…

■ References: Stacey & Wiliam (2013); Drijvers (forthcoming).

SLIDE 6

THE CONCEPT OF FUNCTION IN GRADE 9

SLIDE 7

INTERACTION, USE, ADAPTATION

■ Qualitative knowledge of the concept of function, through the simulation of a real-life situation.

■ Two possible approaches:

■ Operational approach (« input/output »): the function is understood as a sequence of values.
■ Structural approach (graphical perception): the function is understood as a mathematical object with properties.

■ References: Sfard (1991), Drijvers (2012)

SLIDE 8

COMPLEX DATA

  • Unstructured (text, images, tweets…) or partially structured (JSON, XML…) data.
  • Logs recorded as JSON files
  • 15-20 seconds of interaction with the item create a JSON file with 10 000 lines

=> « Big Data » needed!
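To give a rough idea of what handling such logs involves, the sketch below flattens one JSON log into a list of input values and a duration. The schema (`student_id`, `item_id`, `events`, `t`, `type`, `value`) is purely hypothetical; the slides do not describe the actual log format.

```python
import json

# Hypothetical log: the field names below are assumptions for illustration,
# not the actual format produced by the assessment platform.
RAW_LOG = """
{
  "student_id": "S001",
  "item_id": "function_item",
  "events": [
    {"t": 0.0, "type": "start"},
    {"t": 3.2, "type": "input", "value": 250},
    {"t": 7.9, "type": "input", "value": 420},
    {"t": 12.4, "type": "input", "value": 395},
    {"t": 15.1, "type": "submit"}
  ]
}
"""

def extract_inputs(raw_log):
    """Return the student's id, the list of input values, and the duration."""
    log = json.loads(raw_log)
    events = log["events"]
    # Keep only the events in which the student entered a value.
    inputs = [e["value"] for e in events if e["type"] == "input"]
    # Duration = time between the first and the last recorded event.
    duration = events[-1]["t"] - events[0]["t"]
    return log["student_id"], inputs, duration

student_id, inputs, duration = extract_inputs(RAW_LOG)
print(student_id, inputs, duration)
```

A real log of 10 000 lines would contain many more event types; the point is only that each file has to be reduced to a few per-student values before any analysis.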

SLIDE 9

BIG DATA ARCHITECTURE

■ Hadoop & Spark frameworks
■ Hadoop: free and open-source framework designed to deal with huge volumes of data in a distributed environment.

SLIDE 10

WORKFLOW

[Workflow diagram; includes a « Model OK? » check]

SLIDE 11

DATA PREPARATION

  • Combine the score of the student with the logs
  • Creation of new variables:
  • Distance between the first input and the target
  • First input in a range around the target (300-500)
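The preparation step above could look roughly like the following sketch. The slides give the range (300-500) but not the target itself, so `TARGET = 400` is an assumption, as are the student ids, scores and input sequences.

```python
# Hypothetical values: TARGET is assumed, only the 300-500 range is in the slides.
TARGET = 400
RANGE_LOW, RANGE_HIGH = 300, 500

scores = {"S001": 1, "S002": 0}                       # made-up item scores
logs = {"S001": [250, 420, 395], "S002": [900, 120]}  # made-up input sequences

def prepare(scores, logs):
    """Join each student's score with variables derived from the logged inputs."""
    rows = []
    for student_id, inputs in logs.items():
        if student_id not in scores or not inputs:
            continue  # keep only students with both a score and at least one input
        first = inputs[0]
        rows.append({
            "student_id": student_id,
            "correct": scores[student_id],
            "first_distance": abs(first - TARGET),
            "first_in_range": RANGE_LOW <= first <= RANGE_HIGH,
        })
    return rows

rows = prepare(scores, logs)
for row in rows:
    print(row)
```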

SLIDE 12

CLASSICAL ANALYSIS

Duration alone does not discriminate between success and failure on the item.

SLIDE 13

CLASSICAL ANALYSIS

Still no criterion able to discriminate between the populations.

SLIDE 14

BIG DATA + MACHINE LEARNING =

SLIDE 15

MACHINE LEARNING: A TOOLBOX

SLIDE 16

UNSUPERVISED LEARNING: CLUSTERING

Data partitioning, or unsupervised classification, consists of splitting a population into homogeneous groups. It is often used without a prior hypothesis.

SLIDE 17

DBSCAN

■ DBSCAN tends to group into the same cluster points that lie in the « neighbourhood » of other points of that cluster.

  • Simple algorithm
  • Systematically convergent
  • Doesn't need a prior definition of the number of clusters
  • Performs poorly if the clusters' densities are too different

SLIDE 18

DBSCAN

Clear separation between the « success » cluster and the « failure » cluster.
No explanatory variable other than the score (indeed).

Can we predict the score depending on other variables?

SLIDE 19

SUPERVISED LEARNING: A CLASSIFICATION

Supervised learning applies to data that are already labelled (here, the student's score on the item). We try to predict the label of each individual using a classification model. To do so, we split the population into a training sample, on which we build the model, and a test sample, on which we check the model's robustness.
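The train/test logic can be sketched as follows. Everything here is illustrative: the data are made up, the 25% test fraction and the seed are arbitrary choices, and the threshold "model" is only a stand-in for a real classifier.

```python
import random

def train_test_split(rows, test_fraction=0.25, seed=0):
    """Shuffle a copy of rows and split it into (train, test) lists."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]

def accuracy(model, test_rows):
    """Share of test rows whose label the model predicts correctly."""
    hits = sum(model(features) == label for features, label in test_rows)
    return hits / len(test_rows)

# Made-up data: (features, label) pairs; the label plays the role of the item score.
data = [((d,), int(d < 100)) for d in range(0, 200, 10)]
train, test = train_test_split(data)

# Stand-in model, fitted on the training sample only: a threshold placed
# halfway between the mean feature value of each class.
mean_1 = sum(x[0] for x, y in train if y == 1) / sum(1 for _, y in train if y == 1)
mean_0 = sum(x[0] for x, y in train if y == 0) / sum(1 for _, y in train if y == 0)
threshold = (mean_0 + mean_1) / 2

def model(features):
    return int(features[0] < threshold)

# The model is evaluated on rows it has never seen.
test_accuracy = accuracy(model, test)
print(len(train), len(test), test_accuracy)
```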

SLIDE 20

RANDOM FORESTS

Random forests generalize the decision tree algorithm. The main problem with a single decision tree is that the final tree depends strongly on the order in which the variables are picked (tree vs leaf).

SLIDE 21

RANDOM FORESTS

  • For each tree, we pick a random sample of variables.
  • Each tree is independently trained.
  • The forest's prediction is the majority vote of its trees.
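The three steps above can be illustrated with a deliberately reduced sketch. It is not a full random forest: each "tree" is a one-split decision stump, and each tree also sees a bootstrap resample of the rows, which is standard for random forests but not stated on the slide. The feature values, `n_trees` and `n_features` are made up.

```python
import random
from collections import Counter

def fit_stump(rows, feature_ids):
    """Best single-threshold split over the given features (a depth-1 tree)."""
    best = None
    for f in feature_ids:
        for threshold in sorted({x[f] for x, _ in rows}):
            left = [y for x, y in rows if x[f] <= threshold]
            right = [y for x, y in rows if x[f] > threshold]
            if not left or not right:
                continue
            left_label = Counter(left).most_common(1)[0][0]
            right_label = Counter(right).most_common(1)[0][0]
            errors = (sum(y != left_label for y in left)
                      + sum(y != right_label for y in right))
            if best is None or errors < best[0]:
                best = (errors, f, threshold, left_label, right_label)
    if best is None:  # degenerate sample: always predict the majority label
        label = Counter(y for _, y in rows).most_common(1)[0][0]
        return lambda x: label
    _, f, threshold, left_label, right_label = best
    return lambda x: left_label if x[f] <= threshold else right_label

def fit_forest(rows, n_trees=25, n_features=1, seed=0):
    """Train each stump independently on random variables and resampled rows."""
    rng = random.Random(seed)
    all_features = range(len(rows[0][0]))
    trees = []
    for _ in range(n_trees):
        feature_ids = rng.sample(all_features, n_features)  # random variables
        sample = [rng.choice(rows) for _ in rows]            # bootstrap rows (assumption)
        trees.append(fit_stump(sample, feature_ids))
    return trees

def predict(trees, x):
    """Majority vote over the individual trees."""
    votes = Counter(tree(x) for tree in trees)
    return votes.most_common(1)[0][0]

# Made-up rows: ((first gap, last gap), correct)
rows = [((20, 5), 1), ((30, 10), 1), ((10, 0), 1), ((40, 5), 1),
        ((300, 200), 0), ((250, 150), 0), ((400, 300), 0), ((350, 180), 0)]
trees = fit_forest(rows)
print(predict(trees, (5, 0)), predict(trees, (500, 400)))
```

Because every tree sees different variables and rows, no single variable ordering dominates the result, which is the weakness of a lone decision tree.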

SLIDE 22

RANDOM FORESTS

Model unable to correctly predict the score with these variables.

Feature engineering needed!

SLIDE 23

FEATURE ENGINEERING

Addition of new variables:

  • Gap between the second input and the target
  • Gap between the last input and the target
  • Standard deviation of the inputs

Didactic analysis: in a structural approach, students should graphically detect a narrow zone around the intersection of the curves, and concentrate their attempts within this target zone.
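The three additional variables can be computed from a student's input sequence as in the sketch below. `TARGET = 400` is a hypothetical value (the slides do not state the target), and the choice of the sample standard deviation rather than the population one is also an assumption.

```python
import statistics

TARGET = 400  # hypothetical target value; the slides do not state it

def engineer_features(inputs):
    """Derive the three additional variables from one student's input sequence."""
    if len(inputs) < 2:
        return None  # the second-input gap and the standard deviation need two inputs
    return {
        "second_gap": abs(inputs[1] - TARGET),
        "last_gap": abs(inputs[-1] - TARGET),
        "input_std": statistics.stdev(inputs),  # sample standard deviation (assumption)
    }

features = engineer_features([250, 420, 395])
print(features)
```

A student following the structural approach would be expected to show small gaps and a small standard deviation, since the attempts stay inside the narrow target zone.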

SLIDE 24

EVOLUTION OF THE MODEL

The new variables make the random forests more predictive.

SLIDE 25

CLUSTERING (DBSCAN)

[Figure: clustering results, panels « correct = 0 » and « correct = 1 »]

Identification of new subpopulations

SLIDE 26

CLUSTERING (KMEANS)

DBSCAN => plenty of outliers

Confirmation through k-means clustering
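For comparison with DBSCAN, here is a minimal pure-Python sketch of k-means (Lloyd's algorithm). Unlike DBSCAN, it needs the number of clusters `k` in advance and assigns every point to a cluster, so it produces no outliers. The points, `k`, the iteration cap and the seed are illustrative assumptions.

```python
import math
import random

def kmeans(points, k, n_iter=100, seed=0):
    """Return (labels, centroids) after alternating assignment and update steps."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)   # initial centroids drawn from the data
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: each point joins its nearest centroid.
        labels = [min(range(k), key=lambda c: math.dist(p, centroids[c]))
                  for p in points]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                new_centroids.append(
                    tuple(sum(axis) / len(members) for axis in zip(*members)))
            else:
                new_centroids.append(centroids[c])  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break                                   # assignments are stable
        centroids = new_centroids
    return labels, centroids

# Made-up points: two well-separated groups.
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11)]
labels, centroids = kmeans(points, k=2)
print(labels)
```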

SLIDE 27

INTERPRETATION OF THE RESULTS

  • Structural approach, failure
  • Operational approach, failure
  • Structural approach, success
  • Operational approach, success

SLIDE 28

THANK YOU!!!