
Fast Item Response Theory (IRT) Analysis Using GPUs, Lei Chen: PowerPoint PPT Presentation



  1. Fast Item Response Theory (IRT) Analysis Using GPUs. Lei Chen (lei.chen@liulishuo.com), Liulishuo Silicon Valley AI Lab

  2. Outline
  • A brief introduction to Item Response Theory (IRT)
  • Edward, a new probabilistic programming (PP) toolkit
  • An experiment using Edward for IRT model estimation on both CPU and GPU computing platforms
  • Summary

  3. A concise introduction to adaptive learning
  • What's up with adaptive learning

  4. Adaptive learning is hot in the eduTech market
  • Increasing demand: districts’ spending on adaptive learning products grew threefold between 2013 and 2016, according to a new analysis (EdWeek Market Brief, 7/14/2017)
  • Increasing suppliers

  5. Precisely knowing students’ ability levels is important
  • Adaptive learning needs correct inputs about students’ ability levels, which are latent
  • Assessments are developed for inferring latent abilities
  • For a yes/no question, the probability that a student provides a correct answer, p(X=1), depends on
  • his/her latent ability (theta)
  • also other related factors, e.g., the item’s difficulty, making a lucky guess, carelessness…

  6. Item Response Theory (IRT)
  • IRT provides a principled statistical method to quantify these factors and has been widely used to build up the modern assessment industry
  • A widely used model: the 2-parameter logistic model (2-PL), written out below
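The 2-PL formula itself is an image in this export; in standard notation it is

    P(X_{ij} = 1 \mid \theta_i) = \frac{1}{1 + e^{-a_j(\theta_i - b_j)}}

where \theta_i is student i's latent ability, a_j is item j's discrimination, and b_j is its difficulty.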

  7. IRT with fewer or more parameters
  • 1-PL: only has b; assumes all items share the same a
  • 3-PL: adds c for random guessing
  • 4-PL: adds d for inattention (see the combined form below)
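For reference (the slide lists only the parameter names), the variants can be written as one family, with discrimination a, difficulty b, guessing c, and inattention d:

    P(X = 1 \mid \theta) = c + \frac{d - c}{1 + e^{-a(\theta - b)}}

The 1-PL fixes a common a and sets c = 0, d = 1; the 2-PL sets c = 0, d = 1; the 3-PL sets d = 1.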

  8–13. IRT’s wide usages
  • More precise description of item performance
  • More precise scoring
  • More powerful test assembly
  • Supporting advanced linking & equating to make standardized tests possible
  • Supporting adaptive testing by placing examinees and items on the same scale

  14. Concrete examples
  • “Item response theory and computerized adaptive testing”, a presentation made for a hands-on workshop by Rust, Cek, Sun, and Kosinski from the University of Cambridge Psychometrics Centre
  • Very nice animations to explain IRT, how to use IRT for scoring, and CAT

  15–20. Item Response Function
  [Figure: item characteristic curve for binary items; x-axis: measured concept (theta); y-axis: probability of getting the item right (0 to 1); annotations mark difficulty, discrimination (slope), guessing, and inattention.]
  • Parameters, introduced one per slide: Difficulty, Discrimination (slope), Guessing, Inattention
  • Models: 1-Parameter (difficulty only), 2-Parameter, 3-Parameter, 4-Parameter, unfolding (see the code sketch below)
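The build-up on slides 15–20 maps directly onto a few lines of code. Below is a minimal sketch (not from the deck) of the 4-PL item response function in Python/NumPy; the parameter values are made up for illustration:

    import numpy as np

    def icc(theta, a=1.0, b=0.0, c=0.0, d=1.0):
        """4-PL item characteristic curve: P(correct | theta).
        a: discrimination (slope), b: difficulty, c: guessing floor,
        d: inattention ceiling. c=0 and d=1 recover the 2-PL model."""
        return c + (d - c) / (1.0 + np.exp(-a * (theta - b)))

    # Probability of a correct answer across the ability scale on the slides.
    theta = np.linspace(-3.0, 3.0, 61)
    p = icc(theta, a=1.5, b=0.5, c=0.2)  # an example 3-PL item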

  21–30. Scoring
  [Figure: probability vs. theta (-3.0 to 3.0); the curve is updated after each response, and its peak marks the most likely score.]
  1. Start from a normal distribution over theta
  2. q1 – Correct: update the curve; read off the most likely score at its peak
  3. q2 – Correct: update again
  4. q3 – Incorrect: update again; the peak shifts accordingly (see the code sketch below)
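In code, the scoring animation amounts to multiplying a prior over theta by each item's likelihood and reading off the peak. A hypothetical sketch (the item parameters for q1–q3 are invented; the deck shows only the plots):

    import numpy as np

    theta = np.linspace(-3.0, 3.0, 301)          # ability grid from the slides
    posterior = np.exp(-0.5 * theta**2)          # step 1: normal distribution

    # (discrimination, difficulty, response) for q1..q3 -- hypothetical values.
    items = [(1.0, -0.5, 1), (1.2, 0.0, 1), (0.8, 1.0, 0)]

    for a, b, x in items:
        p = 1.0 / (1.0 + np.exp(-a * (theta - b)))   # 2-PL P(correct | theta)
        posterior *= p if x == 1 else (1.0 - p)      # update after each answer

    posterior /= np.trapz(posterior, theta)          # normalize
    most_likely_score = theta[np.argmax(posterior)]  # the peak on each slide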

  31. Computer Adaptive Testing
  • Standard tests contain a fixed number of questions; some are too simple and some are too difficult for a specific test-taker
  • CAT: items can be tailored, saving time/money and measuring the test-taker’s ability more accurately

  32–38. Example of CAT
  [Figure: probability vs. theta (-3.0 to 3.0), with markers for correct/incorrect responses and the next item's difficulty.]
  1. Ask the first question, e.g. of medium difficulty
  2. Correct!
  3. Score it: start from a normal distribution; the most likely score is the peak
  4. Select the next item with a difficulty around the most likely score (or with the max information)
  5. And so on… until the stopping rule is reached (see the code sketch below)
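The five steps above condense into a short loop. A hypothetical sketch with a simulated test-taker and a toy fixed-length stopping rule (a real CAT would select by maximum information and stop on a precision criterion):

    import numpy as np

    rng = np.random.default_rng(0)
    TRUE_THETA = 0.8                      # simulated test-taker's ability

    theta = np.linspace(-3.0, 3.0, 301)
    bank = [(1.0, b) for b in np.linspace(-2.5, 2.5, 21)]  # (a, b) item bank
    posterior = np.exp(-0.5 * theta**2)   # normal prior
    score = 0.0                           # step 1: start at medium difficulty

    def p_correct(t, a, b):               # 2-PL response probability
        return 1.0 / (1.0 + np.exp(-a * (t - b)))

    for _ in range(10):                   # toy stopping rule: 10 items
        item = min(bank, key=lambda ab: abs(ab[1] - score))  # step 4: match difficulty
        bank.remove(item)
        a, b = item
        x = rng.random() < p_correct(TRUE_THETA, a, b)       # simulate the response
        p = p_correct(theta, a, b)
        posterior *= p if x else (1.0 - p)                   # step 3: rescore
        score = theta[np.argmax(posterior)]                  # most likely score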

  39. IRT model estimation
  • The most widely used method: Marginal Maximum Likelihood Estimation (MMLE)
  • Find the marginal distribution of the item parameters by integrating theta out (written out below)
  • Estimate the item parameters by MLE
  • Obtain theta by MLE, based on the estimated item parameters
  • For a more efficient estimation, use EM
  • Other ways: Joint Maximum Likelihood (JML)
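Written out for the 2-PL with a standard-normal ability prior \phi, "integrating theta out" means maximizing the marginal likelihood of the item parameters:

    L(a, b) = \prod_i \int \Big[ \prod_j P(x_{ij} \mid \theta, a_j, b_j) \Big] \phi(\theta) \, d\theta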

  40. Bayesian solution
  • Issues with MLE:
  • depends on the distribution of the data
  • estimation is not accurate when samples are small
  • hard to handle an ability distribution that is not normal
  • Bayesian solutions consider priors on theta, targeting the posterior written out below
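Concretely, the Bayesian approach targets the full joint posterior rather than point estimates:

    p(\theta, a, b \mid X) \propto p(X \mid \theta, a, b) \, p(\theta) \, p(a) \, p(b)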

  41. MCMC
  • Markov chain Monte Carlo (MCMC) is used for Bayesian estimation
  • The ultimate goal is to approximate p(parameters | data) by sampling many points from the posterior
  • Hamiltonian MC (HMC) is good at dealing with high-dimensional parameter spaces; it utilizes the geometry of the important regions of the posterior to make better proposals

  42. Variational Inference
  • Approximate an intractable distribution by using a family of simpler distributions and finding the member of this family that minimizes the divergence to the true posterior (sketched in Edward below)
  • Approximating the posterior with a simpler function leads to faster estimation
  • The Kullback–Leibler (KL) divergence is frequently used to measure the closeness of two distributions
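The deck's experiment code is not included in this export; below is a minimal sketch of a 2-PL IRT model in Edward, assuming the Edward 1.x / TensorFlow 1.x API and made-up data sizes. ed.KLqp implements the variational inference described on this slide; swapping in ed.HMC with Empirical approximations would give the MCMC approach from slide 41. With a GPU build of TensorFlow, the same script runs on the GPU unchanged.

    import numpy as np
    import tensorflow as tf
    import edward as ed
    from edward.models import Bernoulli, Normal

    n_students, n_items = 1000, 50                    # hypothetical sizes
    x_data = np.random.binomial(                      # toy response matrix
        1, 0.5, size=(n_students, n_items)).astype(np.int32)

    # Model: x_ij ~ Bernoulli(sigmoid(a_j * (theta_i - b_j))), normal priors.
    theta = Normal(loc=tf.zeros(n_students), scale=tf.ones(n_students))
    a = Normal(loc=tf.ones(n_items), scale=tf.ones(n_items))
    b = Normal(loc=tf.zeros(n_items), scale=tf.ones(n_items))
    x = Bernoulli(logits=a * (tf.expand_dims(theta, 1) - b))

    # Mean-field Gaussian approximations to the posterior.
    def q(shape):
        return Normal(loc=tf.Variable(tf.zeros(shape)),
                      scale=tf.nn.softplus(tf.Variable(tf.zeros(shape))))

    qtheta, qa, qb = q(n_students), q(n_items), q(n_items)

    inference = ed.KLqp({theta: qtheta, a: qa, b: qb}, data={x: x_data})
    inference.run(n_iter=1000)  # minimizes KL(q || p) by stochastic gradients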
