Expert Systems Spring 2017 Fernando Icaza | Robert Smith | Nick Jang





SLIDE 1

Expert Systems Spring 2017

Fernando Icaza | Robert Smith | Nick Jang Clayton Lawrence | Ryan He | Sandeep Bethapudi

SLIDE 2

Overview

  • Introduction
  • Objectives
  • Implementation
  • Demo & Testing
  • What we learned
  • Future of the project

SLIDE 3

Robert Smith | Clayton Lawrence | Fernando Icaza | Nick Jang | Ryan He | Sandeep Bethapudi

Introductions

SLIDE 4

Objective

  • Given a student, choose the best possible question to ask during their final review in ITS
  • Integrate the model into the existing ITS system

SLIDE 5

How questions are chosen (Score vs Probability vs Rank)

A question's score is simply a measure of how well the student did the last time they attempted that question. It is 0 if the student last answered incorrectly or skipped the question, 1 if they last answered correctly, and it defaults to the global average score if the question has never been attempted.
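The score rule above can be sketched in a few lines of Python. This is an illustrative sketch only; the function name, the `last_result` encoding, and the placeholder global average are assumptions, not the team's actual code.

```python
# Hypothetical sketch of the score rule described above.
GLOBAL_AVG = 0.62  # placeholder value standing in for the real global average score

def question_score(last_result):
    """Map a student's last attempt on a question to a score in [0, 1].

    last_result: "correct", "incorrect", "skipped", or None (never attempted).
    """
    if last_result == "correct":
        return 1.0
    if last_result in ("incorrect", "skipped"):
        return 0.0
    return GLOBAL_AVG  # no prior attempt: fall back to the global average
```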

SLIDE 6

Implementation

  • A continuation of last semester's bayesLearning project
  • During Fall 2016 our team wrote the training algorithms, developed in Python 3 using the TensorFlow libraries
  • Early this semester we explored implementing the Keras web library (Keras.js), which runs on top of TensorFlow
  • Being a general neural network/AI framework, Keras allows the project to be more portable

SLIDE 7

Underlying Architecture and Setup

All of our code was created and tested in a virtual environment created with Python's built-in 'venv' module. We developed our code in our ITS-provided Ubuntu VMs to ensure compatibility with the ITS servers. Our code requires Python 3 + Keras to create and train the neural networks, but once in production it only requires Node.js, which is called from a PHP script we have included in the ITS repo.

SLIDE 8

Training Process Architecture

models/
database.py        # functions that read Student and Question info from the MySQL database
getStudentData.py  # sanitizes questions and removes unnecessary entries
studentData.py     # uses PCA to reduce the dimensionality of the data
simulator.py       # uses the backend (Keras) to compute probabilities of question attempts
trainScript.py     # triggers simulator.py and saves results to h5 & json
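The PCA step attributed to studentData.py can be sketched with NumPy's SVD. This is a minimal sketch of the standard technique, not the team's actual implementation; the function name and matrix layout are assumptions.

```python
import numpy as np

def reduce_dimensions(X, k):
    """Project an attempt matrix X (rows = students, columns = questions)
    onto its top-k principal components.
    """
    Xc = X - X.mean(axis=0)                           # center each column
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False) # principal axes in Vt
    return Xc @ Vt[:k].T                              # coordinates in the top-k basis
```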

SLIDE 9

First idea: The probability

A higher NN output means a higher chance that the student will answer the question correctly. The job of keras.js is simply to compute the probability for each question in the input list, and return the results.

SLIDE 10

Second idea: The rank

Once Keras.js returns the probability list to keras.php, keras.php will have to decide the "rank" of each question. For example, we probably don't want to give the student a question that he/she has a 99% chance of getting right, because it probably won't be challenging. However, maybe we'll decide that the optimal probability is 75-80%. We think a student will learn the most from a question if it is somewhat difficult, but not too difficult.
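The ranking idea above can be sketched as "order questions by their distance from the target probability band." The band bounds follow the 75-80% figure on this slide; the function name and the exact distance measure are assumptions for illustration.

```python
# Sketch of the ranking step: questions whose predicted success probability
# falls inside the target band rank first; the rest are ordered by how far
# they fall outside it.
TARGET_LOW, TARGET_HIGH = 0.75, 0.80  # optimal difficulty band from the slide

def rank_questions(probs):
    """Return question indices ordered from best to worst candidate.

    probs: predicted success probabilities, one per question.
    """
    def distance(p):
        if TARGET_LOW <= p <= TARGET_HIGH:
            return 0.0
        return min(abs(p - TARGET_LOW), abs(p - TARGET_HIGH))
    return sorted(range(len(probs)), key=lambda i: distance(probs[i]))
```

For example, with probabilities [0.99, 0.78, 0.10], the 78% question ranks first, the too-easy 99% question second, and the too-hard 10% question last.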

SLIDE 11

Expert System Workflow

SLIDE 12

Inputs and Outputs

The task we had to solve was this: How can we predict if a student will be able to answer a given question? Fortunately, ITS comes with “Expert Knowledge” about our problem domain. Questions have already been grouped into categories called ‘tags’, and we can use this information to build a smarter model. In order to determine if a student can answer a given question, we first check that student’s answers on all similar questions in the database. These become our neural network inputs.
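The input-building step described above can be sketched as "collect the student's results on every question sharing a tag with the target question." All names below are hypothetical, not the actual ITS schema.

```python
# Illustrative sketch of assembling the neural-network input for one
# (student, target question) pair from tag overlap.
def build_input(student_answers, question_tags, target_tags):
    """Collect the student's results on questions similar to the target.

    student_answers: {question_id: 0.0 (wrong) or 1.0 (right)}
    question_tags:   {question_id: set of tags}
    target_tags:     set of tags on the question being predicted
    """
    return [
        result
        for qid, result in sorted(student_answers.items())
        if question_tags.get(qid, set()) & target_tags  # shares at least one tag
    ]
```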

SLIDE 13

Results

There are 284 questions currently being given to students during assignments. When training, the model tried to guess the students' answers on a scale of 0-1, where 0 signifies an incorrect answer and 1 signifies a correct answer. After training our model on all attempts on these 284 questions, we achieved an average mean squared error of 0.04, i.e. a root-mean-squared error of 0.2 on the 0-1 scale.
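The arithmetic step from the reported MSE to the typical per-prediction error is just a square root:

```python
import math

mse = 0.04                 # reported average mean squared error
rmse = math.sqrt(mse)      # typical error per prediction on the 0-1 scale: 0.2
```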

SLIDE 14

Challenges

  • Many students have answered very few questions (data restrictions)
  • Our information about the domain space isn’t perfect
  • New students every semester - can’t build long-term knowledge of specific students

SLIDE 15

What We Learned

  • Modular programming using Python, MySQL and NodeJS
  • Neural Networks/AI
  • Principal Components Analysis
  • PHP

SLIDE 16

Questions?