Kick-off Meeting Thursday, November 5 th , 2015 Piotr Szczurek, - - PowerPoint PPT Presentation



slide-1
SLIDE 1

Kick-off Meeting

Thursday, November 5th, 2015 Piotr Szczurek, Ph.D.

Assistant Professor Director of Master of Science in Data Science Lewis University

slide-2
SLIDE 2

Agenda

  • 1. Introduction: mission and vision
  • 2. Faculty introductions
  • 3. Projects
  • 4. Why and how to join?
  • 5. Next steps
slide-3
SLIDE 3
  • 1. Introduction: Mission and Vision
slide-4
SLIDE 4

What is Data Science?

“Data science is the study of the generalizable extraction of knowledge from data” (Dhar, V. (2013). Data science and prediction. Commun. ACM, 56, 64-73.)

Data Science

  • Artificial Intelligence
  • Machine Learning
  • Mathematics
  • Statistics
  • Optimization Theory
  • Databases
  • Distributed Computing and Storage
  • Domain-specific Knowledge
  • Software Development
  • Visualization
  • UI Design

slide-5
SLIDE 5

What is Data Science?

Farcaster at English Wikipedia [GFDL (http://www.gnu.org/copyleft/fdl.html) or CC BY-SA 3.0 (http://creativecommons.org/licenses/by-sa/3.0)], via Wikimedia Commons

slide-6
SLIDE 6

What is Data Science?

Artificial Intelligence (AI) is the study of designing intelligent agents.

AI

  • Reasoning
  • Knowledge Representation
  • Planning
  • Learning
  • Natural Language Processing
  • Perception (e.g. computer vision)

slide-7
SLIDE 7

Mission Statement

The mission of the Data Science and Artificial Intelligence Laboratory (DataSAIL) is to foster collaboration between students and faculty members on data science or artificial intelligence problems that are important to society, the community, or the University.

slide-8
SLIDE 8

Vision

  • Faculty and students working together on interesting problems in data science or artificial intelligence.
  • Members of DataSAIL will meet regularly to:
      • propose new projects,
      • discuss existing ones, and
      • work on solving problems related to the projects.
  • Goals of this meeting:
      • invite students to apply
      • form groups to work on projects

slide-9
SLIDE 9

Vision

Biweekly meetings of all DataSAIL participants

  • discuss what everyone is working on
  • present tools/methods/languages
  • share ideas/techniques/data
  • invite guest speakers
  • find collaborators/students to work on individual projects (form project groups)

[Diagram: Project Group 1, Project Group 2, Project Group 3]

slide-10
SLIDE 10

Vision

Work on individual projects (within project groups)

  • faculty and students who participate in a given project would work on it continuously
  • regular group meetings and communication
  • report on progress during biweekly DataSAIL meetings
  • present work at relevant workshops and conferences, or in scientific journals (or at the Celebration of Scholarship)

slide-11
SLIDE 11
  • 2. Faculty Introductions
slide-12
SLIDE 12

Faculty

  • Dr. Piotr Szczurek

Assistant Professor, Director of MSDS

Education

  • Ph.D., Computer Science, University of Illinois at Chicago (UIC), 2012
  • B.S., Computer Science, University of Illinois at Chicago (UIC), 2005

Academic/research Interests

  • Broad: artificial intelligence, machine learning, mobile databases
  • Specific:
  • Intelligent transportation systems
  • Information relevance estimation
  • Applications of machine learning
  • Computer vision problems
  • Gaming AI
slide-13
SLIDE 13

Faculty

  • Dr. Fatih Koksal

Assistant Professor

Education

  • Ph.D., Mathematics, Texas Tech University, 2015
  • Ph.D., Computer Engineering, Bogazici University, 2007
  • M.S., Computer Engineering, Bogazici University, 2000
  • B.S., Computer Engineering, Bogazici University, 1998

Academic/research Interests

  • machine learning
  • artificial intelligence
  • fiber optic networking
  • genetics
  • homological algebra of rings
slide-14
SLIDE 14

Faculty

  • Dr. Amanda Harsy

Assistant Professor

Education

  • Ph.D., Mathematics, IUPUI, Indianapolis, IN, 2014
  • M.S., Mathematics, IUPUI, Indianapolis, IN, 2011
  • M.A., Mathematics, University of Kentucky, Lexington, KY, 2009
  • B.A., Mathematics, Coaching Certificate, Taylor University, Upland, IN, 2007

Academic/research Interests

  • mathematics education
  • The Scholarship of Teaching and Learning
  • geometric group theory
  • machine learning.
slide-15
SLIDE 15

Faculty

  • Dr. Jason Perry

Assistant Professor

Education

  • Ph.D., Computer Science, Rutgers University, 2015
  • M.A., Computer Science, Princeton University, 2004
  • B.S., Computer Science, University of Kentucky, 1999

Academic/research Interests

  • Data-driven security analysis
  • Secure computation protocols
  • Cryptography
  • Natural Language Processing
  • Theory of Programming Languages
slide-16
SLIDE 16

Faculty

  • Daniel Ayala

Assistant Professor

Education

  • Ph.D., Computer Science, University of Illinois at Chicago, exp. 2015
  • M.C.S., Computer Science, University of Illinois at Urbana-Champaign, 2008
  • B.S., Computer Science, University of Puerto Rico - Río Piedras, 2003

Academic/research Interests

  • mobile data management
  • intelligent transportation systems
  • machine learning
slide-17
SLIDE 17

Faculty

  • Dr. Sarah Powers

Assistant Professor

Education

  • Ph.D., Immunology, University of Chicago, 2011
  • B.A., Biological Sciences, University of Chicago, 2004

Academic/research Interests

  • transcriptome analysis of human cancers bearing cyclin D3 mutations, as well as structural changes within the protein caused by these mutations

slide-18
SLIDE 18

Faculty

  • Dr. Cindy Howard

Associate Professor

Education

  • Ph.D., Computer Science, University of Illinois at Chicago, 2010
  • M.S., Computer Science, Governors State University, 2001
  • B.B.A., Accounting and Information Systems, University of Wisconsin, 1985

Academic/research Interests

  • mobile applications
  • natural language processing
  • intelligent tutoring systems
slide-19
SLIDE 19

Faculty

  • Dr. Ray Klump

Professor and Chair of Computer and Mathematical Sciences

Education

  • Ph.D., Electrical Engineering, University of Illinois at Urbana-Champaign, 2000
  • M.S., Electrical Engineering, University of Illinois at Urbana-Champaign, 1995
  • B.S., Electrical Engineering, University of Illinois at Urbana-Champaign, 1993

Academic/research Interests

  • electric power system analysis
  • computational techniques
  • data visualization
  • cyber security of critical infrastructures
slide-20
SLIDE 20
  • 3. Projects
slide-21
SLIDE 21

Projects

Current Projects / Project Ideas

  • Analysis of Microarray Data from Cancers with Mutations in D-Type Cyclins
  • College Enrollment Prediction
  • Predicting Student Success Using Machine Learning and Ranking methods
  • Intrusion Detection
  • Pedestrian Flows
slide-22
SLIDE 22

Analysis of Microarray Data from Cancers with Mutations in D-Type Cyclins

  • Dr. Sarah Powers

Biology Department

slide-23
SLIDE 23

What is a microarray data set?

slide-24
SLIDE 24

D-Type Cyclins in a Variety of Cancers

Questions:

  • Clustering based on cyclin mutant vs normal?
  • Clustering based on cancer type?
  • Clustering based on location of mutation?
  • Clustering based on level of expression change?
  • Other ways to group?

Pie charts: distribution by tissue for each D-type cyclin gene

Tissue                       Ccnd1   Ccnd2   Ccnd3
Haematopoietic/Lymphoid       23%      9%     33%
Lung                           9%     19%      5%
Urinary Tract                  5%      0%      6%
Liver                          2%      3%      5%
Ovary                          3%      0%      0%
Large Intestine               10%     14%      5%
Endometrium                   35%     17%     13%
Oesophagus                     1%      4%      1%
Breast                         2%      4%     15%
NS                             2%      0%      0%
Kidney                         3%      5%      3%
Skin                           4%     11%      3%
CNS                            0%      7%      1%
Prostate                       0%      1%      0%
Pancreas                       0%      5%      6%
Upper Aerodigestive Tract      1%      2%      0%
Thyroid                        1%      0%      3%
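As a rough illustration of the clustering questions on this slide, here is a minimal k-means sketch in plain Python. Everything in it is an assumption made for illustration: the `kmeans` helper, the toy expression values, and the choice of k-means itself (the slides do not say which clustering method the project uses). Real microarray data would have thousands of genes per sample and would likely call for an established library.

```python
import math


def kmeans(points, k, iterations=20):
    """Plain k-means; the first k points serve as initial centroids."""
    centroids = [list(p) for p in points[:k]]
    assignments = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each sample to its nearest centroid.
        assignments = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return assignments, centroids


# Hypothetical expression values (3 genes) for 6 samples; not real data.
samples = [
    [2.1, 1.9, -0.2],
    [-1.8, -2.2, 0.1],
    [1.8, 2.2, 0.0],
    [-2.0, -1.9, 0.3],
    [2.0, 2.0, -0.1],
    [-2.1, -2.0, 0.2],
]
labels, centers = kmeans(samples, k=2)
print(labels)
```

The resulting labels could then be compared against known attributes (mutant vs normal, cancer type, mutation location) to see which grouping the data supports.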

slide-25
SLIDE 25

College Enrollment Prediction

  • Dr. Piotr Szczurek, Daniel Ayala, Dr. Ray Klump

Computer and Mathematical Sciences Department

slide-26
SLIDE 26

Student status stages (diagram): SendInfo, Inquiry, Application, PreAccept, FinalAccept, Registration

College Enrollment Prediction

Data:

  • Student info from marketing campaign (Royall)
  • CampusAnywhere (if student responds)
  • ACT/SAT scores

Goal: we want to know if certain strategies make sense

  • are we targeting the right students?
  • what are the most effective recruitment strategies?
slide-27
SLIDE 27

College Enrollment Prediction

Example Questions

  • If they send us their ACT test score, are they more likely to enroll?
  • If they visit campus, are they more likely to enroll?
      • Does it matter when they visit?
      • Does it matter when they apply?
  • Is there a profile of student who never comes to Lewis?
      • From certain high schools? With certain GPAs or interests?
  • Where should we buy names?
  • Which name pools should we be paying the most attention to?
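A first pass at questions like these could simply compare enrollment rates between groups. The sketch below assumes each student is a dictionary with hypothetical boolean fields `visited`, `sent_act`, and `enrolled`; these field names and the records are invented for illustration and do not reflect the actual data.

```python
def enrollment_rate(records, condition):
    """Share of records satisfying `condition` that ended up enrolled."""
    matching = [r for r in records if condition(r)]
    if not matching:
        return None  # no students in this group
    return sum(r["enrolled"] for r in matching) / len(matching)


# Hypothetical records; field names are assumptions, not the real schema.
records = [
    {"visited": True, "sent_act": True, "enrolled": True},
    {"visited": True, "sent_act": False, "enrolled": True},
    {"visited": True, "sent_act": True, "enrolled": False},
    {"visited": False, "sent_act": True, "enrolled": False},
    {"visited": False, "sent_act": False, "enrolled": False},
    {"visited": False, "sent_act": False, "enrolled": True},
]

visit_rate = enrollment_rate(records, lambda r: r["visited"])
no_visit_rate = enrollment_rate(records, lambda r: not r["visited"])
print(f"visited: {visit_rate:.2f}, did not visit: {no_visit_rate:.2f}")
```

A gap between the two rates only suggests an association; whether visiting causes enrollment would need a more careful analysis.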

slide-28
SLIDE 28

College Enrollment Prediction

PROBLEM/TASKS

  • 1. Data gathering - high school info, location info, major info
  • Some manual searching / querying
  • Making scripts which parse data from web / web services
  • 2. Dealing with missing values
  • Research and experimentation problem
  • Write programs or use existing tools
  • Test performance
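One simple baseline for the missing-values task is mean imputation. The sketch below is only one option among many, and the slides do not name a method; the `impute_mean` helper, the `act` field, and the sample rows are all hypothetical.

```python
def impute_mean(rows, column):
    """Return copies of rows with None in `column` replaced by the observed mean."""
    observed = [r[column] for r in rows if r[column] is not None]
    if not observed:
        raise ValueError(f"no observed values in column {column!r}")
    mean = sum(observed) / len(observed)
    return [
        {**r, column: mean if r[column] is None else r[column]}
        for r in rows
    ]


# Hypothetical student rows; None marks a missing ACT score.
students = [
    {"id": 1, "act": 24},
    {"id": 2, "act": None},
    {"id": 3, "act": 30},
]
filled = impute_mean(students, "act")
print(filled[1]["act"])
```

Testing performance, as the slide suggests, would mean hiding some known values, imputing them, and measuring how far the imputed values fall from the true ones.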
slide-29
SLIDE 29

College Enrollment Prediction

PROBLEM/TASKS (cont.)

  • 3. Determining whether a student responds to a campaign
  • 4. Determining most likely status sequence for a student
  • 5. Determining which students who respond end up enrolling

  • Prediction problem - using machine learning
  • Write programs or use existing tools
  • Testing
  • 6. Examining student types (clustering problem)
  • 7. Finding association rules
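For the association-rule task, the usual measures of a rule "A implies B" are support, confidence, and lift. The sketch below computes them for a single rule over hypothetical student records represented as sets of attribute labels; the labels and the `rule_metrics` helper are assumptions made for illustration.

```python
def rule_metrics(transactions, antecedent, consequent):
    """Support, confidence, and lift for the rule antecedent -> consequent."""
    n = len(transactions)
    a, c = set(antecedent), set(consequent)
    count_a = sum(1 for t in transactions if a <= t)
    count_c = sum(1 for t in transactions if c <= t)
    count_both = sum(1 for t in transactions if (a | c) <= t)
    support = count_both / n
    confidence = count_both / count_a if count_a else 0.0
    lift = confidence / (count_c / n) if count_c else 0.0
    return support, confidence, lift


# Hypothetical records: each set lists the attributes true for one student.
transactions = [
    {"visited_campus", "sent_act", "enrolled"},
    {"visited_campus", "enrolled"},
    {"visited_campus", "sent_act"},
    {"sent_act"},
    {"enrolled"},
]
support, confidence, lift = rule_metrics(
    transactions, {"visited_campus"}, {"enrolled"}
)
print(f"support={support:.2f} confidence={confidence:.2f} lift={lift:.2f}")
```

A lift above 1 means the consequent is more common among records matching the antecedent than in the data overall. Finding all strong rules, rather than scoring one, would call for an algorithm such as Apriori.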
slide-30
SLIDE 30

College Enrollment Prediction

PROBLEM/TASKS (cont.)

  • 8. Examining trends in majors (distribution over time, most likely majors, etc.)
  • Modeling/prediction/regression

Some other potential directions:

  • Can monitoring of social network activity be used to detect students likely to be interested in Lewis?
  • Developing a framework for examining student enrollment
  • Making an application that supports finding enrollment estimates, likely majors, interested students, etc.

slide-31
SLIDE 31

Predicting Student Success Using Machine Learning and Ranking methods

  • Dr. Amanda Harsy
slide-32
SLIDE 32

General Questions

  • What contributes to a student’s perseverance in their degree?
  • Can we predict what types of students will succeed in a particular major?
  • What contributes to gainful employment after graduation?
  • What type of student is likely to attend and finish their degree at Lewis?
  • How much impact does a major change in curriculum have on student success and recruitment?

slide-33
SLIDE 33

Specific Questions

  • Are there specific courses which can predict a student’s success in the math major?
  • Does commuting distance contribute to success in the math major?
  • Is there an optimal number of math classes one should take per semester?
  • How much of an impact do study habits have on a math major?
  • Does your progression through the major influence your GPA?
  • Does working while taking classes influence your success/perseverance in the major?
  • What type of success are we looking for?
  • Exploring different demographics in the major: male/female, commuter/resident
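Several of these questions relate one numeric factor to an outcome, so a correlation coefficient is a natural first check. The sketch below computes Pearson's r between commuting distance and GPA; the `pearson` helper and both data lists are invented for illustration and are not results from the project.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical data: commuting distance in miles and GPA in the major.
commute_miles = [2, 5, 10, 20, 30]
major_gpa = [3.8, 3.6, 3.4, 3.0, 2.6]
r = pearson(commute_miles, major_gpa)
print(f"r = {r:.3f}")
```

A value of r near -1 or 1 indicates a strong linear relationship in the sample, but it says nothing about cause; factors such as work hours could drive both variables.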

slide-34
SLIDE 34

Other Ideas

  • Dr. Ray Klump
slide-35
SLIDE 35

Intrusion Detection Data

slide-36
SLIDE 36

Pedestrian Traffic Flows

slide-37
SLIDE 37
  • 4. Why and how to apply?
slide-38
SLIDE 38

Advantages

  • 1. Looks great on a resume!
  • 2. Gain experience
  • 3. Meet other students who share your passions
  • 4. Gain new knowledge
  • 5. Collaborate with faculty, etc.
  • 6. Can use research for a senior seminar project, undergraduate capstone, independent study, Master's thesis or project, …

slide-39
SLIDE 39

Qualifications for Membership

Students who wish to participate should have

  • At least one relevant course completed (machine learning, artificial intelligence, data mining, etc.)
  • Some programming experience (Python, Java, ...)
  • Time to work on projects and meet with group members

Both graduate and undergraduate students are welcome.

slide-40
SLIDE 40

How to Apply?

To apply:

  • Send your resume, statement of purpose, and a letter of recommendation
  • If you want to work on a specific project or with a specific faculty member, contact them directly
  • Otherwise, contact me: szczurpi@lewisu.edu

If you don't have all the qualifications, you may still be able to participate as a provisional member.

slide-41
SLIDE 41
  • 5. Next Steps
slide-42
SLIDE 42

Next Steps

  • 1. Talk about current and future project ideas
  • 2. Form groups - talk to professors and apply to join
  • 3. Schedule the next DataSAIL meeting