COMPARISON OF CATEGORICAL PROPERTIES OFFERED BY MULTIPLE MOOC PLATFORMS

SLIDE 1

COMPARISON OF CATEGORICAL PROPERTIES OFFERED BY MULTIPLE MOOC PLATFORMS

Using automated Web Crawler in Python with Scrapy

Bachelor Thesis - Introduction Presentation

Louis Mbuyu
Examiner: Prof. Dr. François Bry
Advisors: Prof. Dr. François Bry, Yingding Wang

25.01.18

SLIDE 2

AGENDA

  • 1. Motivation
  • 2. Research topic
  • 3. Project Plan
  • 4. Technical Details
  • 5. Challenges
  • 6. Demo

SLIDE 3

  • 1. Motivation
SLIDE 4

Motivation

  • Irom - Intelligent Recommender Of MOOCs
  • MOOC - Massive Open Online Course

The goal of Irom

  • To improve learning and studying at the university.
  • To develop an intelligent MOOC search engine.

My Motivation

  • Define a unified categorical set across all MOOC platforms.

SLIDE 5

Motivation

Link: https://irom.pms.ifi.lmu.de/#/home

SLIDE 6

Motivation - MOOC

MOOC (Massive Open Online Course)

  • Massive - Unlimited learners
  • Open - No requirements
  • Online - Open access via the web
  • Course - Filmed lectures/Videos, Readings

SLIDE 7

Motivation - Popular MOOC platforms

SLIDE 8

Motivation - MOOC platforms by size

  • Udemy: ca. 40,000
  • FutureLearn: ca. 1,000
  • Open2Study: ca. 2,000
  • edX: ca. 2,000
  • Udacity: ca. 200
  • Coursera: ca. 5,000

SLIDE 9

Motivation - Diverse categories

SLIDE 10

Motivation - Behind my research question

A unified categorical set across all platforms, so that users can browse the categories on Irom.

SLIDE 11

Motivation - Advantages

  • Browse and create new subcategories, e.g. “Top Courses”

SLIDE 12

Motivation - Advantages

  • Easier to recommend similar courses

SLIDE 13

  • 2. Research Topic
SLIDE 14

Research Topic

COMPARISON OF CATEGORICAL PROPERTIES OFFERED BY MULTIPLE MOOC PLATFORMS

Tasks:

  • Define a unified MOOC model.
  • Crawl 6 platforms and extract ca. 40,000 courses.
  • Define a unified categorical set across all platforms.
SLIDE 15

Research Topic - Unified MOOC Model

{
  "title": String,
  "courseUrl": String,
  "imageUrl": String,
  "description": String,
  "duration": Int,
  "category": String,
  …
}
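
The unified model above could be expressed in Python roughly as follows. This is only a sketch: the field names are taken from the slide, while the `Course` class, the `to_unified` helper, and the keys of the raw platform record (`url`, `image`) are hypothetical and not part of the presentation. The unit of `duration` is not stated on the slide.

```python
from dataclasses import dataclass, asdict


@dataclass
class Course:
    """One course in the unified MOOC model (field names from the slide)."""
    title: str
    courseUrl: str
    imageUrl: str
    description: str
    duration: int  # unit is not specified on the slide
    category: str


def to_unified(raw: dict) -> Course:
    """Map a platform-specific record (hypothetical keys) onto the unified model."""
    return Course(
        title=raw.get("title", "").strip(),
        courseUrl=raw.get("url", ""),
        imageUrl=raw.get("image", ""),
        description=raw.get("description", "").strip(),
        duration=int(raw.get("duration", 0)),
        category=raw.get("category", ""),
    )


# Hypothetical record as a crawler might yield it for one platform.
raw_record = {
    "title": "  Intro to Data Science ",
    "url": "https://example.org/courses/intro-data-science",
    "image": "https://example.org/img/intro.png",
    "description": "Learn data science with Python.",
    "duration": "6",
    "category": "Data Science",
}

course = to_unified(raw_record)
print(asdict(course))
```

Each platform would need its own mapping of this kind, since every site names and structures its fields differently.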

SLIDE 16

Research Topic

Category: Data Science

Courses: course 1, course 2, course 3, ..., course m, each following the unified MOOC model:

{ "title": String, "courseUrl": String, "imageUrl": String, "description": String, "duration": Int, "category": String }

Most n occurring words:

[ Data, Science, Machine, Learning, Python, …, n ]
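
Collecting the n most frequent words for a category might look like the sketch below. It assumes the words are counted over the course descriptions and that stop words are removed first, as the project plan mentions. The `STOP_WORDS` set, the `top_words` function, and the sample descriptions are illustrative assumptions, not material from the thesis.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list; a real run would use a fuller list.
STOP_WORDS = {"a", "an", "and", "the", "to", "of", "in", "for", "with", "is", "this"}


def top_words(descriptions, n):
    """Return the n most frequent non-stop words across all descriptions."""
    counts = Counter()
    for text in descriptions:
        words = re.findall(r"[a-z]+", text.lower())
        counts.update(w for w in words if w not in STOP_WORDS)
    return [word for word, _ in counts.most_common(n)]


# Hypothetical descriptions of courses in the "Data Science" category.
data_science_descriptions = [
    "Learn data science with Python",
    "Machine learning for data science",
    "Data analysis in Python",
]

category_words = top_words(data_science_descriptions, 3)
print(category_words)
```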

SLIDE 17

Research Topic

Data Science, most n occurring words:

[ Data, Science, Machine, Learning, Python, …, n ]

New course from Udacity, most n occurring words in description:

[ Data, Science, Machine, Driving, Car, …, n ]

Compare the two word lists.
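
The slide does not say how the two word lists are compared. One simple option, shown here purely as an assumption, is the Jaccard similarity of the two word sets: the share of words they have in common. The `jaccard` and `best_category` functions and the second category are hypothetical; only the named words from the slide are used, and the elided entries are left out.

```python
def jaccard(words_a, words_b):
    """Jaccard similarity of two word lists, ignoring case and duplicates."""
    a = {w.lower() for w in words_a}
    b = {w.lower() for w in words_b}
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def best_category(course_words, categories):
    """Pick the category whose top words overlap most with the course's words."""
    return max(categories, key=lambda name: jaccard(course_words, categories[name]))


# Word lists as named on the slide.
data_science_words = ["Data", "Science", "Machine", "Learning", "Python"]
new_course_words = ["Data", "Science", "Machine", "Driving", "Car"]

score = jaccard(data_science_words, new_course_words)  # 3 shared words, 7 distinct words
print(round(score, 3))

# Hypothetical second category, to show the selection step.
categories = {
    "Data Science": data_science_words,
    "Business": ["Marketing", "Finance", "Management", "Sales", "Strategy"],
}
print(best_category(new_course_words, categories))
```

A threshold on the score could then decide whether the new course joins an existing unified category or needs a new one; the slide leaves that choice open.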

SLIDE 18

  • 3. Project Plan
SLIDE 19

Project Plan - Timeline

  • Dec 17: Build web crawlers. Crawler for 6 platforms. Define unified MOOC model.
  • Jan 18: Analysis of categories. Remove stop words. Most occurring words.
  • Feb 18: Compare & Evaluate. Categorical properties. Define unified categorical set.
  • 12 Mar 18: Deadline

SLIDE 20

  • 4. Technical details
SLIDE 21

Technical Details

SLIDE 22

  • 5. Challenges
SLIDE 23

Challenges

  • Crawling websites that rely on JavaScript
  • Defining the unified MOOC model
  • Websites changing their layout

SLIDE 24

SLIDE 25

SLIDE 26

  • 6. Quick Demo of web crawling
SLIDE 27