
COMPARISON OF CATEGORICAL PROPERTIES OFFERED BY MULTIPLE MOOC PLATFORMS

COMPARISON OF CATEGORICAL PROPERTIES OFFERED BY MULTIPLE MOOC PLATFORMS: Using an Automated Web Crawler in Python with Scrapy. Bachelor Thesis - Introduction Presentation. Louis Mbuyu. Examiner: Prof. Dr. François Bry. Supervisor: Prof. Dr.


  1. COMPARISON OF CATEGORICAL PROPERTIES OFFERED BY MULTIPLE MOOC PLATFORMS: Using an Automated Web Crawler in Python with Scrapy. Bachelor Thesis - Introduction Presentation. Louis Mbuyu. Examiner: Prof. Dr. François Bry. Supervisors: Prof. Dr. François Bry, Yingding Wang. 25 January 2018

  2. AGENDA 1. Motivation 2. Research Topic 3. Project Plan 4. Technical Details 5. Challenges 6. Demo

  3. 1. Motivation

  4. Motivation • Irom: Intelligent Recommender Of MOOCs • MOOC: Massive Open Online Course • The goal of Irom: to improve learning and studying at the university, and to develop an intelligent MOOC search engine • My motivation: define a unified set of categories across all MOOC platforms

  5. Motivation Link: https://irom.pms.ifi.lmu.de/#/home

  6. Motivation - MOOC (Massive Open Online Course) • Massive: unlimited learners • Open: no requirements • Online: open access via the web • Course: filmed lectures/videos, readings

  7. Motivation - Popular MOOC platforms

  8. Motivation - MOOC platforms by size • Coursera: ca. 5,000 • FutureLearn: ca. 1,000 • Udacity: ca. 200 • Udemy: ca. 40,000 • edX: ca. 2,000 • Open2Study: ca. 2,000

  9. Motivation - Diverse categories

  10. Motivation - Behind my research question: a unified set of categories across all platforms lets users browse the categories on Irom

  11. Motivation - Advantages • Browse and create new subcategories, e.g. “Top Courses”

  12. Motivation - Advantages • Easier to recommend similar courses

  13. 2. Research Topic

  14. Research Topic: COMPARISON OF CATEGORICAL PROPERTIES OFFERED BY MULTIPLE MOOC PLATFORMS • Tasks: • Define a unified MOOC model • Crawl 6 platforms and extract ca. 40,000 courses • Define a unified set of categories across all platforms

  15. Research Topic - Unified MOOC Model { "title": String, "courseUrl": String, "imageUrl": String, "description": String, "duration": Int, "category": String, … }
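
A minimal Python sketch of the unified MOOC model from slide 15, assuming a dataclass is an acceptable container. Only the six fields named on the slide are included; the fields elided by "…" are left out. The example values, the `example.org` URLs, and the reading of `duration` as weeks are hypothetical and not taken from the presentation.

```python
from dataclasses import dataclass, asdict


@dataclass
class Course:
    """One course record in the unified MOOC model (fields as named on the slide)."""
    title: str
    courseUrl: str
    imageUrl: str
    description: str
    duration: int  # the slide gives no unit; weeks is an assumption
    category: str


# Hypothetical record for illustration only; not data from any real platform.
example = Course(
    title="Intro to Data Science",
    courseUrl="https://example.org/course/intro-data-science",
    imageUrl="https://example.org/img/intro-data-science.png",
    description="Learn data science with Python.",
    duration=6,
    category="Data Science",
)

# Convert to a plain dict, e.g. for JSON export from a crawler pipeline.
record = asdict(example)
print(record)
```

Keeping the field names identical to the slide (camelCase included) means records from every platform crawler can be merged without a renaming step.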

  16. Research Topic • Courses in the category “Data Science” (course 1, course 2, course 3, …, course m), each of the form { "title": String, "courseUrl": String, "imageUrl": String, "description": String, "duration": Int, "category": String } • Most n occurring words: [ Data, Science, Machine, Learning, Python, …, n ]
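
The step from course records to the "most n occurring words" of a category could look roughly like the sketch below. It assumes the words are counted over course descriptions after removing stop words (the timeline mentions stop-word removal, but the exact text source and stop-word list are not given). The function name `most_common_words`, the tiny `STOP_WORDS` set, and the sample descriptions are hypothetical.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list; a real run would use a fuller list (assumption).
STOP_WORDS = {"a", "an", "and", "the", "of", "to", "in", "with", "for", "is", "on"}


def most_common_words(descriptions, n):
    """Return the n most frequent non-stop-words across all descriptions."""
    counts = Counter()
    for text in descriptions:
        # Lowercase and keep alphabetic tokens only.
        words = re.findall(r"[a-z]+", text.lower())
        counts.update(word for word in words if word not in STOP_WORDS)
    return [word for word, _ in counts.most_common(n)]


# Hypothetical descriptions of courses in one category.
descriptions = [
    "Data science with Python and machine learning",
    "Machine learning for data science",
    "Python for data science",
]
top_words = most_common_words(descriptions, 3)
print(top_words)
```

Running this once per category yields one word list per category, which is the input the comparison on the next slide needs.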

  17. Research Topic • New course from Udacity, most n occurring words in description: [ Data, Science, Machine, Driving, Car, …, n ] • Compare with • Data Science, most n occurring words: [ Data, Science, Machine, Learning, Python, …, n ]
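
The slide only says "Compare" and does not name a similarity measure. As one possible reading, the sketch below scores the overlap of the two word lists with Jaccard similarity (shared words divided by all distinct words). The choice of Jaccard and the function name `word_overlap` are assumptions, not the method stated in the presentation; the word lists are the ones shown on the slide, lowercased.

```python
def word_overlap(category_words, course_words):
    """Jaccard similarity of two word lists: |A ∩ B| / |A ∪ B|."""
    a, b = set(category_words), set(course_words)
    if not a or not b:
        return 0.0  # nothing to compare
    return len(a & b) / len(a | b)


# Word lists from the slide (truncated to the five words shown).
data_science = ["data", "science", "machine", "learning", "python"]
new_course = ["data", "science", "machine", "driving", "car"]

score = word_overlap(data_science, new_course)
print(f"overlap with 'Data Science': {score:.2f}")
```

A new course could then be assigned to the category with the highest score, with a minimum threshold deciding when no existing category fits; both choices would need evaluation against real crawl data.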

  18. 3. Project Plan

  19. Project Plan - Timeline • Dec 17: Build web crawlers (crawler for 6 platforms; define unified MOOC model) • Jan 18: Analysis of categories (remove stop words; most occurring words) • Feb 18: Compare & evaluate (categorical properties; define unified categorical set) • 12 Mar 18: Deadline

  20. 4. Technical Details

  21. Technical Details

  22. 5. Challenges

  23. Challenges • Crawling websites that rely on JavaScript • Defining a unified MOOC model • Websites changing their layout


  26. 6. Quick Demo of web crawling

