Crowford Crowd Investment Data Portal Group 13 LABBE, Kevin Patrick

SLIDE 1

Crowford

Crowd Investment Data Portal

Group 13

LABBE, Kevin Patrick Joseph
MARTYNAVA, Karina
THOMPSON, Julien Edward

SLIDE 2

Topics

  • Crowdfunding basics
  • Schema Mapping / ER (Gathering Data)
  • Data Fusion (Data Analysis)
  • Data Portal
SLIDE 3

Crowdfunding

SLIDE 4

Crowdfunding

  • A project is funded by a large number of people
  • Start-ups, video games, charity…
SLIDE 5

Crowdfunding

  • Over $2 billion
  • 100K projects
  • 10 million contributors
SLIDE 6

Crowford

  • Gather projects from different sources
  • Predict whether a project will be successful
SLIDE 7

Schema Mapping / ER

Gathering data

SLIDE 8

Data Sources

SLIDE 9

Data Sources

  • Same structure
  • Same theme (fund projects)

SLIDE 10

Crowdfund data

  • Project: idea (pen, video game, product…)
  • Packages / Perks: what you get / money
  • Author(s)

SLIDE 14

Crawling

SLIDE 15

Crawling

  • Project pages are generated with JavaScript
  • A private API returns the data as JSON
  • A Python script using http2 generates the requests
SLIDE 16

https://www.indiegogo.com/private_api/explore?filter_funding=&filter_percent_funded=&filter_quick=new&filter_status=&pg_num=2
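A paginated request URL of this shape could be assembled as in the minimal sketch below. The parameter names are copied from the URL on the slide. The helper names `build_explore_url` and `fetch_page` are hypothetical, the sketch uses the standard library's `urllib` rather than the http2 library named on the previous slide, and it only assumes that the response body is JSON. The private API may have changed or may require headers that are not shown here.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

BASE_URL = "https://www.indiegogo.com/private_api/explore"


def build_explore_url(page: int, quick: str = "new") -> str:
    """Build the explore URL for one result page (parameters as on the slide)."""
    params = {
        "filter_funding": "",
        "filter_percent_funded": "",
        "filter_quick": quick,
        "filter_status": "",
        "pg_num": page,
    }
    return f"{BASE_URL}?{urlencode(params)}"


def fetch_page(page: int):
    """Download one page and decode it as JSON (no response layout is assumed)."""
    with urlopen(build_explore_url(page), timeout=10) as response:
        return json.load(response)


# Only the URL is built here; fetch_page() is not called, so no request is sent.
print(build_explore_url(2))
```

Crawling then amounts to calling `fetch_page` with increasing page numbers until a page comes back empty, with a delay between requests.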

SLIDE 17

Crawling

  • Data has to be extracted from the websites
  • 2 spider bots (crawlers) built with Scrapy
SLIDE 18

Crawling w/ Scrapy

  • Python framework for extracting data
  • Write spiders (crawling bots)
  • Parse data and extract it with XPath
  • Export data (schema mapping)
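The parse-and-extract step can be illustrated without Scrapy. The sketch below is not the group's spider: it uses the standard library's `xml.etree.ElementTree`, which supports only a subset of XPath, on a hypothetical, well-formed page. The markup, class names, and the `parse_project` helper are all assumptions made for illustration.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical project page; real pages differ per site and are rarely valid XML.
PAGE = """
<html><body>
  <h1 class="title">Solar Pen</h1>
  <span class="author">Jane Doe</span>
  <ul class="perks">
    <li data-amount="10">Thank-you card</li>
    <li data-amount="50">One pen</li>
  </ul>
</body></html>
"""


def parse_project(page: str) -> dict:
    """Extract title, author and perks with XPath-style queries."""
    root = ET.fromstring(page.strip())
    perks = [
        {"amount": int(li.get("data-amount")), "description": li.text.strip()}
        for li in root.findall(".//ul[@class='perks']/li")
    ]
    return {
        "title": root.findtext(".//h1[@class='title']"),
        "authors": [root.findtext(".//span[@class='author']")],
        "perks": perks,
    }


item = parse_project(PAGE)
print(json.dumps(item, indent=2))
```

In a Scrapy spider the same queries would typically run against the downloaded response inside the spider's parse callback, and the resulting dictionary would be yielded as the exported item.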
SLIDE 20
  • Initialize the spider
SLIDE 21
  • Download and extract data
  • Export the item
SLIDE 22
  • Export the item
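Exporting items from two crawlers into one shared schema, which the slides call schema mapping, might look like the sketch below. The source names, the per-source field names, and the helpers `map_item` and `export_items` are hypothetical, since the slides do not list the actual fields.

```python
import io
import json

# Hypothetical per-source field names mapped to the shared schema.
FIELD_MAPS = {
    "source_a": {"name": "title", "creator": "author", "pledged": "collected"},
    "source_b": {"title": "title", "owner_name": "author", "collected_funds": "collected"},
}


def map_item(source: str, raw: dict) -> dict:
    """Rename source-specific fields to the shared schema and tag the origin."""
    mapping = FIELD_MAPS[source]
    item = {target: raw[field] for field, target in mapping.items() if field in raw}
    item["source"] = source
    return item


def export_items(items, stream) -> None:
    """Write one JSON object per line (JSON Lines)."""
    for item in items:
        stream.write(json.dumps(item, sort_keys=True) + "\n")


raw_items = [
    ("source_a", {"name": "Solar Pen", "creator": "Jane Doe", "pledged": 1200}),
    ("source_b", {"title": "Pixel Quest", "owner_name": "Sam Roe", "collected_funds": 800}),
]
mapped = [map_item(source, raw) for source, raw in raw_items]

buffer = io.StringIO()
export_items(mapped, buffer)
print(buffer.getvalue(), end="")
```

After mapping, every item carries the same keys regardless of where it was crawled, so later steps do not need to know the source.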
SLIDE 23

Data schema

  • Author
  • Project
  • Perks / Packages

SLIDE 24

Data schema

Project Author Project Summary Perks / Packages

SLIDE 25

Data schema

  • Project / Author
  • Project / Perks
  • Recommendation: Project / Related_Project
  • ER for multiple authors / perks / projects…
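The relations listed on this slide can be sketched as plain Python data classes. This is an illustration under assumptions, not the group's actual schema: every class, field, and helper name below is hypothetical, and the sample records are made up.

```python
from dataclasses import dataclass, field


@dataclass
class Author:
    author_id: int
    name: str


@dataclass
class Perk:
    perk_id: int
    project_id: int  # Project / Perks: each perk belongs to one project
    amount: float
    description: str


@dataclass
class Project:
    project_id: int
    title: str
    author_ids: list = field(default_factory=list)   # Project / Author
    related_ids: list = field(default_factory=list)  # Project / Related_Project


authors = {1: Author(1, "Jane Doe"), 2: Author(2, "Sam Roe")}
projects = {
    10: Project(10, "Solar Pen", author_ids=[1, 2], related_ids=[11]),
    11: Project(11, "Pixel Quest", author_ids=[2]),
}
perks = [Perk(100, 10, 10.0, "Thank-you card"), Perk(101, 10, 50.0, "One pen")]


def perks_of(project_id: int) -> list:
    """All perks attached to one project."""
    return [p for p in perks if p.project_id == project_id]


def projects_of(author_id: int) -> list:
    """Titles of all projects an author takes part in."""
    return [p.title for p in projects.values() if author_id in p.author_ids]


print(projects_of(2))
print(len(perks_of(10)))
```

Keeping authors and related projects as ID lists lets one project have several authors and one author have several projects, which matches the "multiple authors / perks / projects" bullet.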
SLIDE 26

Results

  • Our working set:
  • 36,000 projects
  • 65,000 authors
  • Over 230,000 perks
SLIDE 27

Data fusion

And other data analysis

SLIDE 28

Recommendation

  • Use buzzwords from the project description
  • Use n-grams (word combinations)
  • Similarity measures using pairwise metrics
  • Linear kernels
  • Can also be used for data fusion
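The recommendation idea above (n-grams from descriptions, compared pairwise with a linear kernel) can be sketched in plain Python. A linear kernel is simply the dot product of two feature vectors. The slides may refer to a library routine for this; the version below is a stand-in written from scratch, and the project names, descriptions, and function names are hypothetical.

```python
from collections import Counter


def ngrams(text: str, n_max: int = 2) -> Counter:
    """Count word n-grams of length 1..n_max in a lower-cased description."""
    words = text.lower().split()
    counts = Counter()
    for n in range(1, n_max + 1):
        for i in range(len(words) - n + 1):
            counts[" ".join(words[i:i + n])] += 1
    return counts


def linear_kernel(a: Counter, b: Counter) -> int:
    """Linear kernel: dot product of the two sparse count vectors."""
    return sum(count * b[term] for term, count in a.items())


def recommend(target: str, descriptions: dict, top: int = 1) -> list:
    """Rank the other projects by kernel value against the target project."""
    target_vec = ngrams(descriptions[target])
    scores = {
        name: linear_kernel(target_vec, ngrams(text))
        for name, text in descriptions.items()
        if name != target
    }
    return sorted(scores, key=scores.get, reverse=True)[:top]


descriptions = {
    "Solar Pen": "a solar powered smart pen",
    "Smart Notebook": "a smart pen and notebook bundle",
    "Pixel Quest": "retro pixel art video game",
}
print(recommend("Solar Pen", descriptions))
```

Raw counts favor long descriptions and common words, so a real pipeline would likely weight or normalize the vectors (for example with TF-IDF) before taking the dot product.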
SLIDE 29

Success Prediction

  • How much money has been collected
  • How much time
  • The average pledge
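The three inputs named on this slide could be computed per project as in the sketch below. It assumes that "how much time" means the number of days since the campaign started, which the slide does not state. The function name, argument names, and sample values are hypothetical.

```python
from datetime import date


def prediction_features(pledges, start: date, today: date) -> dict:
    """Derive the three slide features from a list of pledge amounts."""
    collected = sum(pledges)
    return {
        "collected": collected,                       # money collected so far
        "days_elapsed": (today - start).days,         # assumed meaning of "time"
        "average_pledge": collected / len(pledges) if pledges else 0.0,
    }


features = prediction_features(
    pledges=[10.0, 50.0, 40.0],
    start=date(2015, 1, 1),
    today=date(2015, 1, 11),
)
print(features)
```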
SLIDE 30

Success Prediction

  • Random Forests: RESULT 1
  • Logistic regression: RESULT 2
  • COMPARISON RESULT

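To make one of the two named models concrete, here is a minimal logistic regression trained by batch gradient descent in plain Python. It is a toy sketch on made-up data with a single hypothetical feature, not the group's model or results, and it does not cover random forests.

```python
import math


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(xs, ys, lr=0.5, epochs=500):
    """Fit weight w and bias b by batch gradient descent on the log-loss."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        errors = [sigmoid(w * x + b) - y for x, y in zip(xs, ys)]
        w -= lr * sum(e * x for e, x in zip(errors, xs)) / n
        b -= lr * sum(errors) / n
    return w, b


# Hypothetical toy data: one feature scaled to [0, 1]; label 1 means "successful".
raw = [0.1, 0.2, 0.3, 0.7, 0.8, 0.9]
labels = [0, 0, 0, 1, 1, 1]
mean = sum(raw) / len(raw)
centered = [x - mean for x in raw]  # centering the feature helps gradient descent

w, b = train_logistic(centered, labels)


def predict(value: float) -> int:
    """Classify one feature value as successful (1) or not (0)."""
    return int(sigmoid(w * (value - mean) + b) >= 0.5)


print([predict(x) for x in raw])
```

With real data, the features would be the per-project values from the previous slide, and the model would be evaluated on projects held out from training.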
SLIDE 31

Data Portal

SLIDE 32

Goal

Browse successful projects

SLIDE 33

Database

PostgreSQL

SLIDE 34

Database

PostgreSQL

Web Interface: Django

SLIDE 35

Database

PostgreSQL

Web Interface: Django (Project List, Project Info)

SLIDE 36

Web Interface

  • Project list
  • Filter
  • Access project page
  • Download the datasets
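The filter and download features can be illustrated with a small stand-in. According to the slides the portal itself runs on Django and PostgreSQL; the sketch below uses an in-memory list and the standard `csv` module instead, and the column names, sample rows, and function names are hypothetical.

```python
import csv
import io

# Made-up rows standing in for the project table.
projects = [
    {"title": "Solar Pen", "category": "product", "successful": True},
    {"title": "Pixel Quest", "category": "video game", "successful": False},
    {"title": "Retro Run", "category": "video game", "successful": True},
]


def filter_projects(rows, category=None, successful=None):
    """Keep the rows that match every filter that is not None."""
    result = []
    for row in rows:
        if category is not None and row["category"] != category:
            continue
        if successful is not None and row["successful"] != successful:
            continue
        result.append(row)
    return result


def to_csv(rows) -> str:
    """Serialize rows to CSV text, as a dataset download might."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=["title", "category", "successful"])
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


selection = filter_projects(projects, category="video game", successful=True)
print(to_csv(selection), end="")
```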
SLIDE 37

Demo
