Use of graphs and taxonomic classifications to analyze content relationships among courseware


slide-1
SLIDE 1

Use of graphs and taxonomic classifications to analyze content relationships among courseware

Márcio de Carvalho Saraiva and Claudia Bauzer Medeiros

Institute of Computing UNICAMP

slide-2
SLIDE 2

Background and Motivation

Slides

Videos

slide-3
SLIDE 3

Background and Motivation

slide-4
SLIDE 4

Background and Motivation

More than 1600 items about "databases"

Changuel et al., 2015

slide-5
SLIDE 5

slide-6
SLIDE 6

Background and Motivation

It should be easy to understand how different materials are related.

Ouyang and Zhu, 2007

Relationships:

  • Authorship
  • Date
  • Location
  • Visual
  • Topics
  • etc.

slide-7
SLIDE 7

Related Work

Educational Data Mining

(Pereira, 2014)

Recognition of data relationships

(Sathiyamurthy et al. 2012)

Analysis of relationships using graph databases

(Cavoto et al. 2015)

Integration of multimedia data

(Santanchè et al. 2014)

Objects metadata Architecture with hierarchies

  • Analysis on a single level
  • not related to education
  • one kind of data
  • semantic annotations
  • training sets

slide-8
SLIDE 8

Goal

Allow the integration of different types of educational material, highlighting relationships among content.

slide-9
SLIDE 9

Proposal

CIMAL: Courseware Integration under Multiple relations to Assist Learning

Student: I'm having trouble with "Big Data" in discipline "X" of teacher "Y". What other material could help me understand this issue?

[Diagram labels: Sources 1 to N; CIMAL; Student]

slide-10
SLIDE 10

Proposal

  • Step A - Extraction of elements of interest
  • Step B - Intermediate Representation Instantiation
  • Step C - Intermediate Representation Analysis
  • Step D - Courseware access

[Architecture diagram; labels: input courseware; Extractor (DDEx; Java + Youtube API); elements of interest]

slide-11
SLIDE 11

Proposal - Step A - Extraction of elements of interest

  • Classification Algorithms (slides)
  • Introduction to Databases (video)

slide-12
SLIDE 12

Proposal - Step A - Extraction of elements of interest

Commented slides, highlighted concepts, slide titles, descriptions from figures and tables, ...

slide-13
SLIDE 13

Proposal - Step A - Extraction of elements of interest

Data Science, Data Mining, Classification

slide-14
SLIDE 14

Proposal - Step A - Extraction of elements of interest

0:00-0:30  “...Databases are important...”
0:31-1:00  “...everybody needs to know SQL...”
1:01-1:30  “...the DBMS is a computer software application...”
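The time-stamped excerpts above suggest that video text is handled in fixed-length segments. A minimal sketch of that idea follows, in Python for brevity (the deck names Java + Youtube API for this step). It assumes the captions have already been fetched as `(start_second, text)` pairs. The function name `window_captions`, the 30-second window and the sample data are illustrative assumptions, not part of the original system.

```python
from collections import defaultdict


def window_captions(captions, window_seconds=30):
    """Group (start_second, text) captions into fixed-length time windows."""
    windows = defaultdict(list)
    for start, text in captions:
        # Integer division assigns each caption to the window containing its start time.
        windows[start // window_seconds].append(text)
    return [
        (index * window_seconds, (index + 1) * window_seconds, " ".join(texts))
        for index, texts in sorted(windows.items())
    ]


# Hypothetical caption data modelled on the excerpts in the slide.
captions = [
    (5, "Databases are important"),
    (40, "everybody needs to know SQL"),
    (70, "the DBMS is a computer software application"),
]
segments = window_captions(captions)
for begin, end, text in segments:
    print(f"{begin:>3}s-{end:>3}s  {text}")
```

Each segment can then be treated as one text element of interest for the later steps.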

slide-15
SLIDE 15

Proposal - Step B - Intermediate Representation Instantiation

  • Step A - Extraction of elements of interest
  • Step B - Intermediate Representation Instantiation
  • Step C - Intermediate Representation Analysis
  • Step D - Courseware access

[Architecture diagram; labels: input courseware; Extractor (Metadata and Text Extractor); elements of interest; Intermediate Graph Representation Builder (Shadows as graphs Builder); Graph-based Representation (Shadows as graphs)]

slide-16
SLIDE 16

Proposal - Step B - Intermediate Representation Instantiation

Mota and Medeiros, 2013

Courseware: Introduction to Databases (video)

  • Author: Prof. Saraiva
  • Discipline: Advanced Databases
  • Text: Lorem ipsum dolor sit amet, consectetur adipiscing elit...
  • Date: 10/11/2015
  • Set of relevant concepts: SQL, Databases, DBMS
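The example above can be read as a small graph rooted at the courseware item, with one outgoing edge per extracted element. The sketch below shows one possible in-memory form of such a graph. The class `ShadowGraph`, the function `build_shadow` and the relation names are assumptions made for illustration. The deck itself mentions a graph database (Neo4J) but does not show its data model.

```python
from dataclasses import dataclass, field


@dataclass
class ShadowGraph:
    """Tiny labelled graph: node id -> label, plus (source, relation, target) edges."""
    nodes: dict = field(default_factory=dict)
    edges: list = field(default_factory=list)

    def add_node(self, node_id, label):
        self.nodes[node_id] = label
        return node_id

    def add_edge(self, source, relation, target):
        self.edges.append((source, relation, target))


def build_shadow(title, author, discipline, text, date, concepts):
    """Build a graph rooted at the courseware item (relation names are hypothetical)."""
    graph = ShadowGraph()
    root = graph.add_node("courseware", title)
    single_valued = [("author", author), ("discipline", discipline),
                     ("text", text), ("date", date)]
    for relation, value in single_valued:
        graph.add_edge(root, relation, graph.add_node(relation, value))
    for position, concept in enumerate(concepts):
        node_id = graph.add_node(f"concept:{position}", concept)
        graph.add_edge(root, "concept", node_id)
    return graph


shadow = build_shadow(
    title="Introduction to Databases (video)",
    author="Prof. Saraiva",
    discipline="Advanced Databases",
    text="Lorem ipsum dolor sit amet, consectetur adipiscing elit...",
    date="10/11/2015",
    concepts=["SQL", "Databases", "DBMS"],
)
for source, relation, target in shadow.edges:
    print(f"{shadow.nodes[source]} -[{relation}]-> {shadow.nodes[target]}")
```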

slide-17
SLIDE 17

Proposal - Step C - Intermediate Representation Analysis

  • Step A - Extraction of elements of interest
  • Step B - Intermediate Representation Instantiation
  • Step C - Intermediate Representation Analysis
  • Step D - Courseware access

[Architecture diagram; labels: input courseware; Extractor (Metadata and Text Extractor); elements of interest; Intermediate Graph Representation Builder; Graph-based Representation (Shadows as graphs); Classifier; Combiner; Taxonomy; Topics; external sources; Enriched Taxonomy; Classification of Representations (Classification of Shadows); Relationships Analyzer; Information about Relations; Java + Lucene APIs; Graph Database (Neo4J)]

slide-18
SLIDE 18

Proposal - Step C - Intermediate Representation Analysis

The ACM Computing Classification System (CCS)

  • A: General and reference
  • B: Hardware
  • C: Theory of computation
  • D: Information systems
      • 1: Information retrieval
      • 2: Data management systems
          • 1: Query languages
          • 2: Middleware for databases
          • 3: Information integration
      • 3: World Wide Web

slide-19
SLIDE 19

Proposal - Step C - Intermediate Representation Analysis

The ACM Computing Classification System (CCS)

[Taxonomy figure, as on slide 18]

slide-20
SLIDE 20

Proposal - Step C - Intermediate Representation Analysis

[Architecture diagram, as on slide 17]

slide-21
SLIDE 21

Proposal - Step C - Intermediate Representation Analysis

The ACM Computing Classification System (CCS)

[Taxonomy figure, as on slide 18]

Introduction to Databases (video)

  • Author: Prof. Saraiva
  • Discipline: Advanced Databases
  • Text: Lorem ipsum dolor sit amet, consectetur adipiscing elit...
  • Date: 10/11/2015
  • Set of relevant concepts: SQL, Database, DBMS...
  • Topics: ???

slide-22
SLIDE 22

Proposal - Step C - Intermediate Representation Analysis

Introduction to Databases (video)

N Wikipedia pages

slide-23
SLIDE 23

Proposal - Step C - Intermediate Representation Analysis

Introduction to Databases (video)

ESA (Gabrilovich and Markovitch, 2007; Apache Lucene, 2014):

  • 80% SQL
  • 20% Depth-first search
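Explicit Semantic Analysis (ESA) scores a text against a set of Wikipedia concepts through a TF-IDF weighted inverted index. The toy sketch below illustrates only that general idea. It does not use Lucene, the two "articles" are hypothetical one-line stand-ins for Wikipedia pages, and normalizing the scores into shares that sum to 1 is an assumption made here to mirror the percentages on the slide.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lower-case the text and keep alphabetic words only."""
    return re.findall(r"[a-z]+", text.lower())


def build_index(concept_articles):
    """Map each word to {concept: tf-idf weight}, i.e. an ESA-style inverted index."""
    counts_by_concept = {c: Counter(tokenize(t)) for c, t in concept_articles.items()}
    n_docs = len(counts_by_concept)
    doc_freq = Counter(w for counts in counts_by_concept.values() for w in counts)
    index = {}
    for concept, counts in counts_by_concept.items():
        for word, tf in counts.items():
            idf = math.log(n_docs / doc_freq[word])
            if idf > 0:  # words present in every article carry no signal
                index.setdefault(word, {})[concept] = tf * idf
    return index


def esa_scores(text, index):
    """Return concept -> share of the total relatedness (shares sum to 1)."""
    scores = Counter()
    for word, tf in Counter(tokenize(text)).items():
        for concept, weight in index.get(word, {}).items():
            scores[concept] += tf * weight
    total = sum(scores.values())
    return {c: s / total for c, s in scores.items()} if total else {}


# Hypothetical stand-ins for Wikipedia articles.
articles = {
    "SQL": "sql is a query language for relational databases and tables",
    "Depth-first search": "depth first search is an algorithm for traversing a tree or graph",
}
index = build_index(articles)
scores = esa_scores("everybody needs to know SQL to query databases, not only tree search", index)
best = max(scores, key=scores.get)
print(best, scores)
```

A real ESA setup would index a large Wikipedia dump, so the concept vector would have many more entries than the two shown here.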

slide-24
SLIDE 24

Proposal - Step C - Intermediate Representation Analysis

Courseware

ESA: 80% SQL

Query languages

slide-25
SLIDE 25

Proposal - Step C - Intermediate Representation Analysis

Introduction to Databases (video)

  • Author: Prof. Saraiva
  • Discipline: Advanced Databases
  • Text: Lorem ipsum dolor sit amet, consectetur adipiscing elit...
  • Date: 10/11/2015
  • Set of relevant concepts: SQL, Database, DBMS...
  • Topics: Information Systems, Data management systems, Query languages
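The topic chain shown for the example video corresponds to a root-to-node path in the CCS fragment from the earlier slides. One way to obtain such a path is a recursive search over a nested dictionary, sketched below. The `TAXONOMY` dictionary holds only the categories visible in the deck. The `CONCEPT_TO_TOPIC` mapping from an ESA concept to a taxonomy node is a hypothetical lookup table, since the slides do not say how that mapping is computed.

```python
# Fragment of the ACM CCS limited to the categories named in the slides.
TAXONOMY = {
    "General and reference": {},
    "Hardware": {},
    "Theory of computation": {},
    "Information systems": {
        "Information retrieval": {},
        "Data management systems": {
            "Query languages": {},
            "Middleware for databases": {},
            "Information integration": {},
        },
        "World Wide Web": {},
    },
}

# Hypothetical mapping from an ESA concept to a taxonomy node.
CONCEPT_TO_TOPIC = {"SQL": "Query languages"}


def find_path(tree, target, prefix=()):
    """Return the list of category names from the root down to target, or None."""
    for name, children in tree.items():
        path = prefix + (name,)
        if name == target:
            return list(path)
        found = find_path(children, target, path)
        if found:
            return found
    return None


topics = find_path(TAXONOMY, CONCEPT_TO_TOPIC["SQL"])
print(" / ".join(topics))
```

Keeping the whole path, rather than the leaf alone, is what allows courseware to be compared at several levels of the taxonomy.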

slide-26
SLIDE 26

Proposal - Step C - Intermediate Representation Analysis

[Architecture diagram, as on slide 17]

slide-27
SLIDE 27

Proposal - Step C - Intermediate Representation Analysis

Introduction to Databases (video)

Information Systems

Data management systems

Query languages

Classification Algorithms (slides)

Information Integration

slide-28
SLIDE 28

Proposal - Step C - Intermediate Representation Analysis

Introduction to Databases (video)

Information Systems

Data management systems

Query languages

Classification Algorithms (slides)

Information Integration

slide-29
SLIDE 29

Proposal - Step C - Intermediate Representation Analysis

Introduction to Databases (video)

Information Systems

Data management systems

Query languages

Classification Algorithms (slides)

Information Integration

slide-30
SLIDE 30

Proposal - Step C - Intermediate Representation Analysis

Introduction to Databases (video)

Information Systems

Data management systems

Query languages

Databases I (video)

slide-31
SLIDE 31

Proposal - Step C - Intermediate Representation Analysis

Introduction to Databases (video)

Information Systems

Data management systems

Query languages

Databases I (video)
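Two courseware items can be related through the taxonomy categories they share. The sketch below compares root-to-leaf topic paths and reports the common leading categories for every pair. The path for "Introduction to Databases (video)" follows the slides. The paths assigned to the other two items are assumptions for illustration, and `shared_topics` and `relate` are hypothetical helper names.

```python
from itertools import combinations


def shared_topics(path_a, path_b):
    """Return the common leading part of two root-to-leaf taxonomy paths."""
    common = []
    for a, b in zip(path_a, path_b):
        if a != b:
            break
        common.append(a)
    return common


def relate(classification):
    """List (item_a, item_b, shared categories) for every pair that shares a category."""
    relations = []
    for (a, path_a), (b, path_b) in combinations(classification.items(), 2):
        common = shared_topics(path_a, path_b)
        if common:
            relations.append((a, b, common))
    return relations


classification = {
    "Introduction to Databases (video)":
        ["Information systems", "Data management systems", "Query languages"],
    # Assumed classifications, not stated in the slides.
    "Databases I (video)":
        ["Information systems", "Data management systems", "Query languages"],
    "Classification Algorithms (slides)":
        ["Information systems", "Data management systems", "Information integration"],
}
relations = relate(classification)
for a, b, common in relations:
    print(f"{a} <-> {b}: {len(common)} shared level(s)")
```

The number of shared levels gives a simple measure of how specific the relationship is: the two videos agree down to the leaf, while the slides agree only on the two upper levels.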

slide-32
SLIDE 32

Proposal - Step C - Intermediate Representation Analysis

  • Introduction to Databases
  • Classification Algorithms
  • Databases I
  • Data Mining

slide-33
SLIDE 33

Proposal - Step C - Intermediate Representation Analysis

  • Introduction to Databases
  • Classification Algorithms
  • Databases I
  • Data Mining

Clique
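A clique is a set of nodes in which every pair is directly connected. The check below is a minimal sketch on an undirected graph. The edge list is hypothetical, because the slide does not state which courseware items are linked.

```python
from itertools import combinations


def is_clique(nodes, edges):
    """Return True when every pair of the given nodes is joined by an edge."""
    edge_set = {frozenset(edge) for edge in edges}
    return all(frozenset(pair) in edge_set for pair in combinations(nodes, 2))


# Hypothetical undirected edges between courseware items.
edges = [
    ("Classification Algorithms", "Databases I"),
    ("Classification Algorithms", "Data Mining"),
    ("Databases I", "Data Mining"),
    ("Introduction to Databases", "Databases I"),
]
triangle = is_clique(["Classification Algorithms", "Databases I", "Data Mining"], edges)
everything = is_clique(
    ["Introduction to Databases", "Classification Algorithms", "Databases I", "Data Mining"],
    edges,
)
print(triangle, everything)
```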

slide-34
SLIDE 34

Proposal - Step C - Intermediate Representation Analysis

  • Introduction to Databases
  • Classification Algorithms
  • Databases I
  • Data Mining

3 2 2 1

Shortest Path Graph to “Data Mining”
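For an unweighted graph, a shortest path between two courseware items can be found with breadth-first search. The sketch below uses a hypothetical adjacency list. It does not try to reproduce the numbers shown on the slide, whose exact meaning is not recoverable from the text.

```python
from collections import deque


def shortest_path(adjacency, start, goal):
    """Breadth-first search: return one shortest path as a list of nodes, or None."""
    previous = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            path = []
            while node is not None:  # walk back to the start
                path.append(node)
                node = previous[node]
            return path[::-1]
        for neighbour in adjacency.get(node, ()):
            if neighbour not in previous:
                previous[neighbour] = node
                queue.append(neighbour)
    return None


# Hypothetical undirected graph given as an adjacency list.
adjacency = {
    "Introduction to Databases": ["Databases I"],
    "Databases I": ["Introduction to Databases", "Classification Algorithms"],
    "Classification Algorithms": ["Databases I", "Data Mining"],
    "Data Mining": ["Classification Algorithms"],
}
path = shortest_path(adjacency, "Introduction to Databases", "Data Mining")
print(" -> ".join(path))
```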

slide-35
SLIDE 35

Proposal - Step C - Intermediate Representation Analysis

  • Introduction to Databases
  • Classification Algorithms
  • Databases I
  • Data Mining

Centrality
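The slide does not say which centrality measure is intended. As one simple possibility, the sketch below computes degree centrality, the fraction of other nodes each node is directly linked to, on a hypothetical adjacency list.

```python
def degree_centrality(adjacency):
    """Return node -> (number of distinct neighbours) / (number of other nodes)."""
    n = len(adjacency)
    if n < 2:
        return {node: 0.0 for node in adjacency}
    return {
        node: len(set(neighbours)) / (n - 1)
        for node, neighbours in adjacency.items()
    }


# Hypothetical undirected graph given as an adjacency list.
adjacency = {
    "Introduction to Databases": ["Databases I"],
    "Databases I": ["Introduction to Databases", "Classification Algorithms", "Data Mining"],
    "Classification Algorithms": ["Databases I", "Data Mining"],
    "Data Mining": ["Databases I", "Classification Algorithms"],
}
centrality = degree_centrality(adjacency)
most_central = max(centrality, key=centrality.get)
print(most_central, centrality[most_central])
```

In this hypothetical graph the most central item is the one a student would reach from the largest number of other materials.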

slide-36
SLIDE 36

Proposal

  • Step A - Extraction of elements of interest
  • Step B - Intermediate Representation Instantiation
  • Step C - Intermediate Representation Analysis
  • Step D - Courseware access

[Architecture diagram; labels: input courseware; Extractor (DDEx); elements of interest; Graph builder (Intermediate Graph Representation Builder, Shadows); Graph-based Representation; Classifier; Combiner; Taxonomy; Topics; external sources; Enriched Taxonomy; Classification of Representations; Relationships Analyzer; Information about Relations; Java + Lucene APIs; Graph Database (Neo4J); Graph Interface; query; output: graph-based representations, information about relations and classification; Java + 2graph API]

slide-37
SLIDE 37

Preliminary conclusions

Expected contributions:

  • A framework for integrating different courseware, highlighting relationships among topics;
      ○ Tags and training sets are not necessary;
  • Analysis of multilevel relationships through graphs and taxonomy;
  • Adaptation of the ESA algorithm to classify courseware topics using intrinsic features

slide-38
SLIDE 38

References

  • Changuel, S., Labroche, N., and Bouchon-Meunier, B. (2015). Resources sequencing using automatic prerequisite–outcome annotation. ACM Trans. Intell. Syst. Technol., 6(1):6:1–6:30.
  • Gabrilovich, E. and Markovitch, S. (2007). Computing semantic relatedness using Wikipedia-based explicit semantic analysis. In Proceedings of the 20th International Joint Conference on Artificial Intelligence, IJCAI’07, pages 1606–1611, San Francisco, CA, USA. Morgan Kaufmann Publishers Inc.
  • Mishra, S., Gorai, A., Oberoi, T., and Ghosh, H. (2010). Efficient Visualization of Content and Contextual Information of an Online Multimedia Digital Library for Effective Browsing. 2010 IEEE/WIC/ACM International Conference on Web Intelligence and Intelligent Agent Technology, pages 257–260.
  • Mota, M. S. and Medeiros, C. B. (2013). Introducing shadows: Flexible document representation and annotation on the web. ICDE Workshops, pages 13–18.
  • Ouyang, Y. and Zhu, M. (2007). eLORM: Learning object relationship mining based repository. Proceedings - The 9th IEEE International Conference on E-Commerce Technology; The 4th IEEE International Conference on Enterprise Computing, E-Commerce and E-Services, CEC/EEE 2007, pages 691–698.

slide-39
SLIDE 39

References

  • Pereira, B. (2014). Entity Linking with Multiple Knowledge Bases: An Ontology Modularization Approach. In The Semantic Web - ISWC 2014, pages 513–520. Springer International Publishing.
  • Santanchè, A., Longo, J. S. C., Jomier, G., Zam, M., and Medeiros, C. B. (2014). Multifocus research and geospatial data - anthropocentric concerns. JIDM - Journal of Information and Data Management, 5(2):146–160.
  • Sathiyamurthy, K., Geetha, T. V., and Senthilvelan, M. (2012). An approach towards dynamic assembling of learning objects. In Proceedings of the International Conference on Advances in Computing, Communications and Informatics, ICACCI ’12, pages 1193–1198, New York, NY, USA. ACM.
  • Shirakawa, M., Nakayama, K., Hara, T., and Nishio, S. (2015). Wikipedia-based semantic similarity measurements for noisy short texts using extended naive bayes. IEEE Trans. Emerging Topics Comput., 3(2):205–219.
  • Tong, Y., Cao, C. C., Zhang, C. J., Li, Y., and Chen, L. (2014). CrowdCleaner: Data cleaning for multi-version data on the web via crowdsourcing. 2014 IEEE 30th International Conference on Data Engineering, pages 1182–1185.

slide-40
SLIDE 40

Acknowledgements

  • Laboratory of Information Systems - Unicamp
  • Work partially financed by CAPES, FAPESP/Cepid in Computational Engineering and Sciences (2013/08293-7), FAPESP-PRONEX (eScience project), INCT in Web Science (CNPq 557.128/2009-9), and individual grants from CAPES and CNPq.

slide-41
SLIDE 41

Thanks!

Use of graphs and taxonomic classifications to analyze content relationships among courseware

marcio.saraiva@ic.unicamp.br

Institute of Computing UNICAMP