SLIDE 1

EER→MLN: EER Approach For Modeling, Mapping, And Analyzing Complex Data Using Multilayer Networks (MLNs)

Kanthi Komar¹, Abhishek Santra², Sanjukta Bhowmick³ and Sharma Chakravarthy⁴

¹,²,⁴ Information Technology Laboratory, CSE Department, University of Texas at Arlington, Arlington, Texas, USA

³ CSE Department, University of North Texas, Denton, Texas, USA

Email: ¹ kanthisannappa.komar@mavs.uta.edu, ² abhishek.santra@mavs.uta.edu, ³ sanjukta.bhowmick@unt.edu, ⁴ sharmac@cse.uta.edu

SLIDE 2

Complex Data Analysis: Application Categories

ER 2020 3-Nov-20

➢ Same Entities, Multiple Relationships

▪ Data: Actors, Movies, Genres, Ratings

▪ Question: Which highly rated actor groups work in similar genres but have not co-acted in any movie?

➢ Different Entities, Multiple Relationships

▪ Data: Publications, Conferences, Years, Author Collaborations

▪ Question: For the most popular collaborators in each conference, which 3-year period(s) are the most active?

➢ Same & Different Entities, Multiple Relationships

▪ Data: Flight Routes, Author Residence, Author Collaborations, Author Friendship

▪ Question: Which city is best for holding a conference so that author attendance is maximized?

SLIDE 3

Big Complex Data Analytics Flow Chart

Flow: Application Requirements → Data Model → Analysis → Final Results → Drill-Down Analysis

➢ Application Requirements

▪ Data Set Description

▪ Analysis Objectives

➢ Data Model

▪ Multilayer Network Model (HoMLN, HeMLN, HyMLN)

➢ Analysis

▪ Efficient divide-and-conquer based decoupling approach

➢ Final Results

➢ Drill-Down Analysis

➢ Annotations on the chart

▪ ICCS 2017, ICDM 2017, BDA '17, '18, '19, CICLing 2019

▪ Difficult, error-prone, not extensible

▪ EER → MLN Approach (ER 2020): novel 8-step algorithm, precise and unambiguous, aids in drill-down analysis

▪ Future Collaborations, Spread of Covid-19 in US

SLIDE 4

Data Model: Multilayer Networks (Overview)

➢ A multilayer network MLN(G, X) consists of:

▪ G = set of simple graphs

− G_i(V_i, E_i) represents the i-th layer

▪ X = set of bipartite graphs between layers

− X_{i,j}(V_i, V_j, L_{i,j}) for layers G_i and G_j, where L_{i,j} is the set of inter-layer edges

➢ Homogeneous MLN (HoMLN)

▪ Models interactions among the same set of entities

▪ V_i = V_j, implicit inter-layer edges

➢ Heterogeneous MLN (HeMLN)

▪ Models interactions among different sets of entities

▪ V_i ≠ V_j, explicit inter-layer edges
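The definition above can be sketched as a small data structure. This is only an illustrative sketch: the class names `Layer` and `MLN`, the `kind` method, and the toy node names are hypothetical and do not come from the slides. The rule used for the hybrid case (some layers share a node set, others do not) is an assumed reading of "Same & Different Entities".

```python
from dataclasses import dataclass, field

Edge = tuple[str, str]


@dataclass
class Layer:
    """One simple graph G_i(V_i, E_i)."""
    name: str
    nodes: set[str] = field(default_factory=set)
    edges: set[Edge] = field(default_factory=set)


@dataclass
class MLN:
    """MLN(G, X): a set of layers G plus inter-layer edge sets X."""
    layers: dict[str, Layer] = field(default_factory=dict)
    # Maps a pair of layer names (i, j) to L_{i,j}, the explicit inter-layer edges.
    inter: dict[tuple[str, str], set[Edge]] = field(default_factory=dict)

    def kind(self) -> str:
        """Classify by comparing the node sets of the layers (assumed rule)."""
        distinct = {frozenset(layer.nodes) for layer in self.layers.values()}
        if len(distinct) == 1:
            return "HoMLN"   # every layer has the same V_i
        if len(distinct) == len(self.layers):
            return "HeMLN"   # every layer has a different V_i
        return "HyMLN"       # mixture of shared and different node sets


# Toy homogeneous network: two relationships over the same actors.
actors = {"a1", "a2", "a3"}
imdb = MLN(layers={
    "Acts-with": Layer("Acts-with", set(actors), {("a1", "a2")}),
    "Similar-Genre": Layer("Similar-Genre", set(actors), {("a1", "a3")}),
})

# Toy heterogeneous network: authors and papers joined by explicit edges.
dblp = MLN(
    layers={
        "Author": Layer("Author", {"u1", "u2"}, {("u1", "u2")}),
        "Paper": Layer("Paper", {"p1"}, set()),
    },
    inter={("Author", "Paper"): {("u1", "p1"), ("u2", "p1")}},
)

print(imdb.kind(), dblp.kind())
```

In the homogeneous case the `inter` mapping stays empty because the inter-layer edges are implicit: each node corresponds to itself in every layer.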

SLIDE 5

EER Model → MLN Model: The 8 Step Algorithm


Research Paper Publication Data Set (DBLP) Modeling

Diagram labels: Recursive Binary Relationship, Non-Recursive Binary Relationship

▪ Relationship name = intra-/inter-layer edge label

▪ Key attribute = node label

▪ Min/max cardinality = degree information

Heterogeneous MLN (HeMLN)

Remaining Entity/Relationship Attributes stored in Relations for Drill-Down Analysis
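The mapping rules on this slide can be illustrated with a short sketch. It is not the 8-step algorithm itself, which the slides do not spell out. It assumes that a recursive binary relationship (both sides are the same entity type) yields intra-layer edges and that a non-recursive one yields inter-layer edges. The function `eer_to_mln`, the input layout, and the toy rows are hypothetical. Cardinality information is not modeled here.

```python
from collections import defaultdict

# Hypothetical toy EER description using entity names from the DBLP example.
entities = {
    "Author": {"key": "Name",
               "rows": [{"Name": "A", "Institution": "X"},
                        {"Name": "B", "Institution": "Y"}]},
    "Paper": {"key": "ID",
              "rows": [{"ID": "p1", "Name": "Some title"}]},
}
# (relationship name, entity 1, entity 2, instance pairs of key values)
relationships = [
    ("Collaborates-with", "Author", "Author", [("A", "B")]),
    ("Writes", "Author", "Paper", [("A", "p1"), ("B", "p1")]),
]


def eer_to_mln(entities, relationships):
    # Key attribute = node label.
    nodes = {name: {row[spec["key"]] for row in spec["rows"]}
             for name, spec in entities.items()}
    intra = defaultdict(set)   # recursive relationships (assumed intra-layer)
    inter = defaultdict(set)   # non-recursive relationships (assumed inter-layer)
    for rel_name, e1, e2, pairs in relationships:
        target = intra if e1 == e2 else inter
        # Relationship name = edge label.
        target[(rel_name, e1, e2)].update(pairs)
    # All attributes stay available as relations for drill-down analysis.
    relations = {name: spec["rows"] for name, spec in entities.items()}
    return nodes, dict(intra), dict(inter), relations


nodes, intra, inter, relations = eer_to_mln(entities, relationships)
print(sorted(nodes["Author"]), list(intra), list(inter))
```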

SLIDE 6

EER Model → MLN Model: The 8 Step Algorithm

Research Paper Publication Data Set (DBLP) Modeling

Relations obtained as a by-product, used for drill-down analysis:

▪ Author(Name, Institution)

▪ Collaborates-with(Author1Name, Author2Name)

▪ Paper(ID, Name, PublishYearID)

▪ Same-Conference(Paper1ID, Paper2ID)

▪ Keywords(PaperID, Keyword)

▪ Review(ID, ReviewPaper, Score)

▪ Same-Score(Review1ID, Review2ID)

▪ Year(ID)

▪ Same-Range(Year1ID, Year2ID)

▪ Active-in(AuthorID, YearID)

▪ Writes(AuthorID, PaperID)
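As a hedged illustration of how such relations could support drill-down, the sketch below loads two of them into an in-memory SQLite database and looks up details for a set of authors returned by some analysis. The rows are made up, the function `drill_down` is hypothetical, and the sketch assumes that `AuthorID` in Writes holds the author's `Name`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Author (Name TEXT PRIMARY KEY, Institution TEXT)")
conn.execute("CREATE TABLE Writes (AuthorID TEXT, PaperID TEXT)")
conn.executemany("INSERT INTO Author VALUES (?, ?)",
                 [("A", "Univ X"), ("B", "Univ Y"), ("C", "Univ X")])
conn.executemany("INSERT INTO Writes VALUES (?, ?)",
                 [("A", "p1"), ("B", "p1"), ("C", "p2")])


def drill_down(conn, community):
    """Return (name, institution, paper count) for each author in a community."""
    names = sorted(community)
    placeholders = ",".join("?" for _ in names)
    query = f"""
        SELECT a.Name, a.Institution, COUNT(w.PaperID)
        FROM Author AS a
        LEFT JOIN Writes AS w ON w.AuthorID = a.Name
        WHERE a.Name IN ({placeholders})
        GROUP BY a.Name, a.Institution
        ORDER BY a.Name
    """
    return conn.execute(query, names).fetchall()


# Suppose the network analysis returned the community {"A", "B"}.
rows = drill_down(conn, {"A", "B"})
print(rows)
```

Keeping the non-key attributes in relations means the network layers only need node and edge labels, while details such as institutions are joined in after the analysis.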

SLIDE 7

EER Model → MLN Model: The 8 Step Algorithm

Actor Interaction Data Set (IMDb) Modeling

Homogeneous MLN (HoMLN)

Relations obtained as a by-product, used for drill-down analysis:

▪ Actor(Name, State, Country)

▪ Acts-with(Actor1Name, Actor2Name)

▪ Similar-Genre_TYPE(Actor1Name, Actor2Name, Type)

▪ Similar-AverageRating(Actor1Name, Actor2Name, Val_Range)

Labels shown in the diagram: TYPE, VAL_RANGE

SLIDE 8

EER Model → MLN Model: The 8 Step Algorithm

Author-City Interaction Data Set Modeling

Hybrid MLN (HyMLN)

Relations obtained as a by-product, used for drill-down analysis:

▪ Author(Name, Institution, ResidenceCODE)

▪ Friends-with(Author1Name, Author2Name)

▪ City(IATA CODE, Name)

▪ Flight-Connects_CARRIER(City1Code, City2Code, Carrier)

▪ Collaborates-with(Author1Name, Author2Name)

SLIDE 9

Analysis Method: Decoupling Approach

Divide-and-conquer approach: partial (or intermediate) results specific to the analysis function are composed systematically to fulfill the objective.

➢ Ψ (analysis function): communities, hubs, subgraphs

➢ Θ (composition function): Boolean composition (HoMLN), matching (HeMLN)

➢ Flow shown in the diagram

▪ Multilayer Network → Ψ applied to each layer → Partial Results 1, Partial Results 2, Partial Results 3

▪ Θ1 → Combined Results of Layer 1 and 2 (diagram label: Combine Layer 2 Partitions)

▪ Θ2 → FINAL RESULT (Combined Results of Layer 1, 2 and 3)
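The flow above can be sketched on a toy homogeneous network. The sketch makes two simplifying assumptions that are not taken from the slides: connected components stand in for the analysis function Ψ (the slides name communities, hubs and subgraphs), and the composition Θ keeps the vertex-set intersections of per-layer groups that contain at least two nodes. The names `psi_components` and `theta_and` and the data are hypothetical.

```python
from functools import reduce


def psi_components(nodes, edges):
    """Stand-in for Psi: connected components of a single layer."""
    adj = {v: set() for v in nodes}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    seen, comps = set(), []
    for start in sorted(nodes):
        if start in seen:
            continue
        stack, comp = [start], set()
        while stack:
            v = stack.pop()
            if v in comp:
                continue
            comp.add(v)
            stack.extend(adj[v] - comp)
        seen |= comp
        comps.append(frozenset(comp))
    return comps


def theta_and(parts_a, parts_b, min_size=2):
    """Stand-in for Theta: AND-compose two partial results by intersection."""
    return [a & b for a in parts_a for b in parts_b if len(a & b) >= min_size]


nodes = {"a", "b", "c", "d", "e"}
layers = [
    {("a", "b"), ("b", "c"), ("d", "e")},  # layer 1
    {("a", "b"), ("c", "d"), ("d", "e")},  # layer 2
    {("a", "b"), ("b", "e")},              # layer 3
]

# Each layer is analyzed once and independently (partial results 1..3) ...
partials = [psi_components(nodes, edges) for edges in layers]
# ... then composed left to right: Theta1 on layers 1 and 2, Theta2 with layer 3.
final = reduce(theta_and, partials)
print(final)
```

Because each layer is analyzed only once, the same partial results can be reused for other layer combinations without rerunning the per-layer analysis.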

SLIDE 10

Specification Mapping: Objective → MLN Expression

Mapping: Objective → MLN Expression, with analysis function Ψ and composition function Θ

➢ Objective: Which highly rated actor groups work in similar genres but have not co-acted in any movie?

▪ HoMLN: Acts-with, Similar-Genre, Similar-AverageRating

▪ MLN Expression: NOT(Acts-with) Θ Similar-Genre Θ Similar-AverageRating

▪ Ψ: Community Detection

▪ Θ: Boolean AND Composition

➢ Objective: For the most popular collaborators in each conference, which 3-year period(s) are the most active?

▪ HeMLN: Author, Year, Paper, Review

▪ MLN Expression: Paper Θ Author Θ Year

▪ Ψ: Community Detection

▪ Θ: Maximal Weighted Matching

➢ Objective: Which city is best for holding a conference so that author attendance is maximized?

▪ HyMLN: City, Au-Collaborates-with, Au-Friends-with

▪ MLN Expression: Au-Collaborates-with Θ Au-Friends-with Θ City

▪ Ψ: Community (Author), Degree Centrality (City)

▪ Θ: MLN-Searching
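To make the first expression concrete, here is a simplified sketch that applies the Boolean operators directly to edge sets. This is a deliberate simplification: the slides apply community detection per layer and then a Boolean AND composition, whereas this sketch only shows what NOT and AND mean for layers over the same actors. The actor identifiers and edges are invented, and `norm` and `complement` are hypothetical helpers.

```python
from itertools import combinations


def norm(edges):
    """Represent undirected edges as frozensets so (u, v) equals (v, u)."""
    return {frozenset(e) for e in edges}


def complement(nodes, edges):
    """NOT(layer): every node pair that is not an edge in the layer."""
    all_pairs = {frozenset(p) for p in combinations(sorted(nodes), 2)}
    return all_pairs - edges


actors = {"a1", "a2", "a3", "a4"}
acts_with = norm([("a1", "a2"), ("a3", "a4")])
similar_genre = norm([("a1", "a2"), ("a1", "a3"), ("a2", "a4")])
similar_rating = norm([("a1", "a3"), ("a2", "a4"), ("a3", "a4")])

# NOT(Acts-with) AND Similar-Genre AND Similar-AverageRating, on edges.
candidates = complement(actors, acts_with) & similar_genre & similar_rating
for pair in sorted(sorted(p) for p in candidates):
    print(pair)
```

Each surviving pair has similar genres and similar average ratings but no co-acting edge, which is the pattern the objective asks for.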

SLIDE 11

Drill-Down Analysis: Potential Actor Collaborations

Highly rated actors working in similar genres who have not co-acted in any movie

Validating fact: In 2017, there were talks of casting Johnny Depp and Tom Cruise in pivotal roles in Dark Universe, the cinematic universe of Universal Studios.

Actors/Actresses: Prominent Genres

▪ Willem Dafoe, Russell Crowe: Action, Crime

▪ Hilary Swank, Kate Winslet: Drama

▪ Tom Hanks, Reese Witherspoon, Cameron Diaz: Comedy, Romance

▪ Johnny Depp, Tom Cruise: Adventure, Action

▪ Leonardo DiCaprio, Ryan Gosling: Crime, Romance

▪ Nicolas Cage, Antonio Banderas: Action, Thriller

▪ Hugh Grant, Kate Hudson, Emma Stone: Comedy, Romance

IMDb HoMLN (built for the top 500 actors, then repopulated with co-actors)

▪ #Vertices: 9,485 (Actors)

▪ #Edges in L1: 45,581 (Acts-with)

▪ #Edges in L2: 13,945,912 (Similar-Genre)

▪ #Edges in L3: 996,527 (Similar-AverageRating)

SLIDE 12

Drill-Down Analysis: Research Activity Insights

For the most popular collaborators in each conference, the most active 3-year period(s)

DBLP HeMLN

▪ Author: 16,918 nodes, 2,483 edges

▪ Paper: 10,326 nodes, 12,044,080 edges

▪ Year: 18 nodes, 18 edges

Validating facts: most popular researchers active in different periods

▪ SIGMOD: Srikanth Kandula (15,188 citations)

▪ VLDB: Divyakant Agrawal (23,727 citations)

▪ ICDM: Shuicheng Yan (52,294 citations)

SLIDE 13

Conclusions

➢ Proposed a novel 8-step algorithm for MLN modeling

▪ Leverages EER modeling

▪ Makes the process error-free, precise and unambiguous

▪ Aids in drill-down analysis of final results

➢ Demonstrated the applicability on real-world applications

➢ Current work: the approach is being used for the analysis and visualization of the spread of Covid-19 across US counties

SLIDE 14

Questions?

Abhishek Santra

abhishek.santra@mavs.uta.edu

Sanjukta Bhowmick

sanjukta.bhowmick@unt.edu

Sharma Chakravarthy

sharmac@cse.uta.edu

For more information visit: https://itlab.uta.edu/

Kanthi Komar

kanthisannappa.komar@mavs.uta.edu

Project Funded by:

Covid-19 Analysis with MLNs