Leveraging Graphs for Better AI Alicia M. Frame, PhD Lead Data - - PowerPoint PPT Presentation



slide-1
SLIDE 1


Leveraging Graphs for Better AI

Alicia M. Frame, PhD Lead Data Scientist, neo4j alicia.frame@neo4j.com

slide-2
SLIDE 2
slide-3
SLIDE 3
slide-4
SLIDE 4
slide-5
SLIDE 5

Financial Services Drug Discovery Recommendations Cybersecurity Predictive Maintenance Customer Segmentation Churn Prediction Search/MDM

Graph Data Science Applications

slide-6
SLIDE 6
  • Current data science models ignore network structure
  • Graphs add highly predictive features to existing ML models
  • Otherwise unattainable predictions based on relationships

Novel & More Accurate Predictions with the Data You Already Have

Machine Learning Pipeline

slide-7
SLIDE 7

“The idea is that graph networks are bigger than any one machine-learning approach. Graphs bring an ability to generalize about structure that the individual neural nets don't have.”

“Where do the graphs come from that graph networks operate over?”
slide-8
SLIDE 8

Building a Graph ML Model

  • Data Sources: Aggregate Disparate Data and Cleanse (Parquet, JSON, and more…)
  • Native Graph Platform: Unify Graphs and Engineer Features
  • Machine Learning: Build Predictive Models (MLlib, and more…)

slide-9
SLIDE 9

Example: Spark & Neo4j Workflow

  • Spark Graph: Cypher 9 in Spark 3.0 to create non-persistent graphs
  • Native Graph Platform (Graph Transactions, Graph Analytics): Native Graph Algorithms, Processing, and Storage
  • Machine Learning: MLlib to Train Models

slide-10
SLIDE 10

Explore Graphs

  • Massively scalable
  • Powerful data pipelining
  • Robust ML Libraries
  • Non-persistent, non-native graphs

Build Graph Solutions

  • Persistent, dynamic graphs
  • Graph native query and algorithm performance
  • Constantly growing list of graph algorithms and embeddings

slide-11
SLIDE 11

The Steps of Graph Data Science

Steps: Query Based Knowledge Graph; Query Based Feature Engineering; Graph Algorithm Feature Engineering; Graph Embeddings; Graph Neural Networks

Axes: Enterprise Maturity; Data Science Complexity

Labels: Knowledge Graphs; Graph Feature Engineering; Graph Native Learning; Graph Persistence

slide-12
SLIDE 12

Steps Forward in Graph Data Science

Steps: Query Based Knowledge Graph; Query Based Feature Engineering; Graph Algorithm Feature Engineering; Graph Embeddings; Graph Neural Networks

Axes: Enterprise Maturity; Data Science Complexity

slide-13
SLIDE 13


Query based knowledge graphs:

Connecting the Dots at NASA

“Using Neo4j someone from our Orion project found information from the Apollo project that prevented an issue, saving well over two years of work and one million dollars of taxpayer funds.”

slide-14
SLIDE 14

Steps Forward in Graph Data Science

Steps: Query Based Knowledge Graph; Query Based Feature Engineering; Graph Algorithm Feature Engineering; Graph Embeddings; Graph Neural Networks

Axes: Enterprise Maturity; Data Science Complexity

slide-15
SLIDE 15

Query-Based Feature Engineering

Telecom-churn prediction

Telecommunication networks are easily represented as graphs

Churn prediction research has found that simple hand-engineered features are highly predictive

  • How many calls/texts has an account made?
  • How many of their contacts have churned?

slide-16
SLIDE 16


Query-Based Feature Engineering

Telecom-churn prediction

Add connected features based on graph queries to tabular data

Khan et al, 2015
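
The two questions from the churn example (how many calls an account made, how many of its contacts have churned) can be sketched as per-account features computed from an edge list. This is a minimal illustration in plain Python, not the method from Khan et al; the call records, the churned set, and the function name `connected_features` are all hypothetical.

```python
from collections import defaultdict

# Hypothetical call records as (caller, callee) pairs; not data from the talk.
calls = [
    ("a", "b"), ("a", "c"), ("b", "c"),
    ("c", "d"), ("a", "b"), ("d", "e"),
]
# Hypothetical set of accounts already known to have churned.
churned = {"c", "e"}


def connected_features(calls, churned):
    """Return {account: {"calls_made": int, "churned_contacts": int}}."""
    calls_made = defaultdict(int)
    contacts = defaultdict(set)
    for caller, callee in calls:
        calls_made[caller] += 1
        # Assumption: a call makes both parties contacts of each other.
        contacts[caller].add(callee)
        contacts[callee].add(caller)
    return {
        account: {
            "calls_made": calls_made[account],
            "churned_contacts": len(contacts[account] & churned),
        }
        for account in contacts
    }


features = connected_features(calls, churned)
```

Each row of `features` could then be joined onto an existing tabular training set by account id, which is the "add connected features to tabular data" step.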

slide-17
SLIDE 17

Knowledge Graphs: Getting Started Example with Spark

Spark Graph

  • Merge distributed data into DataFrames
  • Reshape your tables into graphs
  • Explore Cypher queries

Native Graph Platform (Graph Transactions, Graph Analytics)

  • Move to Neo4j to build expert queries
  • Persist your graph

Machine Learning

  • Bring query based graph features to ML pipeline

slide-18
SLIDE 18

Steps Forward in Graph Data Science

Steps: Query Based Knowledge Graph; Query Based Feature Engineering; Graph Algorithm Feature Engineering; Graph Embeddings; Graph Neural Networks

Axes: Enterprise Maturity; Data Science Complexity

slide-19
SLIDE 19

Feature Engineering is how we combine and process the data to create new, more meaningful features, such as clustering or connectivity metrics.

Graph Feature Engineering

Add More Descriptive Features:

  • Influence
  • Relationships
  • Communities
slide-20
SLIDE 20


Graph Feature Categories & Algorithms

Pathfinding & Search

Finds the optimal paths or evaluates route availability and quality

Centrality / Importance

Determines the importance of distinct nodes in the network

Community Detection

Detects group clustering or partition options

Heuristic Link Prediction

Estimates the likelihood of nodes forming a relationship

Similarity

Evaluates how alike nodes are

Embeddings

Learned representations of connectivity or topology
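
As one concrete instance of the Centrality / Importance category, the following is a minimal power-iteration PageRank in plain Python. It is a sketch for illustration, not the Neo4j implementation; the edge list, the damping value of 0.85, and the fixed iteration count are assumptions.

```python
def pagerank(edges, damping=0.85, iterations=50):
    """Power-iteration PageRank over directed (source, target) edges."""
    nodes = sorted({node for edge in edges for node in edge})
    out_links = {node: [] for node in nodes}
    for source, target in edges:
        out_links[source].append(target)

    count = len(nodes)
    rank = {node: 1.0 / count for node in nodes}
    for _ in range(iterations):
        # Rank held by nodes without outgoing edges is spread evenly.
        dangling = sum(rank[node] for node in nodes if not out_links[node])
        new_rank = {
            node: (1.0 - damping) / count + damping * dangling / count
            for node in nodes
        }
        for source in nodes:
            for target in out_links[source]:
                new_rank[target] += damping * rank[source] / len(out_links[source])
        rank = new_rank
    return rank


# Hypothetical directed graph: "a" and "b" both point at "c"; "c" and "d" point at each other.
edges = [("a", "c"), ("b", "c"), ("c", "d"), ("d", "c")]
scores = pagerank(edges)
```

The resulting score per node is a single numeric column, so it can be used directly as a feature in an existing model.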
slide-21
SLIDE 21
Financial Crime: Detecting Fraud

Large financial institutions already have existing pipelines to identify fraud via heuristics and models

Graph based features improve accuracy:

  • Connected components to identify disjointed graphs sharing identifiers
  • PageRank to measure influence and transaction volumes
  • Louvain to identify communities that frequently interact
  • Jaccard to measure account similarity based on relationships
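
Two of the fraud features above, connected components over shared identifiers and Jaccard similarity between accounts, can be sketched in plain Python. The account and identifier names are hypothetical, and this union-find version only illustrates the idea rather than reproducing any production pipeline.

```python
from collections import defaultdict


def connected_components(edges):
    """Union-find: map every node to a representative of its component."""
    parent = {}

    def find(node):
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]  # path halving
            node = parent[node]
        return node

    for left, right in edges:
        parent[find(left)] = find(right)
    return {node: find(node) for node in list(parent)}


def jaccard(neighbors, left, right):
    """Shared neighbours divided by all neighbours of the two nodes."""
    union = neighbors[left] | neighbors[right]
    if not union:
        return 0.0
    return len(neighbors[left] & neighbors[right]) / len(union)


# Hypothetical account-to-identifier links (phone numbers, emails).
edges = [
    ("acct1", "phone1"), ("acct2", "phone1"),
    ("acct2", "email1"), ("acct3", "email2"),
]
neighbors = defaultdict(set)
for account, identifier in edges:
    neighbors[account].add(identifier)
    neighbors[identifier].add(account)

component = connected_components(edges)
similarity = jaccard(neighbors, "acct1", "acct2")
```

Here `acct1` and `acct2` land in the same component because they share `phone1`, while `acct3` stays separate; the component id and the pairwise similarity are the kind of values that would be appended to an existing fraud model's feature set.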

slide-22
SLIDE 22

+142,000 Peer Reviewed Publications on Graph Fraud / Anomaly Detection in the last 10 years

slide-23
SLIDE 23

Graph Feature Engineering: Getting Started Example with Spark

Spark Graph

  • Merge distributed data into DataFrames
  • Reshape your tables into graphs
  • Explore Cypher queries and simple algorithms

Native Graph Platform (Graph Transactions, Graph Analytics)

  • Persist your graph
  • Create rule based features
  • Run native graph algorithms and write to graph or stream

Machine Learning

  • Bring graph features to ML pipeline for training

slide-24
SLIDE 24

Graph Algorithms in Neo4j

Pathfinding & Search

  • Parallel Breadth First Search
  • Parallel Depth First Search
  • Shortest Path
  • Single-Source Shortest Path
  • All Pairs Shortest Path
  • Minimum Spanning Tree
  • A* Shortest Path
  • Yen’s K Shortest Path
  • K-Spanning Tree (MST)
  • Random Walk

Centrality / Importance

  • Degree Centrality
  • Closeness Centrality
  • CC Variations: Harmonic, Dangalchev, Wasserman & Faust
  • Betweenness Centrality
  • Approximate Betweenness Centrality
  • PageRank
  • Personalized PageRank
  • ArticleRank
  • Eigenvector Centrality

Community Detection

  • Triangle Count
  • Clustering Coefficients
  • Connected Components (Union Find)
  • Strongly Connected Components
  • Label Propagation
  • Louvain Modularity – 1 Step & Multi-Step
  • Balanced Triad (identification)

Similarity

  • Euclidean Distance
  • Cosine Similarity
  • Jaccard Similarity
  • Overlap Similarity
  • Pearson Similarity

Link Prediction

  • Adamic Adar
  • Common Neighbors
  • Preferential Attachment
  • Resource Allocation
  • Same Community
  • Total Neighbors

neo4j.com/docs/graph-algorithms/current/
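
Several of the link prediction heuristics listed above have short textbook definitions based on the neighbours of two nodes. The sketch below implements those standard formulas in plain Python on a hypothetical graph; it does not call the Neo4j library, and the function names are illustrative only.

```python
import math
from collections import defaultdict


def build_neighbors(edges):
    """Undirected adjacency sets from (left, right) pairs."""
    neighbors = defaultdict(set)
    for left, right in edges:
        neighbors[left].add(right)
        neighbors[right].add(left)
    return neighbors


def common_neighbors(nbrs, u, v):
    return len(nbrs[u] & nbrs[v])


def total_neighbors(nbrs, u, v):
    return len(nbrs[u] | nbrs[v])


def preferential_attachment(nbrs, u, v):
    return len(nbrs[u]) * len(nbrs[v])


def resource_allocation(nbrs, u, v):
    # Each shared neighbour contributes 1 / degree.
    return sum(1.0 / len(nbrs[w]) for w in nbrs[u] & nbrs[v])


def adamic_adar(nbrs, u, v):
    # Each shared neighbour contributes 1 / log(degree); a shared
    # neighbour always has degree >= 2, so the logarithm is positive.
    return sum(1.0 / math.log(len(nbrs[w])) for w in nbrs[u] & nbrs[v])


# Hypothetical graph: "a" and "b" are not linked but share "c" and "d".
edges = [("a", "c"), ("a", "d"), ("b", "c"), ("b", "d"), ("c", "e")]
nbrs = build_neighbors(edges)
scores = {
    "common_neighbors": common_neighbors(nbrs, "a", "b"),
    "total_neighbors": total_neighbors(nbrs, "a", "b"),
    "preferential_attachment": preferential_attachment(nbrs, "a", "b"),
    "resource_allocation": resource_allocation(nbrs, "a", "b"),
    "adamic_adar": adamic_adar(nbrs, "a", "b"),
}
```

Higher scores suggest that the pair is more likely to form a relationship; each score can serve as a feature for a candidate pair of nodes.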
slide-25
SLIDE 25

Steps Forward in Graph Data Science

Steps: Query Based Knowledge Graph; Query Based Feature Engineering; Graph Algorithm Feature Engineering; Graph Embeddings; Graph Neural Networks

Axes: Enterprise Maturity; Data Science Complexity

slide-26
SLIDE 26

Embedding transforms graphs into a vector, or set of vectors, describing topology, connectivity, or attributes of nodes and edges in the graph


Graph Embeddings

  • Vertex embeddings: describe connectivity of each node
  • Path embeddings: traversals across the graph
  • Graph embeddings: encode an entire graph into a single vector
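
To make the idea of a vertex embedding concrete, the sketch below samples random walks and turns them into one co-occurrence count vector per node. This is only a crude stand-in: methods such as DeepWalk train a model on the sampled walks to learn dense vectors, which is not done here. The graph, the parameters, and the function names are assumptions for illustration.

```python
import random


def random_walks(neighbors, walk_length, walks_per_node, seed=0):
    """Sample fixed-length uniform random walks starting from every node."""
    rng = random.Random(seed)
    walks = []
    for start in sorted(neighbors):
        for _ in range(walks_per_node):
            walk = [start]
            while len(walk) < walk_length and neighbors[walk[-1]]:
                walk.append(rng.choice(sorted(neighbors[walk[-1]])))
            walks.append(walk)
    return walks


def cooccurrence_vectors(walks, nodes, window=2):
    """Count how often each other node appears within `window` steps."""
    index = {node: i for i, node in enumerate(nodes)}
    vectors = {node: [0] * len(nodes) for node in nodes}
    for walk in walks:
        for i, center in enumerate(walk):
            for other in walk[max(0, i - window): i + window + 1]:
                if other != center:  # skip the node itself
                    vectors[center][index[other]] += 1
    return vectors


# Hypothetical graph: two triangles joined by the edge c-d.
neighbors = {
    "a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b", "d"},
    "d": {"c", "e", "f"}, "e": {"d", "f"}, "f": {"d", "e"},
}
nodes = sorted(neighbors)
walks = random_walks(neighbors, walk_length=5, walks_per_node=3)
vectors = cooccurrence_vectors(walks, nodes)
```

Nodes that sit in the same part of the graph tend to get similar count vectors, which is the property a learned vertex embedding is meant to capture in far fewer dimensions.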
slide-27
SLIDE 27

Explainable Reasoning over Knowledge Graphs for Recommendation


Graph Embeddings - Recommendations

slide-28
SLIDE 28


Graph Embeddings - Recommendations

Explainable Reasoning over Knowledge Graphs for Recommendation

slide-29
SLIDE 29

Graph Feature Engineering: Getting Started Example with Spark

Spark Graph

  • Merge distributed data into DataFrames
  • Reshape your tables into graphs
  • Explore Cypher queries and simple algorithms

Native Graph Platform (Graph Transactions, Graph Analytics)

  • Move to Neo4j to build expert queries
  • Write to persist
  • Stay tuned for DeepWalk and DeepGL algorithms

Machine Learning

  • Bring graph features to ML pipeline for training

slide-30
SLIDE 30

Steps Forward in Graph Data Science

Steps: Query Based Knowledge Graph; Query Based Feature Engineering; Graph Algorithm Feature Engineering; Graph Embeddings; Graph Neural Networks

Axes: Enterprise Maturity; Data Science Complexity

slide-31
SLIDE 31

Deep Learning refers to training multi-layer neural networks using gradient descent


Graph Native Learning

slide-32
SLIDE 32

Graph Native Learning refers to deep learning models that take a graph as input, perform computations, and return a graph


Graph Native Learning

Battaglia et al, 2018
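
The "graph in, computation, graph out" pattern can be illustrated with a single round of neighbourhood aggregation, in which every node's feature vector is replaced by the mean of its own vector and those of its neighbours. This is a deliberately simplified sketch with no learned weights or nonlinearity, so it is not the graph network model of Battaglia et al; the feature values and the function name are hypothetical.

```python
def message_passing_step(features, edges):
    """Return new node features; the edge structure is left unchanged.

    Each node's new vector is the mean of its own vector and the
    vectors of its neighbours (edges are treated as undirected).
    """
    groups = {node: [node] for node in features}
    for left, right in edges:
        groups[left].append(right)
        groups[right].append(left)

    updated = {}
    for node, group in groups.items():
        dim = len(features[node])
        updated[node] = [
            sum(features[member][k] for member in group) / len(group)
            for k in range(dim)
        ]
    return updated


# Hypothetical path graph a - b - c with 2-dimensional node features.
features = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [0.0, 0.0]}
edges = [("a", "b"), ("b", "c")]
updated = message_passing_step(features, edges)
```

The output has the same nodes and edges as the input with new node attributes, so the step can be applied repeatedly; a trained model would replace the plain mean with learned update functions.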

slide-33
SLIDE 33

Example: electron path prediction

Bradshaw et al, 2019


Graph Native Learning

Given reactants and reagents, what will the products be?

slide-34
SLIDE 34

Example: electron path prediction


Graph Native Learning

slide-35
SLIDE 35

Progressing in Graph Data Science

Steps: Query Based Knowledge Graph; Query Based Feature Engineering; Graph Algorithm Feature Engineering; Graph Embeddings; Graph Neural Networks

Axes: Enterprise Maturity; Data Science Complexity

Labels: Knowledge Graphs; Graph Feature Engineering; Graph Native Learning; Graph Persistence

slide-36
SLIDE 36

Resources

Business

  • neo4j.com/use-cases/artificial-intelligence-analytics/

Data Scientists/Developers

  • neo4j.com/sandbox
  • neo4j.com/developer/
  • community.neo4j.com

alicia.frame@neo4j.com @aliciaframe1

neo4j.com/graph-algorithms-book

slide-37
SLIDE 37

1. Which of the following is not a step in graph data science?

a. Building a knowledge graph
b. Using graph algorithms for feature engineering
c. Using Kafka for transactional messaging

2. Louvain is an example of which type of algorithm?

a. Centrality
b. Community Detection
c. Pathfinding

3. Fill in the blank!

Explore in Spark, _________ in Neo4j


Hunger Games!