Leveraging Graphs for Better AI
Alicia M. Frame, PhD
Lead Data Scientist, Neo4j
alicia.frame@neo4j.com
Graph Data Science Applications
- Financial Services
- Drug Discovery
- Recommendations
- Cybersecurity
- Predictive Maintenance
- Customer Segmentation
- Churn Prediction
- Search/MDM
- Current data science models ignore network structure
- Graphs add highly predictive features to existing ML models
- Otherwise unattainable predictions based on relationships
Novel & More Accurate Predictions with the Data You Already Have
Machine Learning Pipeline
“The idea is that graph networks are bigger than any one machine-learning approach. Graphs bring an ability to generalize about structure that the individual neural nets don't have.”
“Where do the graphs come from that graph networks operate over?”
Building a Graph ML Model
Data Sources: Aggregate Disparate Data and Cleanse (Parquet, JSON, and more…)
Native Graph Platform: Unify Graphs and Engineer Features
Machine Learning: Build Predictive Models (MLlib, and more…)
Example: Spark & Neo4j Workflow
Spark Graph: Explore Graphs
- Cypher 9 in Spark 3.0 to create non-persistent graphs
- Massively scalable
- Powerful data pipelining
- Robust ML libraries
- Non-persistent, non-native graphs
Native Graph Platform (Graph Transactions, Graph Analytics): Build Graph Solutions
- Native graph algorithms, processing, and storage
- Persistent, dynamic graphs
- Graph native query and algorithm performance
- Constantly growing list of graph algorithms and embeddings
Machine Learning
- MLlib to train models
The Steps of Graph Data Science
- Query Based Knowledge Graph
- Query Based Feature Engineering
- Graph Algorithm Feature Engineering
- Graph Embeddings
- Graph Neural Networks
Axes: Enterprise Maturity, Data Science Complexity
Stages: Knowledge Graphs, Graph Feature Engineering, Graph Native Learning, Graph Persistence
Query based knowledge graphs:
Connecting the Dots at NASA
“Using Neo4j someone from our Orion project found information from the Apollo project that prevented an issue, saving well over two years of work and one million dollars of taxpayer funds.”
Query-Based Feature Engineering
Telecom churn prediction
Telecommunication networks are easily represented as graphs
Churn prediction research has found that simple hand-engineered features are highly predictive:
- How many calls/texts has an account made?
- How many of their contacts have churned?
Query-Based Feature Engineering
Telecom churn prediction
Add connected features based on graph queries to tabular data
Khan et al, 2015
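The two churn questions above (calls made, churned contacts) can be sketched as a small query over call records. This is a minimal illustration in plain Python, not the method from the cited study; the call records, account names, and the `connected_features` helper are all hypothetical.

```python
from collections import defaultdict

# Hypothetical call records: (caller, callee). All account names are made up.
calls = [
    ("alice", "bob"),
    ("alice", "carol"),
    ("bob", "carol"),
    ("dave", "alice"),
    ("alice", "bob"),
]
# Hypothetical set of accounts already known to have churned.
churned = {"carol", "dave"}


def connected_features(calls, churned):
    """Return per-account features derived from the call graph."""
    calls_made = defaultdict(int)
    contacts = defaultdict(set)
    for caller, callee in calls:
        calls_made[caller] += 1
        # Assumption: a call in either direction makes two accounts contacts.
        contacts[caller].add(callee)
        contacts[callee].add(caller)
    features = {}
    for account in sorted(contacts):
        features[account] = {
            "calls_made": calls_made[account],
            "num_contacts": len(contacts[account]),
            "churned_contacts": len(contacts[account] & churned),
        }
    return features


features = connected_features(calls, churned)
for account, row in features.items():
    print(account, row)
```

Each resulting row can be joined onto an existing tabular record for the same account, which is the sense in which graph queries add connected features to tabular data.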
Knowledge Graphs: Getting Started Example with Spark
Spark Graph
- Merge distributed data into DataFrames
- Reshape your tables into graphs
- Explore Cypher queries
Native Graph Platform (Graph Transactions, Graph Analytics)
- Move to Neo4j to build expert queries
- Persist your graph
Machine Learning
- Bring query based graph features to ML pipeline
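As a rough idea of what "reshape your tables into graphs" involves, the sketch below turns a node table and an edge table into a simple property graph in plain Python. It deliberately avoids any Spark or Neo4j API; the table contents, column names, and the `tables_to_graph` helper are hypothetical.

```python
from collections import defaultdict

# Hypothetical tabular inputs, e.g. rows loaded from Parquet or JSON files.
accounts = [
    {"account_id": "a1", "owner": "Ann"},
    {"account_id": "a2", "owner": "Raj"},
    {"account_id": "a3", "owner": "Lee"},
]
transfers = [
    {"src": "a1", "dst": "a2", "amount": 120.0},
    {"src": "a2", "dst": "a3", "amount": 75.5},
    {"src": "a1", "dst": "a3", "amount": 20.0},
]


def tables_to_graph(node_rows, edge_rows, key, src, dst):
    """Turn a node table and an edge table into a simple property graph."""
    # Node id -> remaining columns as node properties.
    nodes = {row[key]: {k: v for k, v in row.items() if k != key} for row in node_rows}
    adjacency = defaultdict(list)
    for row in edge_rows:
        if row[src] not in nodes or row[dst] not in nodes:
            raise ValueError(f"edge references unknown node: {row}")
        # Remaining columns become relationship properties.
        props = {k: v for k, v in row.items() if k not in (src, dst)}
        adjacency[row[src]].append((row[dst], props))
    return nodes, dict(adjacency)


nodes, adjacency = tables_to_graph(accounts, transfers, "account_id", "src", "dst")
print(nodes["a1"])
print(adjacency["a1"])
```

The same split (one table of entities, one table of relationships) is the usual starting point regardless of which graph platform ends up storing the result.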
Feature Engineering is how we combine and process the data to create new, more meaningful features, such as clustering or connectivity metrics.
Graph Feature Engineering
Add More Descriptive Features:
- Influence
- Relationships
- Communities
Graph Feature Categories & Algorithms
Pathfinding & Search
Finds the optimal paths or evaluates route availability and quality
Centrality / Importance
Determines the importance of distinct nodes in the network
Community Detection
Detects group clustering or partition options
Heuristic Link Prediction
Estimates the likelihood of nodes forming a relationship
Similarity
Evaluates how alike nodes are
Embeddings
Learned representations of connectivity or topology
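To make two of these categories concrete, here is a minimal plain-Python sketch of one pathfinding algorithm (breadth-first shortest path by hop count) and one centrality measure (normalized degree). The sample graph and function names are hypothetical, and this is not how a graph platform implements them internally.

```python
from collections import deque

# Hypothetical undirected graph as an adjacency mapping.
graph = {
    "A": ["B", "C"],
    "B": ["A", "D"],
    "C": ["A", "D"],
    "D": ["B", "C", "E"],
    "E": ["D"],
}


def shortest_path(graph, start, goal):
    """Breadth-first search: returns one shortest path (by hop count) or None."""
    previous = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            # Walk the predecessor links back to the start.
            path = []
            while node is not None:
                path.append(node)
                node = previous[node]
            return path[::-1]
        for neighbor in graph[node]:
            if neighbor not in previous:
                previous[neighbor] = node
                queue.append(neighbor)
    return None


def degree_centrality(graph):
    """Degree of each node divided by the maximum possible degree (n - 1)."""
    n = len(graph)
    if n < 2:
        return {node: 0.0 for node in graph}
    return {node: len(neighbors) / (n - 1) for node, neighbors in graph.items()}


path = shortest_path(graph, "A", "E")
centrality = degree_centrality(graph)
print(path)
print(centrality)
```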
Financial Crime: Detecting Fraud
Large financial institutions already have existing pipelines to identify fraud via heuristics and models. Graph based features improve accuracy:
- Connected components to identify disjointed graphs sharing identifiers
- PageRank to measure influence and transaction volumes
- Louvain to identify communities that frequently interact
- Jaccard to measure account similarity based on relationships
+142,000 peer-reviewed publications on graph fraud / anomaly detection in the last 10 years
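Two of the fraud features listed above, connected components over shared identifiers and Jaccard similarity, are small enough to sketch in plain Python. The accounts, identifiers, and helper names below are hypothetical, and the union-find here is a simple illustration rather than any particular library's implementation.

```python
# Hypothetical account -> identifiers (phone, email, device) mapping.
account_identifiers = {
    "acct1": {"phone:555-0100", "email:x@example.com"},
    "acct2": {"email:x@example.com", "device:d9"},
    "acct3": {"device:d9"},
    "acct4": {"phone:555-0199"},
}


def connected_components(account_identifiers):
    """Group accounts that share identifiers, directly or transitively (union-find)."""
    parent = {account: account for account in account_identifiers}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps trees shallow
            x = parent[x]
        return x

    first_owner = {}
    for account, identifiers in account_identifiers.items():
        for identifier in identifiers:
            if identifier in first_owner:
                # Same identifier seen before: merge the two accounts' groups.
                parent[find(account)] = find(first_owner[identifier])
            else:
                first_owner[identifier] = account

    groups = {}
    for account in account_identifiers:
        groups.setdefault(find(account), set()).add(account)
    return sorted(groups.values(), key=lambda group: sorted(group))


def jaccard(a, b):
    """Jaccard similarity: size of the intersection over size of the union."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


components = connected_components(account_identifiers)
similarity = jaccard(account_identifiers["acct1"], account_identifiers["acct2"])
print(components)
print(similarity)
```

In this toy data, three accounts end up in one component even though the first and third share no identifier directly, which is the kind of indirect link that row-by-row tabular checks tend to miss.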
Graph Feature Engineering: Getting Started Example with Spark
Spark Graph
- Merge distributed data into DataFrames
- Reshape your tables into graphs
- Explore Cypher queries and simple algorithms
Native Graph Platform (Graph Transactions, Graph Analytics)
- Persist your graph
- Create rule based features
- Run native graph algorithms and write to graph or stream
Machine Learning
- Bring graph features to ML pipeline for training
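The last step, bringing graph features into an ML pipeline, amounts to computing a score per node and joining it onto the training table. The sketch below uses a plain-Python power-iteration PageRank as the graph feature; the edge list, table rows, and parameter defaults are assumptions for illustration, and a production workflow would call the platform's own algorithm instead.

```python
def pagerank(edges, damping=0.85, iterations=50):
    """Power-iteration PageRank on a directed edge list."""
    nodes = sorted({n for edge in edges for n in edge})
    out_links = {n: [] for n in nodes}
    for src, dst in edges:
        out_links[src].append(dst)
    rank = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iterations):
        # Rank held by nodes with no outgoing links is spread evenly.
        dangling = sum(rank[n] for n in nodes if not out_links[n])
        new_rank = {n: (1.0 - damping + damping * dangling) / len(nodes) for n in nodes}
        for src in nodes:
            for dst in out_links[src]:
                new_rank[dst] += damping * rank[src] / len(out_links[src])
        rank = new_rank
    return rank


# Hypothetical transfer graph and tabular features for the same accounts.
transfers = [("a1", "a2"), ("a3", "a2"), ("a2", "a4"), ("a4", "a2")]
table = [
    {"account": "a1", "balance": 100.0},
    {"account": "a2", "balance": 2500.0},
    {"account": "a3", "balance": 40.0},
    {"account": "a4", "balance": 910.0},
]

scores = pagerank(transfers)
# Join the graph feature onto each tabular row before training.
training_rows = [dict(row, pagerank=scores.get(row["account"], 0.0)) for row in table]
for row in training_rows:
    print(row)
```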
Graph Algorithms in Neo4j
neo4j.com/docs/graph-algorithms/current/
Pathfinding & Search
- Parallel Breadth First Search
- Parallel Depth First Search
- Shortest Path
- Single-Source Shortest Path
- All Pairs Shortest Path
- Minimum Spanning Tree
- A* Shortest Path
- Yen’s K Shortest Path
- K-Spanning Tree (MST)
- Random Walk
Centrality / Importance
- Degree Centrality
- Closeness Centrality
- CC Variations: Harmonic, Dangalchev, Wasserman & Faust
- Betweenness Centrality
- Approximate Betweenness Centrality
- PageRank
- Personalized PageRank
- ArticleRank
- Eigenvector Centrality
Community Detection
- Triangle Count
- Clustering Coefficients
- Connected Components (Union Find)
- Strongly Connected Components
- Label Propagation
- Louvain Modularity – 1 Step & Multi-Step
- Balanced Triad (identification)
Similarity
- Euclidean Distance
- Cosine Similarity
- Jaccard Similarity
- Overlap Similarity
- Pearson Similarity
Link Prediction
- Adamic Adar
- Common Neighbors
- Preferential Attachment
- Resource Allocation
- Same Community
- Total Neighbors
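Several of the link prediction heuristics listed above reduce to simple set arithmetic over neighborhoods. The following plain-Python sketch uses the standard textbook definitions on a hypothetical undirected graph; the function names are illustrative and are not the Neo4j procedure names.

```python
import math

# Hypothetical undirected graph: node -> set of neighbors.
graph = {
    "A": {"B", "C"},
    "B": {"A", "C", "D"},
    "C": {"A", "B", "D"},
    "D": {"B", "C", "E"},
    "E": {"D"},
}


def common_neighbors(graph, u, v):
    """Number of neighbors shared by u and v."""
    return len(graph[u] & graph[v])


def total_neighbors(graph, u, v):
    """Number of distinct neighbors of u or v."""
    return len(graph[u] | graph[v])


def preferential_attachment(graph, u, v):
    """Product of the two node degrees."""
    return len(graph[u]) * len(graph[v])


def adamic_adar(graph, u, v):
    """Sum of 1 / log(degree) over shared neighbors; rare neighbors count more."""
    return sum(
        1.0 / math.log(len(graph[w]))
        for w in graph[u] & graph[v]
        if len(graph[w]) > 1
    )


def resource_allocation(graph, u, v):
    """Sum of 1 / degree over shared neighbors."""
    return sum(1.0 / len(graph[w]) for w in graph[u] & graph[v])


pair = ("A", "D")
print("common neighbors:", common_neighbors(graph, *pair))
print("total neighbors:", total_neighbors(graph, *pair))
print("preferential attachment:", preferential_attachment(graph, *pair))
print("adamic adar:", adamic_adar(graph, *pair))
print("resource allocation:", resource_allocation(graph, *pair))
```

Higher scores suggest a pair of unconnected nodes is more likely to form a relationship; the scores are typically used as features rather than as predictions on their own.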
Graph Embeddings
An embedding transforms a graph into a vector, or set of vectors, describing the topology, connectivity, or attributes of nodes and edges in the graph
- Vertex embeddings: describe connectivity of each node
- Path embeddings: traversals across the graph
- Graph embeddings: encode an entire graph into a single vector
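As a rough intuition for vertex embeddings, the sketch below samples random walks and turns co-occurrence counts within a window into one vector per node. This is a crude count-based stand-in, not DeepWalk itself (which trains a skip-gram model on such walks); the graph, parameters, and function names are hypothetical.

```python
import math
import random

# Hypothetical undirected graph: two triangles joined by the C-D edge.
graph = {
    "A": ["B", "C"],
    "B": ["A", "C"],
    "C": ["A", "B", "D"],
    "D": ["C", "E", "F"],
    "E": ["D", "F"],
    "F": ["D", "E"],
}


def random_walks(graph, walks_per_node=20, walk_length=6, seed=0):
    """Sample fixed-length uniform random walks starting from every node."""
    rng = random.Random(seed)
    walks = []
    for start in graph:
        for _ in range(walks_per_node):
            walk = [start]
            while len(walk) < walk_length:
                walk.append(rng.choice(graph[walk[-1]]))
            walks.append(walk)
    return walks


def cooccurrence_embeddings(graph, walks, window=2):
    """One vector per node: counts of nodes seen within `window` steps on a walk."""
    index = {node: i for i, node in enumerate(sorted(graph))}
    vectors = {node: [0.0] * len(index) for node in graph}
    for walk in walks:
        for i, node in enumerate(walk):
            for j in range(max(0, i - window), min(len(walk), i + window + 1)):
                if j != i:
                    vectors[node][index[walk[j]]] += 1.0
    return vectors


def cosine(a, b):
    """Cosine similarity between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


walks = random_walks(graph)
vectors = cooccurrence_embeddings(graph, walks)
print("A vs B:", cosine(vectors["A"], vectors["B"]))
print("A vs F:", cosine(vectors["A"], vectors["F"]))
```

The resulting vectors can be compared with cosine similarity or passed to a downstream model like any other numeric feature.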
Graph Embeddings - Recommendations
Explainable Reasoning over Knowledge Graphs for Recommendation
Graph Feature Engineering: Getting Started Example with Spark
Spark Graph
- Merge distributed data into DataFrames
- Reshape your tables into graphs
- Explore Cypher queries and simple algorithms
Native Graph Platform (Graph Transactions, Graph Analytics)
- Move to Neo4j to build expert queries
- Write to persist
- Stay tuned for DeepWalk and DeepGL algorithms
Machine Learning
- Bring graph features to ML pipeline for training
Graph Native Learning
Deep Learning refers to training multi-layer neural networks using gradient descent
Graph Native Learning refers to deep learning models that take a graph as input, perform computations, and return a graph
Battaglia et al, 2018
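The "graph in, computations, graph out" idea can be sketched as a single message-passing step: update edges from their endpoints, update nodes from their incoming edges, then update a graph-level value. The sketch is only loosely in the spirit of a graph network block; the update rules here are fixed sums chosen for readability, whereas in a trained model they would be learned neural networks, and the data layout is hypothetical.

```python
def graph_network_step(graph):
    """One untrained message-passing step: takes a graph and returns a graph."""
    nodes = graph["nodes"]
    # 1. Update each edge from its own value and its two endpoint values.
    new_edges = [
        (src, dst, value + nodes[src] + nodes[dst])
        for src, dst, value in graph["edges"]
    ]
    # 2. Sum the updated incoming edges per node, then update each node.
    incoming = {node: 0.0 for node in nodes}
    for _src, dst, value in new_edges:
        incoming[dst] += value
    new_nodes = {node: nodes[node] + incoming[node] for node in nodes}
    # 3. Update the graph-level value from all updated nodes and edges.
    new_global = (
        graph["global"]
        + sum(new_nodes.values())
        + sum(value for _src, _dst, value in new_edges)
    )
    return {"nodes": new_nodes, "edges": new_edges, "global": new_global}


# Hypothetical input graph with scalar node, edge, and graph-level features.
graph_in = {
    "nodes": {"a": 1.0, "b": 2.0, "c": 3.0},
    "edges": [("a", "b", 0.5), ("b", "c", 1.0)],
    "global": 0.0,
}
graph_out = graph_network_step(graph_in)
print(graph_out)
```

The output has the same structure as the input, so steps like this can be stacked, which is what lets information travel more than one hop.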
Graph Native Learning
Example: electron path prediction
Given reactants and reagents, what will the products be?
Bradshaw et al, 2019
Progressing in Graph Data Science
- Query Based Knowledge Graph
- Query Based Feature Engineering
- Graph Algorithm Feature Engineering
- Graph Embeddings
- Graph Neural Networks
Axes: Enterprise Maturity, Data Science Complexity
Stages: Knowledge Graphs, Graph Feature Engineering, Graph Native Learning, Graph Persistence
Resources
Business
- neo4j.com/use-cases/artificial-intelligence-analytics/
Data Scientists/Developers
- neo4j.com/sandbox
- neo4j.com/developer/
- community.neo4j.com
alicia.frame@neo4j.com @aliciaframe1
neo4j.com/graph-algorithms-book
1. Which of the following is not a step in graph data science?
a. Building a knowledge graph
b. Using graph algorithms for feature engineering
c. Using Kafka for transactional messaging
2. Louvain is an example of which type of algorithm?
a. Centrality
b. Community Detection
c. Pathfinding
3. Fill in the blank!
Explore in Spark, _________ in Neo4j