larus-ba.it/neo4j @AgileLARUS
Andrea Santurbano / @santand84
#NODES #2k19
Earth (Milky Road), 10/10/2019
Streaming Graph Data with Kafka
LARUS Business Automation Srl Italy’s #1 Neo4j Partner
Agenda
○ Introduction
○ Partnership Neo4j and Larus
○ What is Apache Kafka?
○ How did we combine Neo4j and Kafka?
○ Real-time Polyglot Persistence with Elastic, Kafka and Neo4j
Andrea [:WORKS_AT] [:LOVES] [:INTEGRATOR_LEADER_FOR]
LARUS BUSINESS AUTOMATION
#1 Solution Partner in Italy since 2013
VENICE [:BASED_IN]
2016 Neo4j JDBC Driver 2015 2011 First Spikes in Retail for Articles’ Clustering 2014 2018 Neo4j APOC, ETL, Spark, Zeppelin, Kafka 2019 Kafka commercial, GraphQL
A DISTRIBUTED STREAMING PLATFORM
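Has three key capabilities:
○ publish and subscribe to streams of records, similar to a message queue or enterprise messaging system;
○ store streams of records in a fault-tolerant, durable way;
○ process streams of records as they occur.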
HOW DOES IT WORK?
○ A topic is a category or feed name to which records are published.
○ For each topic, the Kafka cluster maintains a partitioned, distributed, persistent log.
HOW IS IT USED?
Kafka is generally used for two classes of applications:
Andrea [:AUTHOR_OF] [:CREATOR_OF]
Michael
ENABLES DATA STREAM ON NEO4J
The project is a Neo4j Plugin composed of several parts:
We also have a Kafka Connect Plugin:
HOW CAN IT BE USED?
architectures, e.g. to feed microservices or other databases
installations, e.g. from analytics
integrations
Change data “what”?
In databases, Change Data Capture (CDC) is a set of software design patterns used to determine (and track) the data that has changed so an action can be taken using the changed data.
Well suited use-cases?
How does it work?
Each transaction communicates its changes to our event listener:
Those events are sent to Kafka asynchronously, so the transaction commit path should not be affected.
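The asynchronous hand-off can be sketched in Python. This is only an illustration under stated assumptions, not the plugin's code: `EventPublisher`, `after_commit`, the simplified event dictionaries, and the in-memory `sent` list that stands in for a Kafka producer are all hypothetical.

```python
import queue
import threading


class EventPublisher:
    """Hypothetical stand-in for the event listener plus a Kafka producer."""

    def __init__(self):
        self._pending = queue.Queue()
        self.sent = []  # stands in for records delivered to a Kafka topic
        self._worker = threading.Thread(target=self._drain, daemon=True)
        self._worker.start()

    def after_commit(self, changes):
        # Called on the commit path: only enqueue, never wait for the broker.
        self._pending.put(changes)

    def _drain(self):
        # Background thread: forwards events in the order they were committed.
        while True:
            changes = self._pending.get()
            if changes is None:  # shutdown signal
                break
            self.sent.append(changes)

    def close(self):
        # Let the worker flush everything still queued, then stop it.
        self._pending.put(None)
        self._worker.join()


publisher = EventPublisher()
publisher.after_commit({"operation": "created", "type": "node", "labels": ["Person"]})
publisher.after_commit({"operation": "updated", "type": "node", "labels": ["Person"]})
publisher.close()
print([event["operation"] for event in publisher.sent])
```

The commit-side call only appends to a queue, which is why a slow or unavailable broker would delay delivery rather than the transaction itself.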
INGEST YOUR DATA, WITH YOUR RULES
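The sink provides several ways to ingest data from Kafka:
○ via Cypher template
○ via CDC events from another Neo4j instance
○ via JSON projection
○ via CUD file format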
HOW WE MANAGE BAD DATA
The Neo4j Streams Sink module provides a Dead Letter Queue mechanism that, if activated, re-routes all “bad data” to a configured topic. What do we mean by “bad data”?
{id: 1, "name": "Andrea", "surname": "Santurbano"}
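The payload above is not valid JSON because the key `id` is not quoted, so it cannot be deserialized. A minimal Python sketch of the routing idea follows; the `route` function and the shape of the dead-letter entries are hypothetical, and plain lists stand in for the Kafka topics.

```python
import json


def route(raw_messages):
    """Split raw payloads into parsed events and dead-letter entries."""
    good, dead_letters = [], []
    for raw in raw_messages:
        try:
            good.append(json.loads(raw))
        except json.JSONDecodeError as error:
            # Keep the original payload plus the failure reason for later inspection.
            dead_letters.append({"payload": raw, "error": str(error)})
    return good, dead_letters


messages = [
    '{"id": 1, "name": "Andrea", "surname": "Santurbano"}',
    '{id: 1, "name": "Andrea", "surname": "Santurbano"}',  # key `id` is unquoted
]
good, dead_letters = route(messages)
print(len(good), len(dead_letters))
```

Keeping the untouched payload next to the error is a design choice of this sketch: it makes it possible to fix and replay the message later.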
CONSUME/PRODUCE DATA DIRECTLY FROM CYPHER
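The Neo4j Streams project comes with two procedures:
○ streams.publish: streams custom messages from Neo4j to a Kafka topic using the underlying configured Producer;
○ streams.consume: consumes messages from a given topic.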
WHAT IS KAFKA CONNECT?
Kafka Connect, an open source component of Apache Kafka, is a framework for connecting Kafka with external systems such as databases, key-value stores, search indexes, and file systems.
HOW DOES IT WORK?
It works in exactly the same way as the Neo4j Sink plugin, so you can provide your own ingestion setup for each topic. You can download it from Confluent Hub!
THE POWER OF THE STREAM!
stream transaction events from Neo4j to other systems;
ingest data into Neo4j by providing our own business rules;
PROCEDURES in order to consume/produce data directly from Cypher.
Polyglot workflow with Apache Kafka Connect
DEMO CODE:
NEO4J STREAMS REPOSITORY:
CONSUME/PRODUCE DATA DIRECTLY FROM CYPHER
Answer here: r.neo4j.com/hunger-games
INGESTION VIA CYPHER TEMPLATE
Configure an import statement for each Kafka topic:
streams.sink.topic.cypher.<TOPIC>=<CYPHER_STATEMENT>
For example:
streams.sink.topic.cypher.sales= \
  MATCH (c:Customer {id: event.start.id}) \
  MATCH (p:Product {id: event.end.id}) \
  MERGE (c)-[:PLACED]->(o:Order)-[:FOR]->(p) \
  SET o += event.properties
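To make the role of `event` in the template concrete, here is a hedged Python sketch that reads the property above and prepends an `UNWIND ... AS event` clause, so one statement handles a batch of messages. The helper names `parse_templates` and `batched_query`, and the parameter name `$events`, are assumptions for illustration, not the plugin's API.

```python
PREFIX = "streams.sink.topic.cypher."


def parse_templates(properties_text):
    """Map each topic to its Cypher statement, joining backslash-continued lines."""
    joined = properties_text.replace("\\\n", "")
    templates = {}
    for line in joined.splitlines():
        line = line.strip()
        if line.startswith(PREFIX):
            key, statement = line.split("=", 1)
            # Collapse the whitespace left over from the continuation lines.
            templates[key[len(PREFIX):]] = " ".join(statement.split())
    return templates


def batched_query(statement):
    # Assumption: each batch of messages is bound to a list parameter named `events`.
    return "UNWIND $events AS event " + statement


config = r"""
streams.sink.topic.cypher.sales= \
  MATCH (c:Customer {id: event.start.id}) \
  MATCH (p:Product {id: event.end.id}) \
  MERGE (c)-[:PLACED]->(o:Order)-[:FOR]->(p) \
  SET o += event.properties
"""
templates = parse_templates(config)
print(batched_query(templates["sales"]))
```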
INGESTION VIA CDC EVENT FROM ANOTHER NEO4J INSTANCE
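We allow ingesting the data in two ways:
○ the SourceId strategy, which merges nodes/relationships by the CDC event id field (the Neo4j physical ID)
  streams.sink.topic.cdc.sourceId=<TOPICS_SEPARATED_BY_SEMICOLON>
○ the Schema strategy, which merges nodes/relationships by the constraints (UNIQUENESS, NODE_KEY) defined in your graph model
  streams.sink.topic.cdc.schema=<TOPICS_SEPARATED_BY_SEMICOLON>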
INGESTION VIA JSON PROJECTION
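You can extract nodes and relationships from a JSON document by providing an extraction pattern. Each property can be prefixed with:
○ ! to mark it as part of the node key (as in !userId)
○ - to exclude it from the extraction (as in -address)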
INGESTION VIA JSON PROJECTION - NODE PATTERN EXTRACTION
Given:
{"userId": 1, "name": "Andrea", "surname": "Santurbano", "address": {"city": "Venice", "cap": "30100"}}
You can transform it into a node by specifying one of these patterns:
Pattern | Result
User:Actor{!userId} or User:Actor{!userId,*} | (User:Actor{userId: 1, name: 'Andrea', surname: 'Santurbano', `address.city`: 'Venice', `address.cap`: '30100'})
User{!userId, surname} | (User{userId: 1, surname: 'Santurbano'})
User{!userId, surname, address.city} | (User{userId: 1, surname: 'Santurbano', `address.city`: 'Venice'})
User{!userId, -address} | (User{userId: 1, name: 'Andrea', surname: 'Santurbano'})
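The node patterns above can be mimicked with a few lines of Python. This is a sketch of the semantics shown in the table, not the plugin's parser: `flatten` and `project_node` are hypothetical helpers, and mixing inclusions with exclusions in one pattern is not handled.

```python
def flatten(document, prefix=""):
    """Flatten nested JSON objects into dotted keys, e.g. address.city."""
    flat = {}
    for key, value in document.items():
        path = prefix + key
        if isinstance(value, dict):
            flat.update(flatten(value, path + "."))
        else:
            flat[path] = value
    return flat


def project_node(pattern, document):
    """Return (labels, key properties, node properties) for a pattern such as User{!userId, -address}."""
    label_part, _, body = pattern.partition("{")
    labels = label_part.strip().split(":")
    fields = [field.strip() for field in body.rstrip("}").split(",") if field.strip()]
    keys = [field[1:] for field in fields if field.startswith("!")]
    excluded = [field[1:] for field in fields if field.startswith("-")]
    included = [field for field in fields if field[0] not in "!-*"]

    def matches(name, selectors):
        # A selector matches the property itself or any nested property under it.
        return any(name == s or name.startswith(s + ".") for s in selectors)

    flat = flatten(document)
    if included:
        # Explicit inclusions: keep only the keys and the listed properties.
        props = {k: v for k, v in flat.items() if matches(k, keys + included)}
    else:
        # No inclusions (or `*`): keep everything except the exclusions.
        props = {k: v for k, v in flat.items() if not matches(k, excluded)}
    return labels, {k: flat[k] for k in keys}, props


user = {"userId": 1, "name": "Andrea", "surname": "Santurbano",
        "address": {"city": "Venice", "cap": "30100"}}
labels, key, props = project_node("User{!userId, -address}", user)
print(labels, key, props)
```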
INGESTION VIA JSON PROJECTION - RELATIONSHIP PATTERN EXTRACTION
Given:
{"userId": 1, "productId": 100, "price": 10, "currency": "€", "shippingAddress": {"city": "Venice", "cap": "30100"}}
You can transform it into a relationship by specifying one of these patterns:
Pattern | Result
(User{!userId})-[:BOUGHT]->(Product{!productId}) or (User{!userId})-[:BOUGHT{price, currency}]->(Product{!productId}) | (User{userId: 1})-[:BOUGHT{price: 10, currency: '€', `shippingAddress.city`: 'Venice', `shippingAddress.cap`: '30100'}]->(Product{productId: 100})
(User{!userId})-[:BOUGHT{price}]->(Product{!productId}) | (User{userId: 1})-[:BOUGHT{price: 10}]->(Product{productId: 100})
INGESTION VIA CUD FILE FORMAT
It’s a JSON file that represents graph entities (nodes/relationships) and how to manage them in terms of Create/Update/Delete (CUD) operations.
For example, this event:
{
  "op": "merge",
  "properties": {"foo": "value", "key": 1},
  "ids": {"key": 1, "otherKey": "foo"},
  "labels": ["Foo", "Bar"],
  "type": "node",
  "detach": true
}
is ingested with a query like:
UNWIND [..., {"op": "merge", "properties": {"foo": "value", "key": 1}, "ids": {"key": 1, "otherKey": "foo"}, "labels": ["Foo","Bar"], "type": "node", "detach": true}, ...] AS event
MERGE (n:Foo:Bar {key: event.ids.key, otherKey: event.ids.otherKey})
SET n += event.properties
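As a rough illustration of how a CUD event maps to Cypher, the Python sketch below builds the `MERGE` statement for a node merge event. The function name `cud_node_merge_to_cypher` is hypothetical, only the `merge`/`node` combination is covered, and it assumes the merge keys are read from the `ids` field of each event.

```python
def cud_node_merge_to_cypher(event):
    """Build a Cypher statement for a CUD event with op=merge and type=node."""
    if event.get("type") != "node" or event.get("op") != "merge":
        raise ValueError("this sketch only handles node merge events")
    # Labels become :Foo:Bar, ids become the MERGE key map.
    labels = "".join(":" + label for label in event["labels"])
    ids = ", ".join(f"{key}: event.ids.{key}" for key in event["ids"])
    return f"MERGE (n{labels} {{{ids}}}) SET n += event.properties"


event = {
    "op": "merge",
    "properties": {"foo": "value", "key": 1},
    "ids": {"key": 1, "otherKey": "foo"},
    "labels": ["Foo", "Bar"],
    "type": "node",
    "detach": True,
}
statement = cud_node_merge_to_cypher(event)
print(statement)
```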