Designing Cypher
(a graph query language)
Narrated by Tobias Lindaaker, Developer at Neo Technology tobias@neotechnology.com #neo4j,#cypher @thobe
~2001
Once upon a time in the kingdom of Sweden there was a DBMS that had some interesting things going for it…
This presentation has been given at StrangeLoop; the video is online: youtu.be/l-n8yj6_RgU
This is a stand-alone sequel.
~2016
The Precursors to Cypher
Embedded Java API
HTTP API server model
Custom code deployment
(timeline markers: 2001, 2006, 2010)
2011 (July): First release
The Origin of Cypher
(query)--[MODELED_AS]--->(drawing)
   ^                         |
   |                         |
[IMPLEMENTS]          [TRANSLATED_TO]
   |                         |
   |                         v
(code)<-[IN_COMMENT_OF]-(ascii art)
MATCH (query)-[:MODELED_AS]->(drawing),
      (code)-[:IMPLEMENTS]->(query),
      (drawing)-[:TRANSLATED_TO]->(ascii_art),
      (ascii_art)-[:IN_COMMENT_OF]->(code)
WHERE query.id = {query_id}
RETURN code.source
v1: Read Only
START john=node:Person(name="John")
MATCH (john)-[:KNOWS]-(friend)-[:KNOWS]-(foaf)
WHERE NOT (john)-[:KNOWS]-(foaf)
RETURN foaf
July 2011 (neo4j 1.4)
v2: Graph write
(no update of search structures)
START john=node:Person(name="John")
MATCH (john)-[:KNOWS]-(friend)-[:KNOWS]-(foaf)
WHERE NOT (john)-[:KNOWS]-(foaf)
  AND NOT (john)-[:RECOMMENDATION]->(foaf)
CREATE (john)-[:RECOMMENDATION]->(foaf)
RETURN foaf
Oct 2012 (neo4j 1.8)
Neo4j 2.0: labels and proper indexes
Cypher “feature complete”
MATCH (john:Person{name:"John"}),
      (john)-[:KNOWS]-(friend)-[:KNOWS]-(foaf)
WHERE NOT (john)-[:KNOWS]-(foaf)
MERGE (john)-[:RECOMMENDATION]->(foaf)
Dec 2013 (neo4j 2.0)
A brief Cypher overview
Pattern matching:
  MATCH (n), (a)-[:REL]->(b)
+ filtering:
  WHERE n=b AND a.val < n.val
Returning results:
  RETURN x.name AS name ORDER BY x.score LIMIT 10
Creating data:
  CREATE (p:Person{name:"Tobias"})
Updating data:
  SET n.name = "John"
Deleting data:
  REMOVE p.age
  DELETE n
Carrying results from one part of a query into the next:
  MATCH (a)-[:KNOWS]->(b)
  WITH a, avg(b.age) AS frAge
  WHERE frAge > 15
  MATCH (a)<-[:FOLLOWS]-(c)
  RETURN c.name
OPTIONAL MATCH: match-or-null
MERGE: match-or-create, with ON MATCH and ON CREATE to perform updates
Multiple versions
Put the version in the query, and support multiple versions in the Neo4j Server:
CYPHER 2.0 MATCH (n) RETURN n
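A minimal sketch of how a server could route on such a version prefix. It is written in Python for illustration only and is not how the Neo4j Server does it. The `COMPILERS` table, the `DEFAULT_VERSION` value and the `dispatch` function are hypothetical names introduced for this example.

```python
import re

# Hypothetical registry: version string -> stand-in "compiler".
# Each stand-in simply returns the version it was chosen for plus the query body.
COMPILERS = {
    "1.9": lambda body: ("1.9", body),
    "2.0": lambda body: ("2.0", body),
}
DEFAULT_VERSION = "2.0"  # assumption: used when the query carries no prefix

# Optional leading "CYPHER x.y" followed by whitespace.
_PREFIX = re.compile(r"\s*CYPHER\s+(\d+\.\d+)\s+", re.IGNORECASE)


def dispatch(query: str):
    """Strip an optional 'CYPHER x.y' prefix and route to the matching compiler."""
    m = _PREFIX.match(query)
    version = m.group(1) if m else DEFAULT_VERSION
    body = query[m.end():] if m else query.strip()
    if version not in COMPILERS:
        raise ValueError(f"unsupported Cypher version: {version}")
    return COMPILERS[version](body)
```

With this table, `dispatch("CYPHER 1.9 START n=node(0) RETURN n")` selects the 1.9 stand-in, and a query without a prefix falls back to the assumed default.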
Improve based on user feedback
… good, but not as good as we hoped…
photo credits: Columbia Journalism Review
Query caching - amortise cost of
Whole program analysis
MATCH (john:Person{name:"John"}),
      (john)-[:KNOWS]-(friend)-[:KNOWS]-(foaf)
RETURN friend.name
Recent additions: Procedures
CALL dbms.listProcedures() YIELD name, signature, description

MATCH (me:Person{name:{myName}}),
      (me)-[:KNOWS]-()-[:KNOWS]-(foaf)
WHERE NOT (me)-[:KNOWS]-(foaf)
CALL apoc.load.jdbc('mysql:…', "SELECT * FROM people WHERE id = " + foaf.id) YIELD row
RETURN foaf.name, row.address
It isn’t all roses
Semantically different things look similar
MATCH (n:Measurement) RETURN abs(n.value)

MATCH (n:Measurement) RETURN avg(n.value)

MATCH (a:Foo{key:{a}}), (b:Foo{key:{b}}),
      p=(a)-[:BAR*1..]-(b)
WHERE all(n in nodes(p) WHERE n.value > {m})
RETURN length(p)
Constructs with intricate semantics
Deprecated:
MATCH (a:Foo{key:{a}}), (b:Foo{key:{b}})
CREATE UNIQUE (a)-[:KNOWS]-(u)-[:KNOWS]-(b)
Better:
MATCH (a:Foo{key:{a}}), (b:Foo{key:{b}})
MERGE (a)-[:KNOWS]-(u)-[:KNOWS]-(b)
Constructs with intricate semantics
On a graph with two nodes, (key:x)-->(key:y):

MATCH (a)-->(b)<--(c)
RETURN a.key,b.key,c.key
Result: <no rows>

vs

MATCH (a)-->(b)
MATCH (b)<--(c)
RETURN a.key,b.key,c.key
Result: a.key:'x', b.key:'y', c.key:'x'

The single-MATCH form behaves like:
MATCH (a)-[x]->(b)
MATCH (b)<-[y]-(c)
WHERE x <> y
RETURN a.key,b.key,c.key
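The difference between the two forms can be simulated outside the database. The sketch below is in Python and is not Cypher's implementation: it assumes the toy graph from the slide (one relationship from the node with key x to the node with key y) and models only the rule that relationships within one MATCH pattern must be distinct. `RELS`, `single_match` and `two_matches` are names invented for this illustration.

```python
from itertools import product

# Toy graph assumed from the slide: a single relationship x --> y.
# Each relationship is (rel_id, start_key, end_key).
RELS = [(0, "x", "y")]


def single_match(rels):
    """MATCH (a)-->(b)<--(c): one pattern, so the two hops must be different relationships."""
    return [(r1[1], r1[2], r2[1])
            for r1, r2 in product(rels, repeat=2)
            if r1[2] == r2[2] and r1[0] != r2[0]]


def two_matches(rels):
    """MATCH (a)-->(b) MATCH (b)<--(c): separate clauses, so a relationship may be reused."""
    return [(r1[1], r1[2], r2[1])
            for r1, r2 in product(rels, repeat=2)
            if r1[2] == r2[2]]


print(single_match(RELS))  # []
print(two_matches(RELS))   # [('x', 'y', 'x')]
```

Adding a second relationship into y, for example `(1, "z", "y")`, gives the single-pattern form rows as well, because two distinct relationships are then available.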
“Syntactic sugar” vs single canonical syntax
MATCH (n) WHERE n.foo = "bar"
vs
MATCH (n{foo: "bar"})

MATCH (n WHERE foo < 10)
vs
MATCH (n) WHERE n.foo < 10
Predicates on variable length paths
MATCH (a)-[r*]->(b)
WHERE all(x IN r WHERE x.weight > 0)

would be simpler if it could be written as:

MATCH (a)-[r* WHERE weight > 0]->(b)

although there are other problems that would come from that…
LOAD CSV will be replaced
LOAD CSV WITH HEADERS FROM "some.csv" AS line
becomes
CALL apoc.load.csv("some.csv") YIELD map AS line
Parameters avoid SQL injection
Labels and Relationship Types cannot be passed as parameters
MATCH (n:{label})
SET n:{label}
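Since a label cannot be a parameter, client code that needs a dynamic label has to put it into the query text itself. One hedged way to do that without reopening the injection hole is to validate the label first and keep every value as a parameter. The sketch below is in Python. The identifier rule and the `match_by_label` helper are assumptions made for this example and are not part of any Neo4j driver.

```python
import re

# Assumed rule for this sketch: accept only simple identifiers as labels.
_LABEL = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def match_by_label(label: str, name: str):
    """Return (query_text, parameters) with the validated label spliced into the text."""
    if not _LABEL.fullmatch(label):
        raise ValueError(f"illegal label: {label!r}")
    # The label becomes part of the query text; the property value stays a parameter.
    query = "MATCH (n:" + label + "{name:{name}}) RETURN n"
    return query, {"name": name}
```

A call such as `match_by_label("Person", "John")` yields a query that still uses the `{name}` parameter, while a label containing anything beyond an identifier is rejected before the query text is built.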
Opening the language design process
Implementations for other platforms
Compatibility test suite
Grammar specification
Reference implementation
Defining the next version of Cypher
https://github.com/openCypher/openCypher
Future features
Cypher keeps evolving.
You can get involved through openCypher.
MATCH (me:Person{name:{myName}}),
      (me)-[:FRIEND]-(friend)
WITH me, collect(friend) AS friends
MATCH (me)-[:ENEMY]-(enemy)
RETURN me, friends, collect(enemy) AS enemies
MATCH (me:Person{name:{myName}}),
      (me)-[:FRIEND]-(friend)
WITH me, collect(friend) AS friends
MATCH (me)-[:ENEMY]-(enemy)
DO {
  UNWIND friends AS friend
  MERGE (friend)-[:ENEMY]-(enemy)
}

DO will replace FOREACH
MATCH (actor:Actor)
WHERE EXISTS {
  (actor)-[:ACTED_IN]->(movie), (other:Actor)-[:ACTED_IN]->(movie)
  WHERE other.name = actor.name AND actor <> other
}
RETURN actor
MATCH (me:User{name:{username}})-[:FOLLOWS]->(user)
WHERE user.country = {country}
MATCH {
  // authored tweets
  MATCH (user)<-[:AUTHORED]-(tweet:Tweet)
  RETURN tweet, tweet.time AS time
  UNION
  // favorited tweets
  MATCH (user)<-[:HAS_FAVOURITE]-(favourite)-[:TARGETS]->(tweet:Tweet)
  RETURN tweet, favourite.time AS time
}
RETURN DISTINCT tweet ORDER BY time DESC LIMIT 100
MATCH (person:Person{ssn:{mySSN}}),
      (person)-[emp:EMPLOYED_BY]->(employer)
WHERE NOT exists(emp.endDate)
WITH person, employer.name AS employer
ORDER BY emp.startDate DESC LIMIT 1
RETURN person{
  .ssn, .firstName, .lastName, employer,
  friends: [
    MATCH (person)-[:FRIEND]-(friend)
    WHERE friend.age > 12
    RETURN friend{ .ssn, .firstName, .lastName }
  ]}

Inspired by Facebook’s GraphQL

Returns a single column 'person' containing:
{
  ssn: "192168-0001",
  firstName: "John",
  lastName: "Smith",
  employer: "The Company, Inc.",
  friends: [
    {ssn: "009933-1126", firstName: "Marty", lastName: "McFly"},
    {ssn: "123987-4506", firstName: "Emmet", lastName: "Brown"},
  ]
}