Improve Query Performance with the Query Log Analyzer Kees Vegter Field Engineer kees@neo4j.com
Query Log Analyzer
Improve Query Performance with the Query Log Analyzer Kees Vegter - - PowerPoint PPT Presentation
Improve Query Performance with the Query Log Analyzer Kees Vegter Field Engineer Query Log Analyzer kees@neo4j.com Query Log dbms.logs.query.enabled=true # If the execution of query takes more time than this threshold, # the query is
Query Log Analyzer
2
dbms.logs.query.enabled=true # If the execution of query takes more time than this threshold, # the query is logged. If set to zero then all queries dbms.logs.query.threshold=100ms dbms.logs.query.parameter_logging_enabled=true dbms.logs.query.time_logging_enabled=true dbms.logs.query.allocation_logging_enabled=true dbms.logs.query.page_logging_enabled=true dbms.track_query_cpu_time=true dbms.track_query_allocation=true
3
Query Analysis
4
Query Log: Filter
5
Query Log: Highlight
6
Query Timeline
7
Cypher Planning
Cypher Execution
Physical Execution Plan
Query String Parse Logical Plan Physical Execution Plan Execute Physical Plan in Cypher Runtime Query Plan Cache Query String Execute Physical Plan in Cypher Runtime
uses db-statistics
Use query parameters! Use repeatable statements!
8
Cypher Execution Cypher Planning Query Load
MATCH p=(ah:AccountHolder {fullName :$accountName })
WITH p, [x in nodes(p) WHERE x:BankAccount] AS mts UNWIND mts AS mt MATCH p2=(mt)-[:FROM]->()-[:IN_COUNTRY]->() RETURN p, p2 SKIP 0 LIMIT 1000
Query Log Analyzer
9
Cypher Planning
○ Check the tool header ○ Check for parameter usage in your queries
1775 queries analysed, 302 distinct queries found. 1775 queries analysed, 1775 distinct queries found.
MATCH (ah:AccountHolder) WHERE ah.fullName = $fullName ... RETURN ah MATCH (ah:AccountHolder) WHERE ah.fullName = "John Smith" ... RETURN ah
Cypher Execution
10
24 % : read from Cache 76 % : read from Disk
11
Query Load
12
Query Tuning
13
Use Explain and Profile
Things to check:
14
Avoid Cartesian Products
… OPTIONAL MATCH OPTIONAL MATCH OPTIONAL MATCH ... MATCH (a), (b), (c) RETURN a, b, c … UNWIND arrA as a UNWIND arrB as b UNWIND arrC as c ... Use WITH and COLLECT and DISTINCT to reduce the intermediate results Use Pattern Comprehension when applicable:
MATCH (a) RETURN { a:a, blist : [ (a)-->(b) | {b:b, clist : [(b)-->(c) | c ]], dlist : [ (a)-->(d) | {d:d, elist : [(d)-->(e) | e ]], flist : [ (a)-->(f) | f] }
15
Reduce the query working set as soon as possible
point in the query?
in the query?
query to reduce the amount of rows to be processed?
16
Query Execution
Query Tuning
MERGE (author1:Author {id: 1}) MERGE (author2:Author {id: 2}) ... MERGE (book1:Book {title: "title 1"}) MERGE (book2:Book {title: "title-2"}) ... MERGE (author1)-[:WROTE]->(book1) MERGE (author2)-[:WROTE]->(book2) ... MERGE (author:Author {id: $authorId }) MERGE (book:Book {title: $bookTitle }) MERGE (author)-[:WROTE]->(book)
17
Query Execution
Query Tuning
UNWIND $inputList as row MERGE (author:Author {id: row.authorId }) MERGE (book:Book {title: row.bookTitle }) MERGE (author)-[:WROTE]->(book) FOR EVERY 100 ENTRIES IN LIST WITH AUTHORS AND BOOKS FIRE A STATEMENT TO NEO4J
{ inputList : [ { authorId : 1, bookTitle : "title1" } , { authorId : 2, bookTitle : "title2" } ,...] }
MERGE (author:Author {id: $authorId }) MERGE (book:Book {title: $bookTitle }) MERGE (author)-[:WROTE]->(book) FOR EVERY ENTRY IN LIST WITH AUTHORS AND BOOKS FIRE A STATEMENT TO NEO4J
{ authorId : 1, bookTitle : "title1" }
18
Query Execution
Query Tuning
CALL apoc.periodic.iterate( 'CALL apoc.load.jdbc("mydb","SELECT authorId, bookTitle FROM AuthorBooks") YIELD row RETURN row' ,'MERGE (author:Author {id: row.authorId }) MERGE (book:Book {title: row.bookTitle }) MERGE (author)-[:WROTE]->(book)' ,{batchSize : 100, iterateList: true } )
19
use the following tool to do a quick analysis of the query.log file:
https://neo4j.com/developer/kb/an-approach-to-parsing-the-query-log/
This tool wil list the top 10 most expensive queries based upon planning, cpu and waiting time.
20
Still under development
21
preview, still under development
22
preview, still under development
23
preview, still under development
24
preview, still under development
25
Introducing the Query Log Analyzer
https://medium.com/neo4j/meet-the-query-log-analyzer-30b3eb4b1d6
Cypher Query Optimisations
https://medium.com/neo4j/cypher-query-optimisations-fe0539ce2e5c
Script to get the top 10 most expensive queries from the command line
https://neo4j.com/developer/kb/an-approach-to-parsing-the-query-log/
1. Easy: What does Avg Waiting stand for?
a. Waiting to execute query b. Waiting to execute query + waiting for locks c. Waiting for locks
2. Medium: What is the correct order of steps in The Cypher Query Processing
a. Query Text > Logical Plan > Parse > Physical Execution Plan > Execute Physical Plan in Cypher Runtime b. Query Text > Parse > Logical Plan > Physical Execution Plan > Execute Physical Plan in Cypher Runtime c. Cache > Physical Execution Plan > Execute Physical Plan in Cypher Runtime
3. Hard: What is the name of config parameter in apoc.periodic.iterate to make batch updates possible? Answer here: r.neo4j.com/hunger-games
27
install
https://install.graphapp.io/