Improve Query Performance with the Query Log Analyzer

Kees Vegter, Field Engineer
kees@neo4j.com

Query Log

dbms.logs.query.enabled=true
# If the execution of a query takes more time than this threshold,
# the query is logged. If set to zero then all queries are logged.
dbms.logs.query.threshold=100ms
dbms.logs.query.parameter_logging_enabled=true
dbms.logs.query.time_logging_enabled=true
dbms.logs.query.allocation_logging_enabled=true
dbms.logs.query.page_logging_enabled=true
dbms.track_query_cpu_time=true
dbms.track_query_allocation=true

Query Log Analyzer: Query Analysis

Query Log Analyzer: Query Log, Filter

Query Log Analyzer: Query Log, Highlight

Query Log Analyzer: Query Timeline

Cypher Query Processing

Cypher Planning (uses db-statistics):
  Query String > Parse > Logical Plan > Physical Execution Plan
Cypher Execution:
  Execute Physical Plan in Cypher Runtime
With a cached plan:
  Query String > Plan Cache > Execute Physical Plan in Cypher Runtime

Use query parameters!
Use repeatable statements!

Cypher Planning, Cypher Execution, Query Load

MATCH p=(ah:AccountHolder {fullName: $accountName})
        -[:HAS_BANKACCOUNT]->(ba)-[:SEND*2..16]->()
WITH p, [x in nodes(p) WHERE x:BankAccount] AS mts
UNWIND mts AS mt
MATCH p2=(mt)-[:FROM]->()-[:IN_COUNTRY]->()
RETURN p, p2 SKIP 0 LIMIT 1000

Cypher Planning

● Parameter Usage
  ○ Check the tool header:
      1775 queries analysed, 302 distinct queries found.
      1775 queries analysed, 1775 distinct queries found.
  ○ Check for parameter usage in your queries:
      MATCH (ah:AccountHolder) WHERE ah.fullName = "John Smith" ... RETURN ah
      MATCH (ah:AccountHolder) WHERE ah.fullName = $fullName ... RETURN ah
● Planning time
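The "queries analysed / distinct queries found" header can be illustrated with a small sketch. This is not the tool's implementation; it only assumes that two queries count as the same when their text is identical. The helper name count_distinct and the sample names are hypothetical.

```python
def count_distinct(queries):
    """Return (total, distinct) for a list of query strings."""
    return len(queries), len(set(queries))


names = ["John Smith", "Jane Doe", "Ann Lee"]

# Literal values: every statement text is different.
literal_queries = [
    f'MATCH (ah:AccountHolder) WHERE ah.fullName = "{name}" RETURN ah'
    for name in names
]

# Parameters: the statement text stays constant, only the parameter map varies.
parameter_queries = [
    "MATCH (ah:AccountHolder) WHERE ah.fullName = $fullName RETURN ah"
    for _ in names
]

print(count_distinct(literal_queries))    # (3, 3)
print(count_distinct(parameter_queries))  # (3, 1)
```

When the distinct count equals the total count, as in the first case, it is a hint that literal values are embedded in the query text.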

Cypher Execution

● Page Cache (data cache)
    24 % : read from Cache
    76 % : read from Disk
● Waiting for Locks
● Memory Footprint
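A minimal sketch of how a "read from Cache" percentage could be derived, assuming the log reports page hits (served from the page cache) and page faults (read from disk) per query. The function name is hypothetical and the formula is an assumption about how such a figure is computed, not the tool's confirmed method.

```python
def page_cache_hit_percentage(page_hits, page_faults):
    """Percentage of page reads served from the page cache.

    Returns None when the query touched no pages at all.
    """
    total = page_hits + page_faults
    if total == 0:
        return None
    return 100.0 * page_hits / total


# 24 hits and 76 faults correspond to the 24 % / 76 % split on the slide.
print(page_cache_hit_percentage(24, 76))  # 24.0
```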

Query Load

● Locking
● Concurrent Load
● Big Result Sets

Query Tuning Tips

Query Tuning: Use Explain and Profile

Things to check:
● Index usage
● Eager
● NodeByLabelScan
● AllNodesScan

Query Tuning: Avoid Cartesian Products

MATCH (a), (b), (c) ... RETURN a, b, c

OPTIONAL MATCH ...
OPTIONAL MATCH ...
OPTIONAL MATCH ...

UNWIND arrA as a
UNWIND arrB as b
UNWIND arrC as c
...

Use WITH and COLLECT and DISTINCT to reduce the intermediate results.

Use Pattern Comprehension when applicable:

MATCH (a)
RETURN { a: a,
         blist: [ (a)-->(b) | {b: b, clist: [ (b)-->(c) | c ]} ],
         dlist: [ (a)-->(d) | {d: d, elist: [ (d)-->(e) | e ]} ],
         flist: [ (a)-->(f) | f ] }
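The row multiplication behind a Cartesian product can be sketched with plain arithmetic. The list sizes below are made up for illustration, and the function name is hypothetical; the sketch only models how consecutive UNWIND clauses multiply the number of intermediate rows.

```python
from itertools import product
from math import prod


def cartesian_row_count(*lists):
    """Number of rows produced when each list is unwound one after another."""
    return prod(len(values) for values in lists)


arr_a = list(range(10))
arr_b = list(range(10))
arr_c = list(range(10))

# Unwinding three lists in sequence yields every combination of their elements.
rows = list(product(arr_a, arr_b, arr_c))
print(len(rows))                                  # 1000
print(cartesian_row_count(arr_a, arr_b, arr_c))   # 1000
```

Keeping the three lists collected on a single row instead, as the pattern comprehension example does, holds the same 30 values without creating 1000 intermediate rows.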

Query Tuning: Reduce the query working set as soon as possible

● Can I move a DISTINCT to an earlier point in the query?
● Can I move a LIMIT to an earlier point in the query?
● Can I use COLLECT in places in the query to reduce the number of rows to be processed?

Query Tuning: Query Execution

● Try to send 'repeatable' statements

Not repeatable (literal values):

MERGE (author1:Author {id: 1})
MERGE (author2:Author {id: 2})
...
MERGE (book1:Book {title: "title 1"})
MERGE (book2:Book {title: "title-2"})
...
MERGE (author1)-[:WROTE]->(book1)
MERGE (author2)-[:WROTE]->(book2)
...

Repeatable (parameters):

MERGE (author:Author {id: $authorId})
MERGE (book:Book {title: $bookTitle})
MERGE (author)-[:WROTE]->(book)

Query Tuning: Query Execution

● Reduce the number of statements you send to Neo4j by using 'batch' statements

FOR EVERY ENTRY IN LIST WITH AUTHORS AND BOOKS FIRE A STATEMENT TO NEO4J:

MERGE (author:Author {id: $authorId})
MERGE (book:Book {title: $bookTitle})
MERGE (author)-[:WROTE]->(book)

{ authorId: 1, bookTitle: "title1" }

FOR EVERY 100 ENTRIES IN LIST WITH AUTHORS AND BOOKS FIRE A STATEMENT TO NEO4J:

UNWIND $inputList as row
MERGE (author:Author {id: row.authorId})
MERGE (book:Book {title: row.bookTitle})
MERGE (author)-[:WROTE]->(book)

{ inputList: [ { authorId: 1, bookTitle: "title1" },
               { authorId: 2, bookTitle: "title2" }, ... ] }
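The client side of the batch approach could look roughly like the sketch below. It only builds the parameter maps for the UNWIND statement; sending them to Neo4j with a driver is deliberately left out. The helper name make_batches and the generated sample entries are hypothetical.

```python
BATCH_STATEMENT = """
UNWIND $inputList AS row
MERGE (author:Author {id: row.authorId})
MERGE (book:Book {title: row.bookTitle})
MERGE (author)-[:WROTE]->(book)
"""


def make_batches(entries, batch_size=100):
    """Split entries into parameter maps of at most batch_size rows each."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    return [
        {"inputList": entries[start:start + batch_size]}
        for start in range(0, len(entries), batch_size)
    ]


# Illustrative input: 250 author/book pairs.
entries = [{"authorId": i, "bookTitle": f"title{i}"} for i in range(1, 251)]
batches = make_batches(entries)

# 250 entries become 3 statements instead of 250.
print(len(batches))                               # 3
print([len(b["inputList"]) for b in batches])     # [100, 100, 50]
```

Each element of batches would be passed as the parameter map for one execution of BATCH_STATEMENT, so the statement text stays repeatable while the data varies.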

Query Tuning: Query Execution

● Use apoc.periodic.iterate with the config parameter iterateList: true!

CALL apoc.periodic.iterate(
  'CALL apoc.load.jdbc("mydb","SELECT authorId, bookTitle FROM AuthorBooks") YIELD row RETURN row',
  'MERGE (author:Author {id: row.authorId})
   MERGE (book:Book {title: row.bookTitle})
   MERGE (author)-[:WROTE]->(book)',
  {batchSize: 100, iterateList: true}
)

● Kettle also uses this 'batch' approach

Tool Usage

● The Query Log Analyzer is meant to be used during development and testing!
● When you have only a command prompt available on a Neo4j server, you can also use the following tool to do a quick analysis of the query.log file:
  https://neo4j.com/developer/kb/an-approach-to-parsing-the-query-log/
  This tool will list the top 10 most expensive queries based upon planning, CPU and waiting time.

Next Version (still under development)

● Supports Neo4j version 4 (multi db)
● List Current queries
● List Query Stats (version 3.5.4 and higher)
● Explain Plan

Multi db support (preview, still under development)

Current Queries (preview, still under development)

Queries Stats (preview, still under development)

Explain Plan (preview, still under development)

Useful links

Introducing the Query Log Analyzer
https://medium.com/neo4j/meet-the-query-log-analyzer-30b3eb4b1d6

Cypher Query Optimisations
https://medium.com/neo4j/cypher-query-optimisations-fe0539ce2e5c

Script to get the top 10 most expensive queries from the command line
https://neo4j.com/developer/kb/an-approach-to-parsing-the-query-log/

Q & A

Hunger Games Questions for "Improve Query Performance with Query Log Analyzer"

1. Easy: What does Avg Waiting stand for?
   a. Waiting to execute query
   b. Waiting to execute query + waiting for locks
   c. Waiting for locks

2. Medium: What is the correct order of steps in the Cypher Query Processing?
   a. Query Text > Logical Plan > Parse > Physical Execution Plan > Execute Physical Plan in Cypher Runtime
   b. Query Text > Parse > Logical Plan > Physical Execution Plan > Execute Physical Plan in Cypher Runtime
   c. Cache > Physical Execution Plan > Execute Physical Plan in Cypher Runtime

3. Hard: What is the name of the config parameter in apoc.periodic.iterate that makes batch updates possible?

Answer here: r.neo4j.com/hunger-games

Query Log Analyzer install
https://install.graphapp.io/
