Improve Query Performance with the Query Log Analyzer Kees Vegter - - PowerPoint PPT Presentation

improve query performance with the query log analyzer
SMART_READER_LITE
LIVE PREVIEW

Improve Query Performance with the Query Log Analyzer Kees Vegter - - PowerPoint PPT Presentation

Improve Query Performance with the Query Log Analyzer Kees Vegter Field Engineer Query Log Analyzer kees@neo4j.com Query Log dbms.logs.query.enabled=true # If the execution of query takes more time than this threshold, # the query is


slide-1
SLIDE 1

Improve Query Performance with the Query Log Analyzer Kees Vegter Field Engineer kees@neo4j.com

Query Log Analyzer

slide-2
SLIDE 2

2

Query Log

dbms.logs.query.enabled=true # If the execution of query takes more time than this threshold, # the query is logged. If set to zero then all queries dbms.logs.query.threshold=100ms dbms.logs.query.parameter_logging_enabled=true dbms.logs.query.time_logging_enabled=true dbms.logs.query.allocation_logging_enabled=true dbms.logs.query.page_logging_enabled=true dbms.track_query_cpu_time=true dbms.track_query_allocation=true

slide-3
SLIDE 3

3

Query Log Analyzer

Query Analysis

slide-4
SLIDE 4

4

Query Log Analyzer

Query Log: Filter

slide-5
SLIDE 5

5

Query Log Analyzer

Query Log: Highlight

slide-6
SLIDE 6

6

Query Log Analyzer

Query Timeline

slide-7
SLIDE 7

7

Cypher Query Processing

Cypher Planning

Cypher Execution

Physical Execution Plan

Query String Parse Logical Plan Physical Execution Plan Execute Physical Plan in Cypher Runtime Query Plan Cache Query String Execute Physical Plan in Cypher Runtime

uses db-statistics

Use query parameters! Use repeatable statements!

slide-8
SLIDE 8

8

Cypher Execution Cypher Planning Query Load

MATCH p=(ah:AccountHolder {fullName :$accountName })

  • [:HAS_BANKACCOUNT]->(ba)-[:SEND*2..16]->()

WITH p, [x in nodes(p) WHERE x:BankAccount] AS mts UNWIND mts AS mt MATCH p2=(mt)-[:FROM]->()-[:IN_COUNTRY]->() RETURN p, p2 SKIP 0 LIMIT 1000

Query Log Analyzer

slide-9
SLIDE 9

9

Cypher Planning

Cypher Planning

  • Parameter Usage

○ Check the tool header ○ Check for parameter usage in your queries

  • Planning time

1775 queries analysed, 302 distinct queries found. 1775 queries analysed, 1775 distinct queries found.

MATCH (ah:AccountHolder) WHERE ah.fullName = $fullName ... RETURN ah MATCH (ah:AccountHolder) WHERE ah.fullName = "John Smith" ... RETURN ah

slide-10
SLIDE 10

Cypher Execution

  • Page Cache (data cache)
  • Waiting for Locks
  • Memory Footprint

10

Cypher Execution

24 % : read from Cache 76 % : read from Disk

slide-11
SLIDE 11
  • Locking
  • Concurrent Load
  • Big Result Sets

11

Query Load

Query Load

slide-12
SLIDE 12

Query Tuning Tips

12

Query Tuning

slide-13
SLIDE 13

13

Query Tuning

Use Explain and Profile

Things to check:

  • Index usage
  • Eager
  • NodeByLabelScan
  • AllNodesScan
slide-14
SLIDE 14

14

Query Tuning

Avoid Cartesian Products

… OPTIONAL MATCH OPTIONAL MATCH OPTIONAL MATCH ... MATCH (a), (b), (c) RETURN a, b, c … UNWIND arrA as a UNWIND arrB as b UNWIND arrC as c ... Use WITH and COLLECT and DISTINCT to reduce the intermediate results Use Pattern Comprehension when applicable:

MATCH (a) RETURN { a:a, blist : [ (a)-->(b) | {b:b, clist : [(b)-->(c) | c ]], dlist : [ (a)-->(d) | {d:d, elist : [(d)-->(e) | e ]], flist : [ (a)-->(f) | f] }

slide-15
SLIDE 15

15

Query Tuning

Reduce the query working set as soon as possible

  • Can I move a DISTINCT to an earlier

point in the query?

  • Can I move a LIMIT to an earlier point

in the query?

  • Can I use COLLECT on places in the

query to reduce the amount of rows to be processed?

slide-16
SLIDE 16

16

Query Tuning

Query Execution

Query Tuning

  • Try to send ‘repeatable’ statements

MERGE (author1:Author {id: 1}) MERGE (author2:Author {id: 2}) ... MERGE (book1:Book {title: "title 1"}) MERGE (book2:Book {title: "title-2"}) ... MERGE (author1)-[:WROTE]->(book1) MERGE (author2)-[:WROTE]->(book2) ... MERGE (author:Author {id: $authorId }) MERGE (book:Book {title: $bookTitle }) MERGE (author)-[:WROTE]->(book)

slide-17
SLIDE 17

17

Query Tuning

Query Execution

Query Tuning

  • Reduce the amount of statements you send to Neo4j by using 'batch' statements

UNWIND $inputList as row MERGE (author:Author {id: row.authorId }) MERGE (book:Book {title: row.bookTitle }) MERGE (author)-[:WROTE]->(book) FOR EVERY 100 ENTRIES IN LIST WITH AUTHORS AND BOOKS FIRE A STATEMENT TO NEO4J

{ inputList : [ { authorId : 1, bookTitle : "title1" } , { authorId : 2, bookTitle : "title2" } ,...] }

MERGE (author:Author {id: $authorId }) MERGE (book:Book {title: $bookTitle }) MERGE (author)-[:WROTE]->(book) FOR EVERY ENTRY IN LIST WITH AUTHORS AND BOOKS FIRE A STATEMENT TO NEO4J

{ authorId : 1, bookTitle : "title1" }

slide-18
SLIDE 18

18

Query Tuning

Query Execution

Query Tuning

  • Use apoc.periodic.iterate with the config parameter iterateList : true !

CALL apoc.periodic.iterate( 'CALL apoc.load.jdbc("mydb","SELECT authorId, bookTitle FROM AuthorBooks") YIELD row RETURN row' ,'MERGE (author:Author {id: row.authorId }) MERGE (book:Book {title: row.bookTitle }) MERGE (author)-[:WROTE]->(book)' ,{batchSize : 100, iterateList: true } )

  • kettle also uses this 'batch' approach
slide-19
SLIDE 19

19

Tool Usage

  • The Query Log Analyzer is meant to be used during development and testing!
  • When you have only a command prompt available on a neo4j server you can also

use the following tool to do a quick analysis of the query.log file:

https://neo4j.com/developer/kb/an-approach-to-parsing-the-query-log/

This tool wil list the top 10 most expensive queries based upon planning, cpu and waiting time.

slide-20
SLIDE 20

20

Next Version

  • Supports Neo4j version 4 (multi db)
  • List Current queries
  • List Query Stats (version 3.5.4 and higher)
  • Explain Plan

Still under development

slide-21
SLIDE 21

21

Multi db support

preview, still under development

slide-22
SLIDE 22

22

Current Queries

preview, still under development

slide-23
SLIDE 23

23

Queries Stats

preview, still under development

slide-24
SLIDE 24

24

Explain Plan

preview, still under development

slide-25
SLIDE 25

Useful links

25

Introducing the Query Log Analyzer

https://medium.com/neo4j/meet-the-query-log-analyzer-30b3eb4b1d6

Cypher Query Optimisations

https://medium.com/neo4j/cypher-query-optimisations-fe0539ce2e5c

Script to get the top 10 most expensive queries from the command line

https://neo4j.com/developer/kb/an-approach-to-parsing-the-query-log/

slide-26
SLIDE 26

Hunger Games Questions for "Improve Query Performance with Query Log Analyzer"

1. Easy: What does Avg Waiting stand for?

a. Waiting to execute query b. Waiting to execute query + waiting for locks c. Waiting for locks

2. Medium: What is the correct order of steps in The Cypher Query Processing

a. Query Text > Logical Plan > Parse > Physical Execution Plan > Execute Physical Plan in Cypher Runtime b. Query Text > Parse > Logical Plan > Physical Execution Plan > Execute Physical Plan in Cypher Runtime c. Cache > Physical Execution Plan > Execute Physical Plan in Cypher Runtime

3. Hard: What is the name of config parameter in apoc.periodic.iterate to make batch updates possible? Answer here: r.neo4j.com/hunger-games

Q & A

slide-27
SLIDE 27

27

Query Log Analyzer

install

https://install.graphapp.io/