The Dangers and Complexities of SQLite Benchmarking Dhathri - - PowerPoint PPT Presentation

the dangers and complexities of sqlite benchmarking
SMART_READER_LITE
LIVE PREVIEW

The Dangers and Complexities of SQLite Benchmarking Dhathri - - PowerPoint PPT Presentation

The Dangers and Complexities of SQLite Benchmarking Dhathri Purohith, Jayashree Mohan and Vijay Chidambaram 2 3 Benchmarking SQLite is Non-trivial ! Benchmarking complex systems in a repeatable fashion is error prone The main issues


slide-1
SLIDE 1

The Dangers and Complexities of SQLite Benchmarking

Dhathri Purohith, Jayashree Mohan and Vijay Chidambaram

slide-2
SLIDE 2

2

slide-3
SLIDE 3

Benchmarking SQLite is Non-trivial !

  • Benchmarking complex systems in a repeatable fashion is

error prone

  • The main issues with benchmarking :

○ Inconsistency in the industrial benchmarking tools ○ Incorrect reporting of benchmarking results

3

slide-4
SLIDE 4
  • Benchmarking SQLite is hard
  • Depends on several configuration parameters
  • Current tools provide conflicting results(3X) for the

same set of parameters

  • Easy to show conflicting results by tuning

parameters

  • Right configuration can provide massive

performance gains(28X)

4

slide-5
SLIDE 5

Outline

  • Overview of SQLite
  • Motivation
  • Existing tools to benchmark SQLite
  • Parameters affecting performance of SQLite
  • Conclusion

5

slide-6
SLIDE 6

SQLite

  • Lightweight, embedded, relational database popular in mobile

systems

  • Commonly used benchmark in many mobile applications to store their

data ○

E.g. Twitter and Facebook

  • Used as a benchmark for evaluating several systems

○ E.g. I/O scheduling frameworks (Yang et.al., SOSP ‘15), the Linux read-ahead mechanism (Olivier et.al., SIGBED ‘15)

Benchmarking SQLite is an important part of evaluating these systems.

6

slide-7
SLIDE 7

SQLite architecture

User Space Application Cache Disk DB

7

slide-8
SLIDE 8

SQLite architecture

Cache Disk DB

8

User Space Application

slide-9
SLIDE 9

SQLite architecture

Cache Disk DB Journal

9

User Space Application

slide-10
SLIDE 10

SQLite architecture

fsync() Journal Disk DB Cache

10

Journal fsync() User Space Application

slide-11
SLIDE 11

SQLite architecture

Journal Disk DB Cache

11

Journal User Space Application

slide-12
SLIDE 12

SQLite architecture

Journal Disk DB Cache

12

Journal User Space Application

slide-13
SLIDE 13

SQLite architecture

Cache Journal Disk DB

13

Journal User Space Application

slide-14
SLIDE 14

SQLite architecture

Cache Journal Disk DB

14

Journal User Space Application

slide-15
SLIDE 15

Outline

  • Overview of SQLite
  • Motivation
  • Existing tools to benchmark SQLite
  • Parameters affecting performance of SQLite
  • Conclusion

15

slide-16
SLIDE 16

Motivation : A Case Study of SQLite

Benchmarking SQLite is tricky - It’s performance varies greatly based on configuration parameters.

➢ Default: Delete journal mode , FULL synchronization mode on Ext4 in Android. ➢ Workload: 1 trial = 30K transactions (10 K inserts, followed by updates and deletes of 10K )

16

slide-17
SLIDE 17

Motivation : A Case Study of SQLite

Benchmarking SQLite is tricky - It’s performance varies greatly based on configuration parameters.

➢ Custom: WAL journal mode with 1MB journal size and NORMAL synchronization mode on F2FS ➢ Default: Delete journal mode , FULL synchronization mode on Ext4 in Android. ➢ Workload: 1 trial = 30K transactions (10 K inserts, followed by updates and deletes of 10K )

17

slide-18
SLIDE 18

Motivation : A Case Study of SQLite

Benchmarking SQLite is tricky - It’s performance varies greatly based on configuration parameters.

➢ Default: Delete journal mode , FULL synchronization mode on Ext4 in Android. ➢ Workload: 1 trial = 30K transactions (10 K inserts, followed by updates and deletes of 10K ) ➢ Custom: WAL journal mode with 1MB journal size and NORMAL synchronization mode on F2FS

28X

18

slide-19
SLIDE 19

Are we reporting it right?

19

slide-20
SLIDE 20

Incomplete specification of benchmarking results

  • 16 papers from the past couple of years, used SQLite to evaluate

performance.

No parameters No sync Mode No WAL Size

10 1 5

NONE of them reported all the parameters required to meaningfully compare results.

20

slide-21
SLIDE 21

Outline

  • Overview of SQLite
  • Motivation
  • Existing tools to benchmark SQLite
  • Parameters affecting performance of SQLite
  • Conclusion

21

slide-22
SLIDE 22

Inconsistency in existing benchmarking tools

  • Results between the tools differ by 50% in their default setting

Tool

Default TPS Custom TPS Papers that use MobiBench RL Bench AndroBench 20 30 29 57

  • 150

7 4 3

  • Differ by 3X when a single parameter is changed.

Misleading and meaningless to compare, if parameters are not reported!

22

slide-23
SLIDE 23

Outline

  • Overview of SQLite
  • Motivation
  • Existing tools to benchmark SQLite
  • Parameters affecting performance of SQLite
  • Conclusion

23

slide-24
SLIDE 24

Parameters affecting SQLite Performance

  • 1. Filesystem
  • 2. Journaling Mode
  • 3. Pre-population of database
  • 4. Synchronization Mode
  • 5. Journal Size

24

slide-25
SLIDE 25

Hardware Setup for experimentation

  • Experiments performed
  • n Samsung Galaxy

Nexus S on 32GB internal storage.

  • Controlled experimental

setup : Vary one parameter, while keeping all others constant.

25

slide-26
SLIDE 26

Workload

  • 1 trial = 3000 transactions (1000 inserts, followed by 1000 updates

and 1000 deletes)

  • Database prepopulated with 100K rows.
  • Results reported as throughput (transactions/sec)
  • Default Configuration :

○ DELETE journal mode ○ FULL synchronization mode ○ Ext4 filesystem in ordered mode.

26

slide-27
SLIDE 27

Outline

  • Overview of SQLite
  • Motivation
  • Existing tools to benchmark SQLite
  • Parameters affecting performance of SQLite

○ Filesystem ○ Journal Mode ○ Pre-population of the database ○ Synchronization mode ○ Journal Size

  • Conclusion

27

slide-28
SLIDE 28
  • 1. Filesystem
  • Application writes are transformed into block level operations by filesystem.

28

slide-29
SLIDE 29
  • 1. Filesystem

DELETE - Normal

29

slide-30
SLIDE 30
  • 1. Filesystem

DELETE - Normal

30

slide-31
SLIDE 31
  • 1. Filesystem

DELETE - Normal WAL - Normal

31

slide-32
SLIDE 32
  • 1. Filesystem

DELETE - Normal WAL - Normal

32

slide-33
SLIDE 33
  • 1. Filesystem

DELETE - Normal DELETE - FULL WAL - Normal

33

slide-34
SLIDE 34
  • 1. Filesystem
  • Depending on the parameters chosen, we can show either one performing

better.

  • F2fs paper evaluates only WAL mode : claims better performance than ext4.

DELETE - Normal DELETE - FULL WAL - Normal

34

slide-35
SLIDE 35

Outline

  • Overview of SQLite
  • Motivation
  • Existing tools to benchmark SQLite
  • Parameters affecting performance of SQLite

○ Filesystem ○ Journal Mode ○ Pre-population of the database ○ Synchronization mode ○ Journal Size

  • Conclusion

35

slide-36
SLIDE 36
  • 2. Journaling mode
  • Defines the type of SQLite journal used.

○ DELETE : Default mode ■ Uses traditional rollback journaling mechanism: contents

  • f the database is written on to the journal and the

changes are written to the database file directly.

36

slide-37
SLIDE 37

DELETE Journal mode revisited

Cache Journal Disk DB

37

Journal User Space Application

slide-38
SLIDE 38
  • 2. Journaling mode
  • Defines the type of SQLite journal used.

○ DELETE : Default mode ■ Uses traditional rollback journaling mechanism: contents

  • f the database is written on to the journal and the

changes are written to the database file directly. ○ WAL : ■ Write-ahead log, in which the changes to the database are written to the journal and is committed to the database when user explicitly triggers it.

38

slide-39
SLIDE 39

WAL journal mode

Cache Disk

39

User Space Application

slide-40
SLIDE 40

WAL journal mode

Tx : 1 Tx : 1

C O M M I T

WAL Cache Disk

40

User Space Application

slide-41
SLIDE 41

WAL journal mode

Cache Tx : 1

C O M M I T

WAL Disk

41

User Space Application

slide-42
SLIDE 42

WAL journal mode

User Space Application Tx : 2 Tx : 1

C O M M I T

Tx : 2

C O M M I T

WAL Cache Disk

42

slide-43
SLIDE 43

WAL journal mode - checkpointing

User Space Application Tx : 2 Tx : 1

C O M M I T

Tx : 2

C O M M I T

WAL Cache Disk

Checkpoint

43

slide-44
SLIDE 44
  • 2. Journaling mode

○ OFF: ■ No Rollback journal ■ Likely corruption on crash

44

slide-45
SLIDE 45
  • 2. Journaling mode
  • X-axis : Journaling mode
  • Y-axis : Results reported in

transactions/sec

45

slide-46
SLIDE 46
  • 2. Journaling mode
  • DELETE :

Max TPS of 30 achieved

46

slide-47
SLIDE 47
  • 2. Journaling mode
  • WAL :

Max TPS of 270 achieved

47

slide-48
SLIDE 48
  • 2. Journaling mode
  • WAL 10X better than

DELETE

  • Journal deleted after each

commit in DELETE mode.

  • For 1000 SQLite inserts,

○ WAL : 1000 fsync() ○ DELETE : 5000 fsync()

48

slide-49
SLIDE 49

Outline

  • Overview of SQLite
  • Motivation
  • Existing tools to benchmark SQLite
  • Parameters affecting performance of SQLite

○ Filesystem ○ Journal Mode ○ Pre-population of the database ○ Synchronization mode ○ Journal Size

  • Conclusion

49

slide-50
SLIDE 50
  • 3. Pre-population of database
  • Necessary to ensure realistic performance estimates.

50

slide-51
SLIDE 51
  • 3. Pre-population of database
  • Necessary to ensure realistic performance estimates.
  • Almost 2X performance

difference

  • Benchmarking tools don’t
  • prepopulate. Unrealistic

numbers.

51

slide-52
SLIDE 52

Outline

  • Overview of SQLite
  • Motivation
  • Existing tools to benchmark SQLite
  • Parameters affecting performance of SQLite

○ Filesystem ○ Journal Mode ○ Pre-population of the database ○ Synchronization mode ○ Journal Size

  • Conclusion

52

slide-53
SLIDE 53
  • 4. Synchronization Mode
  • Controls the frequency of fsync() issued by SQLite library.

○ FULL :

■ Writes to database(calls fsync()) on each commit.

53

slide-54
SLIDE 54

FULL Synchronization in WAL

Tx : 1 Tx : 1

C O M M I T C O M M I T

WAL Cache Disk

Checkpoint

54

User Space Application

slide-55
SLIDE 55
  • 4. Synchronization Mode
  • Controls the frequency of fsync() issued by SQLite library.

○ FULL :

■ Writes to database(calls fsync()) on each commit.

○ NORMAL:

■ Writes to log on each commit.

55

slide-56
SLIDE 56

NORMAL Synchronization in WAL

Cache Disk

Checkpoint

Tx : 1 Tx : 100 Tx : 1

C O M M I T

Tx : 100

C O M M I T S Y N C

56

User Space Application

slide-57
SLIDE 57
  • 4. Synchronization Mode
  • Controls the frequency of fsync() issued by SQLite library.

○ FULL :

■ Writes to database(calls fsync()) on each commit.

○ NORMAL:

■ Writes to log on each commit.

○ OFF:

■ Consistency mechanism left to the OS.

57

slide-58
SLIDE 58
  • 4. Synchronization Mode
  • X-axis : Synchronization

mode

  • Y-axis : Results reported in

transactions/sec

58

slide-59
SLIDE 59
  • 4. Synchronization Mode
  • FULL :

Max TPS : 30

59

slide-60
SLIDE 60
  • 4. Synchronization Mode
  • NORMAL :

Max TPS : 45

60

slide-61
SLIDE 61
  • 4. Synchronization Mode
  • NORMAL : 1.5X better than

FULL .

  • To strike balance between

durability and performance, use WAL+NORMAL

61

slide-62
SLIDE 62

Outline

  • Overview of SQLite
  • Motivation
  • Existing tools to benchmark SQLite
  • Parameters affecting performance of SQLite

○ Filesystem ○ Journal Mode ○ Pre-population of the database ○ Synchronization mode ○ Journal Size

  • Conclusion

62

slide-63
SLIDE 63
  • 5. Journal Size
  • In WAL mode, journal can grow unbounded
  • Potentially affects read performance.

63

slide-64
SLIDE 64
  • 5. Journal Size
  • In WAL mode, journal can grow unbounded
  • Potentially affects read performance.

64

slide-65
SLIDE 65
  • 5. Journal Size
  • In WAL mode, journal can grow unbounded
  • Potentially affects read performance.
  • Performance improves with

increase in journal size

65

slide-66
SLIDE 66
  • 5. Journal Size
  • In WAL mode, journal can grow unbounded
  • Potentially affects read performance.
  • Performance improves with

increase in journal size

  • When WAL is full - triggers

checkpoint.

  • Smaller WAL => more

checkpointing

66

slide-67
SLIDE 67
  • 5. Journal Size
  • In WAL mode, journal can grow unbounded
  • Potentially affects read performance.
  • Performance improves with

increase in journal size

  • When WAL is full - triggers

checkpoint.

  • Smaller WAL => more

checkpointing

  • Saturates beyond a point

67

slide-68
SLIDE 68

Outline

  • Overview of SQLite
  • Motivation
  • Existing tools to benchmark SQLite
  • Parameters affecting performance of SQLite
  • Conclusion

68

slide-69
SLIDE 69

Conclusion

  • The Systems community has discussed in the past, how tricky

benchmarking can be.

  • But in practice, we have shown that industrial benchmarking tools are

inconsistent, and academic reporting of results is incomplete.

  • Draw attention to:

○ Developers and researchers must understand the impact of various parameters on SQLite performance. ○ To ensure repeatable and comparable results, reporting configuration parameters is vital.

69

slide-70
SLIDE 70

THANK YOU.. Questions ?

Jayashree Mohan jaya@cs.utexas.edu

70

slide-71
SLIDE 71

BACKUP SLIDES

71

slide-72
SLIDE 72

Hardware Setup for experimentation

  • Experiments performed
  • n Samsung Galaxy

Nexus S

  • Controlled experimental

setup : Vary one parameter, while keeping all others constant.

CPU Dual Core 1.2GHz Cortex A9 Memory 32GB internal, 1GB RAM Android 6.0.1(cyanogenmod 13) Kernel 3.0.101 (F2FS enabled) Battery 3.7V, 1850mAh

72