The ODIn Lab @
Pocket Data
The Case for TPC-MOBILE
Oliver Kennedy, Jerry Ajay, Geoff Challen, Lukasz Ziarek http://odin.cse.buffalo.edu/research/astral
1
Pocket Data The Case for TPC-MOBILE Oliver Kennedy, Jerry Ajay, - - PowerPoint PPT Presentation
Pocket Data The Case for TPC-MOBILE Oliver Kennedy, Jerry Ajay, Geoff Challen, Lukasz Ziarek http://odin.cse.buffalo.edu/research/astral 1 The ODIn Lab @ Big Data! 2 The ODIn Lab @ Big Data! 2 The ODIn Lab @ Big Data! GB, TB or PB of
The ODIn Lab @
Oliver Kennedy, Jerry Ajay, Geoff Challen, Lukasz Ziarek http://odin.cse.buffalo.edu/research/astral
1
The ODIn Lab @
2
The ODIn Lab @
2
The ODIn Lab @
updates per second
computing together!
3
The ODIn Lab @
4
The ODIn Lab @
TPC-C TPC-H YCSB SSB TPC-DS TPC-DI TPC-E
4
The ODIn Lab @
TPC-C TPC-H YCSB SSB TPC-DS TPC-DI TPC-E What about other types of databases?
4
The ODIn Lab @
The average smartphone processes almost 180 thousand queries per day That’s about 2 queries per second
5
The ODIn Lab @
6
The ODIn Lab @
6
The ODIn Lab @
6
The ODIn Lab @
more than just a DB.
6
The ODIn Lab @
more than just a DB.
under ideal circumstances.
6
The ODIn Lab @
more than just a DB.
under ideal circumstances.
that your phone is running some queries right now!
6
The ODIn Lab @
more than just a DB.
under ideal circumstances.
that your phone is running some queries right now!
6
The ODIn Lab @
We need to better understand pocket-scale data
7
The ODIn Lab @
specific to one client “app”.
DB is backed to one file.
library is available by default in nearly all major OSes.
Relaxed SQL Syntax, One Big Lock (file)
8
Client Application Android OS SQLite Library
The ODIn Lab @
9
How do developers and users use Pocket Scale Data?
The ODIn Lab @
A Smartphone Platform Testbed
10
The ODIn Lab @
A Smartphone Platform Testbed
11
https://phone-lab.org/experiment/request
The ODIn Lab @
12
The ODIn Lab @
12
The ODIn Lab @
13
1 10 100 1000 10000 100000 1x106 1x107 1x108 1 2 3 4 Maximum Nesting Depth 0.1 1 10 100 1000 10000 100000 1x106 1x107 1x108 1 2 3 4 5 6 7 8 Number of SELECT Queries Number of Tables Accessed
The ODIn Lab @
13
1 10 100 1000 10000 100000 1x106 1x107 1x108 1 2 3 4 Maximum Nesting Depth 0.1 1 10 100 1000 10000 100000 1x106 1x107 1x108 1 2 3 4 5 6 7 8 Number of SELECT Queries Number of Tables Accessed
30 million simple “SPA” queries
The ODIn Lab @
13
1 10 100 1000 10000 100000 1x106 1x107 1x108 1 2 3 4 Maximum Nesting Depth 0.1 1 10 100 1000 10000 100000 1x106 1x107 1x108 1 2 3 4 5 6 7 8 Number of SELECT Queries Number of Tables Accessed
30 million simple “SPA” queries Infrequent, but extremely complex queries
The ODIn Lab @
14
(by app)
CDF % of SELECT Queries That Are Key-Value Queries 0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0 0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
The ODIn Lab @
14
(by app)
CDF % of SELECT Queries That Are Key-Value Queries 0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0 0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
24 / 179 apps using SQLite
The ODIn Lab @
15
INSERT OR REPLACE INTO properties(property_key,property_value) VALUES (?,?); SELECT property_value FROM properties WHERE property_key=?;
(These are actual real queries from the trace)
The ODIn Lab @
100 101 102 103 104 Returned Row Count 0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0 CDF
16
The ODIn Lab @
100 101 102 103 104 Returned Row Count 0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0 CDF
16
80% of SELECTs return one row
The ODIn Lab @
100 101 102 103 104 Returned Row Count 0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0 CDF
16
80% of SELECTs return one row Small % of SELECTs return 100s of rows
The ODIn Lab @
100 101 102 103 104 Returned Row Count 0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0 CDF
17
(by app)
The ODIn Lab @
100 101 102 103 104 Returned Row Count 0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0 CDF
17
(by app) Google+, GMail, Facebook, Contacts
The ODIn Lab @
18
The ODIn Lab @
19
Image courtesy of http://openclipart.org
Object-Relational Mapper
The ODIn Lab @
19
Image courtesy of http://openclipart.org
Object-Relational Mapper pers = Persons.get(10) name = pers.firstName()
The ODIn Lab @
19
Image courtesy of http://openclipart.org
Object-Relational Mapper pers = Persons.get(10) name = pers.firstName() SELECT first_name FROM Persons WHERE id = 10;
The ODIn Lab @
19
Image courtesy of http://openclipart.org
Object-Relational Mapper pers = Persons.get(10) name = pers.firstName() SELECT first_name FROM Persons WHERE id = 10; SQL DB used for persisting objects
The ODIn Lab @
20
Object-Relational Mapper
Image courtesy of http://openclipart.org
pers = Persons.get(10)
name = org.name()
The ODIn Lab @
20
Object-Relational Mapper
Image courtesy of http://openclipart.org
pers = Persons.get(10)
name = org.name() SELECT employer_id FROM Persons WHERE id = 10; SELECT name FROM Organizations WHERE id = ?;
The ODIn Lab @
20
Object-Relational Mapper
Image courtesy of http://openclipart.org
pers = Persons.get(10)
name = org.name() SELECT employer_id FROM Persons WHERE id = 10; SELECT name FROM Organizations WHERE id = ?; ORMs are not always efficient
The ODIn Lab @
21
Object-Relational Mapper
Image courtesy of http://openclipart.org
pers = Persons.get(10) pers.setSalary( pers.salary() * 1.1 )
The ODIn Lab @
21
Object-Relational Mapper
Image courtesy of http://openclipart.org
pers = Persons.get(10) pers.setSalary( pers.salary() * 1.1 ) SELECT salary FROM Persons WHERE id = 10; UPDATE Persons SET salary = ? WHERE id = 10;
The ODIn Lab @
21
Object-Relational Mapper
Image courtesy of http://openclipart.org
pers = Persons.get(10) pers.setSalary( pers.salary() * 1.1 ) SELECT salary FROM Persons WHERE id = 10; UPDATE Persons SET salary = ? WHERE id = 10; We saw NO update value computations in SQL
The ODIn Lab @
22
Object-Relational Mapper
Image courtesy of http://openclipart.org
pers = Persons.get(10) pers.setSalary( pers.salary() * 1.1 ) SELECT salary FROM Persons WHERE id = 10; INSERT OR REPLACE INTO Persons(id, salary) VALUES (?, 10);
The ODIn Lab @
22
Object-Relational Mapper
Image courtesy of http://openclipart.org
pers = Persons.get(10) pers.setSalary( pers.salary() * 1.1 ) SELECT salary FROM Persons WHERE id = 10; INSERT OR REPLACE INTO Persons(id, salary) VALUES (?, 10); Insert or Replace used very frequently
The ODIn Lab @
23
The ODIn Lab @
24
Function Call Sites GROUP_CONCAT 583,474 SUM 321,387 MAX 314,970 COUNT 173,031 MIN 19,566 AVG 15
The ODIn Lab @
24
Function Call Sites GROUP_CONCAT 583,474 SUM 321,387 MAX 314,970 COUNT 173,031 MIN 19,566 AVG 15
Aggregates most common function type
The ODIn Lab @
25
Function Call Sites GROUP_CONCAT 583,474 SUM 321,387 MAX 314,970 COUNT 173,031 MIN 19,566 AVG 15
Concatenate all strings in a column: Non-algebraic
The ODIn Lab @
26
The ODIn Lab @
26
The ODIn Lab @
26
The ODIn Lab @
26
The ODIn Lab @
27
The ODIn Lab @
28
CDF Read/Write Ratio (100% = All Reads) 0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0 0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
The ODIn Lab @
28
CDF Read/Write Ratio (100% = All Reads) 0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0 0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
~15% of apps write more frequently than they read
The ODIn Lab @
28
CDF Read/Write Ratio (100% = All Reads) 0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0 0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
~15% of apps write more frequently than they read ~15% of apps do not perform a single write!
The ODIn Lab @
29
The ODIn Lab @
29
The ODIn Lab @
29
The ODIn Lab @
30
The ODIn Lab @
31
100 101 102 103 104 105 106 107 108 Previous Query Arrival Time (µs) 0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0 CDF
The ODIn Lab @
31
100 101 102 103 104 105 106 107 108 Previous Query Arrival Time (µs) 0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0 CDF
15-20% of queries arrive ~10ms after last query
The ODIn Lab @
32
100 101 102 103 104 105 106 107 108 Next Query Arrival Time (µs) 0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0 CDF
UPDATE SELECT INSERT DELETE
100 101 102 103 104 105 106 107 108 Previous Query Arrival Time (µs) 0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0 CDF
UPDATE SELECT INSERT DELETE
The ODIn Lab @
32
100 101 102 103 104 105 106 107 108 Next Query Arrival Time (µs) 0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0 CDF
UPDATE SELECT INSERT DELETE
100 101 102 103 104 105 106 107 108 Previous Query Arrival Time (µs) 0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0 CDF
UPDATE SELECT INSERT DELETE
70% of inserts come less than 0.1 ms before another query
The ODIn Lab @
32
100 101 102 103 104 105 106 107 108 Next Query Arrival Time (µs) 0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0 CDF
UPDATE SELECT INSERT DELETE
100 101 102 103 104 105 106 107 108 Previous Query Arrival Time (µs) 0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0 CDF
UPDATE SELECT INSERT DELETE
70% of inserts come less than 0.1 ms before another query Most sequences consist of INSERTs and SELECTs
The ODIn Lab @
33
100 101 102 103 104 105 106 107 108 Previous Query Arrival Time (µs) 0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0 CDF
Google Play services Media Storage Gmail Google+ Facebook Hangouts Android System Messenger Calendar Storage Contacts Storage
The ODIn Lab @
33
100 101 102 103 104 105 106 107 108 Previous Query Arrival Time (µs) 0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0 CDF
Google Play services Media Storage Gmail Google+ Facebook Hangouts Android System Messenger Calendar Storage Contacts Storage
Google Play Services and Media Storage are VERY bursty
The ODIn Lab @
33
100 101 102 103 104 105 106 107 108 Previous Query Arrival Time (µs) 0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0 CDF
Google Play services Media Storage Gmail Google+ Facebook Hangouts Android System Messenger Calendar Storage Contacts Storage
Google Play Services and Media Storage are VERY bursty Same 10 ms periodicity evident across all apps
The ODIn Lab @
before we can optimize for them.
34
We need a TPC-MOBILE for pocket-scale data!