SLIDE 1

More Than Storage

Margo Seltzer
Canada 150 Research Chair in Computer Systems
University of British Columbia

May 2019

SLIDE 2

How do you build a mechanical computing device?

SLIDE 3

Computer Systems

Hardware, Software, and Programming

SLIDE 4

Computer Systems

Hardware, Software

SLIDE 5

Computer Systems

Databases, Hardware, Programming Languages, Operating Systems

SLIDE 6

Computer Systems

Databases, Programming Languages, Operating Systems, HPC, Architecture

SLIDE 7

Computer Systems

Databases, Programming Languages, HPC, Architecture, Operating Systems, Distributed Computing, Networking

SLIDE 8

Computer Systems

Programming Languages, HPC, Architecture, Operating Systems, Distributed Computing, Networking, Security, Databases

SLIDE 9

Computer Systems

Programming Languages, HPC, Architecture, Operating Systems, Distributed Computing, Networking, Security, Storage, Databases

SLIDE 10

Computer Systems

HPC, Architecture, Operating Systems, Distributed Computing, Networking, Security, Storage, Databases, Scientific Computing, Programming Languages

SLIDE 11

Computer Systems

HPC, Architecture, Operating Systems, Distributed Computing, Networking, Security, Storage, Databases, Scientific Computing, Programming Languages, Networked Systems

SLIDE 12

Computer Systems

HPC, Architecture, Operating Systems, Distributed Computing, Networking, Security, Storage, Databases, Scientific Computing, Programming Languages, Networked Systems, Mobile

SLIDE 13

Computer Systems

Architecture, Operating Systems, Distributed Computing, Networking, Security, Storage, Databases, Scientific Computing, Programming Languages, Networked Systems, Mobile, Embedded Systems, IoT, HPC

SLIDE 14

Computer Systems

Architecture, Security, Networked Systems, HPC, VLSI, Operating Systems, Programming Languages, Storage, Databases, Scientific Computing, Distributed Computing, Networking, Embedded Systems, IoT, Mobile

SLIDE 16

Storage?

Architecture, Security, Networked Systems, HPC, VLSI, Operating Systems, Programming Languages, Storage, Databases, Scientific Computing, Distributed Computing, Networking, Embedded Systems, IoT, Mobile

Databases, Distributed Computing, Networked Systems, Scientific Computing, Architecture, HPC, Security

SLIDE 17

Three Storage Vignettes

  • Runtime Provenance Applications
  • Adapting Existing Solutions
  • Building Custom Solutions

SLIDE 18

Runtime Provenance Applications

Thomas Pasquier, Michael (Xueyuan) Han, Thomas Moyer, Olivier Hermant, Jean Bacon, David Eyers, Adam Bates

SLIDE 19

Runtime Provenance Applications

Architecture, Security, Networked Systems, HPC, VLSI, Operating Systems, Programming Languages, Storage, Databases, Scientific Computing, Distributed Computing, Networking, Embedded Systems, IoT, Mobile

SLIDE 20

Provenance 101

SLIDE 21

Provenance 101

Provenance

  • < 1662: Simon de Vos, Antwerp (possibly)
  • by 1662: Guilliam I Forchoudt, Antwerp (possibly)
  • to 1747: Jacques de Roore, The Hague
  • 1747 - 1771: Anthonis de Groot and Stephanus de Groot, The Hague
  • 1771 - ?: Abelsz
  • to 1779: Jacques Clemens
  • to 1798: Supertini and Platina, Brussels
  • to 1814: Pauwels, Brussels
  • to 1822: Robert Saint-Victor, Paris
  • 1822 - ?: Roux
  • to 1924: Marquise d'Aoust, France
  • 1924: Galerie Georges Petit, Paris
  • to 1940: Federico Gentili di Giuseppe, died 1940, Paris
  • 1940 - 1950: Mrs. A. Salem, Boston (Mr. Gentili di Giuseppe's daughter)
  • 1950 - 1954: Frederick Mont and Newhouse Galleries, New York
  • 1954 - 1961: Samuel H. Kress Foundation, New York
  • 12/09/1961: Seattle Art Museum

SLIDE 22

From Art to Computer Science

Provenance-Aware Storage Systems

Figure labels: Kernel; Task struct; Inode cache; sort; input=sort; a; input=a; b; To file system

Attributes shown: argv="sort a", name="sort", modules="pasta…", kernel="Linux…", env="USER…"

SLIDE 23

The Big Idea in one Slide

Figure labels: Provenance Capture; Query/Analyze Provenance

SLIDE 24

Digital Provenance

Figure labels: My Dataset; Derivation operation; Child dataset1; Subset operation; Child dataset2; Raw data collected from system; Program; Program execution

SLIDE 25

Digital Provenance

Figure labels: My Dataset; Derivation operation; Child dataset1; Subset operation; Child dataset2; Raw data collected from system; Program; Program execution

Agent

SLIDE 26

Digital Provenance

Figure labels: My Dataset; Derivation operation; Child dataset1; Subset operation; Child dataset2; Raw data collected from system; Program; Program execution

Agent, Activities

SLIDE 27

Digital Provenance

Figure labels: My Dataset; Derivation operation; Child dataset1; Subset operation; Child dataset2; Raw data collected from system; Program; Program execution

Agent, Activities, Entities

SLIDE 28

Digital Provenance

Figure labels: My Dataset; Derivation operation; Child dataset1; Subset operation; Child dataset2; Raw data collected from system; Program; Program execution

Relationships
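
The agent, activity, entity, and relationship vocabulary on these slides can be made concrete with a small graph sketch. This is a minimal illustration and not code from the talk: the type names, the fixed-size arrays, and the relation names (borrowed from the W3C PROV vocabulary, which the slides do not mention) are all assumptions. Edges point from a node to what it depends on, so the graph is acyclic and a simple recursive walk answers "was this dataset derived from that one?".

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical node kinds mirroring the slide labels. */
enum node_kind { ENTITY, ACTIVITY, AGENT };

/* Hypothetical relation names (taken from W3C PROV, an assumption here). */
enum relation { USED, WAS_GENERATED_BY, WAS_DERIVED_FROM, WAS_ASSOCIATED_WITH };

struct prov_node { enum node_kind kind; const char *name; };
struct prov_edge { enum relation rel; size_t from; size_t to; };

#define MAX_NODES 16
#define MAX_EDGES 32

struct prov_graph {
    struct prov_node nodes[MAX_NODES];
    struct prov_edge edges[MAX_EDGES];
    size_t n_nodes;
    size_t n_edges;
};

/* Adds a node and returns its index. */
size_t add_node(struct prov_graph *g, enum node_kind kind, const char *name)
{
    assert(g->n_nodes < MAX_NODES);
    g->nodes[g->n_nodes].kind = kind;
    g->nodes[g->n_nodes].name = name;
    return g->n_nodes++;
}

/* Adds an edge from a node to something it depends on. */
void add_edge(struct prov_graph *g, enum relation rel, size_t from, size_t to)
{
    assert(g->n_edges < MAX_EDGES);
    g->edges[g->n_edges++] = (struct prov_edge){ rel, from, to };
}

/* Returns 1 if `ancestor` is reachable from `node` by following edges.
 * Terminates because dependency edges only point to older nodes. */
int depends_on(const struct prov_graph *g, size_t node, size_t ancestor)
{
    if (node == ancestor)
        return 1;
    for (size_t i = 0; i < g->n_edges; i++)
        if (g->edges[i].from == node &&
            depends_on(g, g->edges[i].to, ancestor))
            return 1;
    return 0;
}
```

For example, adding "Raw data" (entity), "Program execution" (activity), and "My Dataset" (entity), then linking the execution to the raw data with USED and the dataset to the execution with WAS_GENERATED_BY, makes depends_on report that the dataset depends on the raw data but not the reverse.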

SLIDE 29

All Kinds of Provenance

Provenance

SLIDE 30

CamQuery Architecture

Figure labels: OS; Provenance Capture; CamFlow; LSM; NetFilter; Delay; Query/Analyze Provenance

SLIDE 33

Architectural Implications

Conventional Provenance Applications | CamQuery Provenance Applications
Static Graph Analysis | Streaming Graph Analysis
Detection | Prevention
F(Graph) | F(Function)
Immutable | Mutable

SLIDE 34

Sample Applications

Information Flow

SLIDE 35

Writing an Application

#define KERNEL_QUERY
#include "include/camquery.h"

static label_t confidential;

static void init(void)
{
	/* Look up the label used to mark confidential data. */
	confidential = get_label("confidential");
}

/* Outgoing edge: copy the label from a labelled node onto the edge. */
static int out_edge(union prov_msg *node, union prov_msg *edge)
{
	switch (edge_type(edge)) {
	case RL_WRITE:
	case RL_READ:
	case RL_SND:
	case RL_RCV:
	case RL_VERSION:
	case RL_VERSION_PROCESS:
	case RL_CLONE:
		if (has_label(node, confidential))
			add_label(edge, confidential);
		break;
	}
	return 0;
}

/* Incoming edge: copy the label from a labelled edge onto the node. */
static int in_edge(union prov_msg *edge, union prov_msg *node)
{
	if (has_label(edge, confidential)) {
		add_label(node, confidential);
		/* Raise a warning when confidential data reaches a socket. */
		if (node_type(node) == ENT_INODE_SOCKET)
			return PROV_RAISE_WARNING;
	}
	return 0;
}

QUERY_NAME("Propagate labels");
QUERY_DESCRIPTION("Example query");
QUERY_AUTHOR("Not me.");
QUERY_VERSION("0.1");
QUERY_LICENSE("GPL");
register_query(init, in_edge, out_edge);

SLIDE 40

CamQuery Implementation

LSM, NetFilter

Captures all information flow

Figure labels: Shared State; Process; Process consumes shared state

  • CamFlow information flow patch. https://github.com/CamFlow/information-flow-patch
  • Laurent Georget, Mathieu Jaume, Guillaume Piolle, Frédéric Tronel, and Valérie Viet Triem Tong. 2017. Information Flow Tracking for Linux Handling Concurrent System Calls and Shared Memory. In International Conference on Software Engineering and Formal Methods. Springer, 1–16.
  • Laurent Georget, Mathieu Jaume, Frédéric Tronel, Guillaume Piolle, and Valérie Viet Triem Tong. 2017. Verifying the reliability of operating system-level information flow control systems in Linux. In International Workshop on Formal Methods in Software Engineering (FormaliSE’17). IEEE/ACM, 10–16.

SLIDE 41

CamQuery Implementation

LSM, NetFilter

Captures all information flow

Figure labels: Shared State; Process; Process transmits to shared state

SLIDE 42

Performance

Thomas Pasquier, Xueyuan Han, Thomas Moyer, Adam Bates, Olivier Hermant, David Eyers, Jean Bacon, and Margo Seltzer. 2018. Runtime Analysis of Whole-System Provenance. In Proceedings of the 2018 ACM SIGSAC Conference on Computer and Communications Security (CCS '18). ACM, New York, NY, USA, 1601-1616. DOI: https://doi.org/10.1145/3243734.3243776

Figure panels: Syscall slowdown relative to plain Linux kernel; Macrobenchmark performance

SLIDE 43

CamQuery Wrap Up

Figure labels: OS; CamFlow; LSM; NetFilter; Query/Analyze Provenance

SLIDE 44

Fun with Non-Volatile Memory

  • Adapting Existing Solutions
  • Building Custom Solutions

SLIDE 45

Safe Harbor Statement

The following is intended to provide some insight into a line of research in Oracle Labs. It is intended for information purposes only, and may not be incorporated into any contract. It is not a commitment to deliver any material, code, or functionality, and should not be relied upon in making purchasing decisions. The development, release, and timing of any features or functionality described in connection with any Oracle product or service remains at the sole discretion of Oracle. Any views expressed in this presentation are my own and do not necessarily reflect the views of Oracle.

SLIDE 46

Fun with Non-Volatile Memory

Architecture, Security, Networked Systems, HPC, VLSI, Operating Systems, Programming Languages, Storage, Databases, Scientific Computing, Distributed Computing, Networking, Embedded Systems, IoT, Mobile

SLIDE 47

NVM Characteristics

SLIDE 48

NVM 101

Figure labels: I/O Bus; Memory Bus

SLIDE 49

  • ReRAM
  • 3D XPoint
  • STT-RAM
  • PCM
  • Carbon Nanotubes

https://www.embedded.com/design/real-time-and-performance/4026000/The-future-of-scalable-STT-RAM-as-a-universal-embedded-memory
https://en.wikipedia.org/wiki/Phase-change_memory
http://nantero.com/technology/

SLIDE 50

What Happens When You Try to Make an In-Memory KV Store Persistent?

  1. Data structure modification is contagious
  2. Failure-atomic transactions are crucial
  3. Persistent and nonpersistent objects interact in unexpected ways
  4. Concurrency is hard

Persistent Memcached: Bringing Legacy Code to Byte-Addressable Persistent Memory. Marathe, V., Seltzer, M., Byan, S., Harris, T. Proceedings of the USENIX Workshop on Hot Topics in Storage and File Systems, 2017.

SLIDE 51

Lesson I: Modifications are contagious

Central Hash Table

SLIDE 52

Lesson I: Modifications are contagious

Central Hash Table, LRU Cache

SLIDE 53

Lesson I: Modifications are contagious

Central Hash Table, LRU Cache, Slab Allocator

SLIDE 54

Lesson I: Modifications are contagious

Central Hash Table, LRU Cache, Slab Allocator, Client Request Mgmt State Machine

SLIDE 55

Lesson I: Modifications are contagious

Central Hash Table, LRU Cache, Slab Allocator, Client Request Mgmt State Machine, Background Maintenance Threads

SLIDE 56

Lesson I: Modifications are contagious

Tightly Coupled Subsystems

Central Hash Table, LRU Cache, Slab Allocator, Client Request Mgmt State Machine, Background Maintenance Threads

SLIDE 57

Lesson II: Transactions!

Central Hash Table, LRU Cache, Slab Allocator, Client Request Mgmt State Machine, Background Maintenance Threads

Transaction
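
To illustrate what "failure-atomic" means for an update that spans several subsystems, here is a minimal undo-log sketch. It is not the mechanism used in the Persistent Memcached work: the names (pregion, tx_write, recover) are hypothetical, an ordinary struct stands in for persistent memory, and the cache-line flushes and ordering fences that real NVM code needs are omitted.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define REGION_WORDS 8
#define UNDO_CAPACITY 8

/* One saved old value, so an interrupted update can be rolled back. */
struct undo_entry {
    size_t index;
    uint64_t old_value;
};

/* Stand-in for a persistent region (assumption: plain DRAM here). */
struct pregion {
    uint64_t words[REGION_WORDS];          /* the "persistent" data */
    struct undo_entry undo[UNDO_CAPACITY]; /* the "persistent" undo log */
    size_t undo_len;                       /* > 0: transaction in flight */
};

void tx_begin(struct pregion *r)
{
    r->undo_len = 0;
}

/* Record the old value first, then overwrite in place. */
void tx_write(struct pregion *r, size_t index, uint64_t value)
{
    assert(index < REGION_WORDS && r->undo_len < UNDO_CAPACITY);
    r->undo[r->undo_len].index = index;
    r->undo[r->undo_len].old_value = r->words[index];
    r->undo_len++;
    r->words[index] = value;
}

/* Committing discards the undo log; the new values become permanent. */
void tx_commit(struct pregion *r)
{
    r->undo_len = 0;
}

/* Run after a (simulated) crash: roll back any uncommitted writes,
 * newest first, so all of a transaction's writes vanish together. */
void recover(struct pregion *r)
{
    while (r->undo_len > 0) {
        struct undo_entry *e = &r->undo[--r->undo_len];
        r->words[e->index] = e->old_value;
    }
}
```

If two words are written inside a transaction and recover runs before tx_commit, both words return to their old values; after tx_commit, recover leaves both new values in place. That all-or-nothing behaviour is what keeps, say, a hash table entry and its LRU link from disagreeing after a crash.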

SLIDE 58

Lesson III: Persistent and nonpersistent objects interact in unexpected ways

Persistent Objects, Nonpersistent Objects

SLIDE 61

Lesson IV: Concurrency is Hard

Central Hash Table, LRU Cache, Slab Allocator, Client Request Mgmt State Machine, Background Maintenance Threads

SLIDE 62

BULLET: A Hybrid DRAM/NVM Hash Table

LRU Cache

Closing the Performance Gap Between Volatile and Persistent Key-Value Stores Using Cross-Referencing Logs. Huang, Y., Pavlovic, M., Marathe, V., Seltzer, M., Harris, T., Byan, S. Proceedings of the 2018 USENIX Annual Technical Conference, Boston MA, June 2018.

SLIDE 63

BULLET: A Hybrid DRAM/NVM Hash Table

LRU Cache

Front End, Back End

SLIDE 64

Challenges

Performance

SLIDE 65

Bullet Architecture

Figure labels: Frontend Cache; Backend Persistent Hash Table; Frontend threads (log writers); Backend threads (log gleaners); Cross-referencing logs; Volatile; Persistent

SLIDE 66

Cross-Referencing Logs

Log Record

len | klen | opcode | applied | epoch | prev | Key/value
26 | 2 | append | NO | 1 | NULL | K1/V1

K1/V1
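
The record fields above map naturally onto a C struct. The sketch below is an illustration under assumptions, not Bullet's actual layout: the field widths, the fixed-size key/value buffer, the meaning of len (taken here as key bytes plus value bytes), and the helper names are all hypothetical. The prev pointer is what makes the logs cross-referencing: it links a record to the previous record for the same key, which may live in a different log.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define KV_MAX 32

/* The slide shows only "append"; other opcodes are left out. */
enum log_opcode { OP_APPEND };

struct log_record {
    uint32_t len;            /* assumed: key bytes + value bytes */
    uint32_t klen;           /* key length in bytes */
    enum log_opcode opcode;
    int applied;             /* 0 until applied to the backend table */
    uint64_t epoch;
    struct log_record *prev; /* previous record for the same key, or NULL */
    char kv[KV_MAX];         /* key bytes followed by value bytes */
};

/* Fills in a record; returns 0 on success, -1 if key+value do not fit. */
int record_init(struct log_record *rec, enum log_opcode op, uint64_t epoch,
                const char *key, const char *value, struct log_record *prev)
{
    assert(rec != NULL && key != NULL && value != NULL);
    size_t klen = strlen(key);
    size_t vlen = strlen(value);
    if (klen + vlen > KV_MAX)
        return -1;
    rec->len = (uint32_t)(klen + vlen);
    rec->klen = (uint32_t)klen;
    rec->opcode = op;
    rec->applied = 0;
    rec->epoch = epoch;
    rec->prev = prev;
    memcpy(rec->kv, key, klen);
    memcpy(rec->kv + klen, value, vlen);
    return 0;
}

/* Returns 1 if the record's key matches `key`. */
int record_key_equals(const struct log_record *rec, const char *key)
{
    size_t klen = strlen(key);
    return rec->klen == klen && memcmp(rec->kv, key, klen) == 0;
}
```

Initialising a record for K1/V1 with prev set to NULL and then a record for K1/V2 with prev pointing at the first yields a two-element chain for key K1, mirroring the first and fourth records shown on the following slides.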

SLIDE 67

Cross-Referencing Logs

Log Record

len | klen | opcode | applied | epoch | prev | Key/value

K1/V1, K2/V1, K3/V1, K1/V2

SLIDE 68

Cross-Referencing Logs

Log Record

len | klen | opcode | applied | epoch | prev | Key/value

K1/V1, K2/V1, K3/V1, K1/V2, K3/V2, K3/V3, K2/V2, K1/V3

SLIDE 69

Cross-Referencing Logs

K1/V1, K2/V1, K3/V1, K1/V2, K3/V2, K3/V3, K2/V2, K1/V3

L1, L2, L3

Front end Hash table

K1/V*, K2/V*, K3/V*

SLIDE 70

Applying Log Records

K1/V1, K2/V1, K3/V1, K1/V2, K3/V2, K3/V3, K2/V2, K1/V3

L1, L2, L3

Front end Hash table

K1/V*, K2/V*, K3/V*

SLIDE 71

Applying Log Records

K1/V1, K2/V1, K3/V1, K1/V2, K3/V2, K3/V3, K2/V2, K1/V3

L1, L2, L3

Front end Hash table

K1/V1, K2/V*, K3/V*

SLIDE 72

Applying Log Records

K1/V1, K2/V1, K3/V1, K1/V2, K3/V2, K3/V3, K2/V2, K1/V3

L1, L2, L3

Front end Hash table

K1/V2, K2/V*, K3/V*

SLIDE 73

Applying Log Records

K1/V1, K2/V1, K3/V1, K1/V2, K3/V2, K3/V3, K2/V2, K1/V3

L1, L2, L3

Front end Hash table

K1/V3 K2/V* K3/V*
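
The K1 sequence on these slides (V1, then V2, then V3) can be sketched as a walk along the prev chain. This is a simplified single-threaded illustration and not Bullet's gleaner: the integer keys and versions, the backend array, and the function name apply_chain are assumptions, and epochs and concurrency are ignored. Starting from any record, the code follows prev back to the last applied record, then applies the pending records oldest first so the table never moves backwards.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_CHAIN 16
#define MAX_KEYS 4

/* Simplified record: integer key and version instead of key/value bytes. */
struct log_record {
    int key;                 /* e.g. 1 for K1 */
    int version;             /* e.g. 2 for V2 */
    int applied;             /* 0 until applied to the table */
    struct log_record *prev; /* previous record for the same key, any log */
};

/* Stand-in for the table that records are applied to. */
struct backend {
    int version[MAX_KEYS];   /* latest applied version per key */
};

/* Applies `rec` and every unapplied predecessor for the same key,
 * oldest first. Returns the number of records applied. */
int apply_chain(struct backend *b, struct log_record *rec)
{
    struct log_record *pending[MAX_CHAIN];
    int n = 0;

    /* Walk backwards until an already-applied record or the chain end. */
    for (struct log_record *r = rec; r != NULL && !r->applied; r = r->prev) {
        assert(n < MAX_CHAIN);
        assert(r->key >= 0 && r->key < MAX_KEYS);
        pending[n++] = r;
    }

    /* Replay in chronological order. */
    for (int i = n - 1; i >= 0; i--) {
        b->version[pending[i]->key] = pending[i]->version;
        pending[i]->applied = 1;
    }
    return n;
}
```

With three chained records for K1, calling apply_chain on the V2 record applies V1 and V2 in that order, a later call on the V3 record applies only V3, and a repeated call applies nothing.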

SLIDE 74

Bullet: Read Heavy

Figure labels: Frontend Cache; Backend Persistent Hash Table; Frontend threads (log writers); Backend threads (log gleaners); Cross-referencing logs; Volatile; Persistent

SLIDE 75

Bullet: Write Heavy

Figure labels: Frontend Cache; Backend Persistent Hash Table; Frontend threads (log writers); Backend threads (log gleaners); Cross-referencing logs; Volatile; Persistent

SLIDE 76

How Does it Perform?

Experimental Setup

  • 16 cores; 512 GB DRAM
  • Intel’s NVM emulator
  • Zipfian key distribution
  • 99th-percentile latency shown
  • Comparison: HiKV

HiKV: A Hybrid Index Key-Value Store for DRAM-NVM Memory Systems. Fei Xia, Institute of Computing Technology, Chinese Academy of Sciences; University of Chinese Academy of Sciences; Dejun Jiang, Jin Xiong, and Ninghui Sun, Institute of Computing Technology, Chinese Academy of Sciences. Proceedings of the 2017 USENIX Annual Technical Conference, Santa Clara CA, July 2017.

SLIDE 78

Adaptation

SLIDE 79

Closing the Performance Gap

Figure labels: Frontend Cache; Backend Persistent Hash Table; Frontend threads (log writers); Backend threads (log gleaners); Cross-referencing logs; Volatile; Persistent

SLIDE 80

Thank You!

I’m looking for postdocs and students!

Postdocs: email me at mseltzer@cs.ubc.ca
Graduate Students