Datacenter Operating Systems
CSE451
Simon Peter
With thanks to Timothy Roscoe (ETH Zurich)
Autumn 2015
This Lecture
What's a datacenter
Why datacenters
Types of datacenters
Hyperscale datacenters
Major problem: Server I/O
Hardware trend
Intel X520 10G NIC: 2 us / 1KB packet
Intel RS3 RAID, 1GB flash-backed cache: 25 us / 1KB write
Sandy Bridge CPU: 6 cores, 2.2 GHz
OS problem
[Figure: kernel network stack. Applications sit above the kernel; inside the kernel are the datagram and stream sockets, the TCP, UDP, ICMP and IP layers, the receive queue, and the network interface.]
Receiving a packet
1. Interrupt
1.1 Allocate mbuf
1.2 Enqueue packet
1.3 Post s/w interrupt
2. S/W Interrupt
High priority
IP processing
TCP processing
Enqueue on socket
3. Application
Access control
Copy mbuf to user space
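The three receive steps above can be sketched as a toy model. This is only an illustration under stated assumptions, not real kernel code: the class `ToyReceivePath`, the constant `HEADER_LEN`, and the idea that "protocol processing" means stripping a fixed-size header are all hypothetical stand-ins.

```python
from collections import deque

HEADER_LEN = 4  # hypothetical fixed header size standing in for IP + TCP headers


class ToyReceivePath:
    """Toy model of the receive path: interrupt, s/w interrupt, application."""

    def __init__(self):
        self.receive_queue = deque()  # filled by the hardware interrupt handler
        self.socket_queue = deque()   # filled by the software interrupt handler
        self.softirq_pending = False

    def interrupt(self, frame: bytes) -> None:
        mbuf = bytearray(frame)          # 1.1 allocate mbuf
        self.receive_queue.append(mbuf)  # 1.2 enqueue packet
        self.softirq_pending = True      # 1.3 post s/w interrupt

    def sw_interrupt(self) -> None:
        while self.receive_queue:
            mbuf = self.receive_queue.popleft()
            payload = mbuf[HEADER_LEN:]        # stand-in for IP and TCP processing
            self.socket_queue.append(payload)  # enqueue on socket
        self.softirq_pending = False

    def recv(self, allowed: bool = True) -> bytes:
        if not allowed:                        # access control
            raise PermissionError("socket access denied")
        mbuf = self.socket_queue.popleft()
        return bytes(mbuf)                     # copy mbuf to user space


def demo_receive() -> bytes:
    path = ToyReceivePath()
    path.interrupt(b"HDR!hello")
    path.sw_interrupt()
    return path.recv()
```

Even in this toy version the payload passes through two queues and one copy before the application sees it.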
Sending a packet
1. Application
Access control
Copy from user space to mbuf
Call TCP code and process
Possible enqueue on socket queue
2. S/W Interrupt
Remaining TCP processing
IP processing
Enqueue on NIC queue
3. Interrupt
Send packet
Free mbuf
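The send steps can be sketched the same way, this time tracking the mbuf lifetime: the buffer is allocated when the application's data is copied in, and freed only when the final interrupt reports the packet as sent. The class `ToySendPath` and the `b"TCP|"` and `b"IP|"` prefixes are hypothetical placeholders for real protocol headers.

```python
from collections import deque


class ToySendPath:
    """Toy model of the send path: application, s/w interrupt, interrupt."""

    def __init__(self):
        self.socket_queue = deque()  # segments waiting for remaining processing
        self.nic_queue = deque()     # packets waiting for the NIC
        self.live_mbufs = 0          # mbufs allocated and not yet freed
        self.wire = []               # packets that have left the machine

    def send(self, user_data: bytes, allowed: bool = True) -> None:
        if not allowed:                      # access control
            raise PermissionError("send not permitted")
        mbuf = bytes(user_data)              # copy from user space to mbuf
        self.live_mbufs += 1
        segment = b"TCP|" + mbuf             # call TCP code and process
        self.socket_queue.append(segment)    # possible enqueue on socket queue

    def sw_interrupt(self) -> None:
        while self.socket_queue:
            segment = self.socket_queue.popleft()  # remaining TCP processing
            packet = b"IP|" + segment              # IP processing
            self.nic_queue.append(packet)          # enqueue on NIC queue

    def interrupt(self) -> None:
        while self.nic_queue:
            self.wire.append(self.nic_queue.popleft())  # send packet
            self.live_mbufs -= 1                        # free mbuf
```

Calling `send`, then `sw_interrupt`, then `interrupt` moves one payload through all three stages and returns `live_mbufs` to zero.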
Redis on Linux: % of 1KB request time spent
          SET    GET
HW        13%    18%
Kernel    84%    62%
App        3%    20%
Kernel: API, multiplexing, naming, resource limits, access control, I/O scheduling, I/O processing, copying, protection
Hardware: 10G NIC 2 us / 1KB packet; RAID storage 25 us / 1KB write
Redis request latency: 9 us, 163 us
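A rough conversion of the percentage breakdown into microseconds shows where the time goes. The pairing of latencies with operations is an assumption here (9 us for GET, 163 us for SET); it is roughly consistent with the hardware figures, since the HW share then comes out near 2 us for GET and near 25 us for SET. The names `LINUX` and `breakdown_us` are hypothetical.

```python
# Assumption: 9 us is the GET latency and 163 us is the SET latency.
LINUX = {
    "GET": {"total_us": 9, "HW": 18, "Kernel": 62, "App": 20},
    "SET": {"total_us": 163, "HW": 13, "Kernel": 84, "App": 3},
}


def breakdown_us(op: str) -> dict:
    """Convert the percentage shares for one operation into microseconds."""
    row = LINUX[op]
    return {part: row["total_us"] * row[part] / 100
            for part in ("HW", "Kernel", "App")}


def report() -> None:
    for op in LINUX:
        parts = breakdown_us(op)
        line = ", ".join(f"{name} {us:.2f} us" for name, us in parts.items())
        print(f"{op}: {line}")
```

Under that assumption the kernel accounts for about 5.6 us of a GET and about 137 us of a SET, well above the time spent in the hardware itself.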
SR-IOV NIC
VNICs w/ own registers, queues, INTs
Devices use app virtual memory
Only allow eligible I/O: packet filters, rate limiters
User-level VNIC 1, User-level VNIC 2
Network
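"Only allow eligible I/O" can be illustrated with a small per-VNIC policy check. This is a sketch under assumptions: the slides name packet filters and rate limiters but not how they work, so the port-based filter and the token-bucket limiter below are swapped-in techniques, and `VnicPolicy` with its parameters is hypothetical. Real SR-IOV hardware enforces such rules in the NIC, not in Python.

```python
class VnicPolicy:
    """Hypothetical per-VNIC policy: a port filter plus a token-bucket limiter."""

    def __init__(self, allowed_ports, rate_bytes_per_s: float, burst_bytes: float):
        self.allowed_ports = set(allowed_ports)
        self.rate = rate_bytes_per_s
        self.burst = burst_bytes
        self.tokens = burst_bytes  # start with a full bucket
        self.last = 0.0            # time of the previous check, in seconds

    def eligible(self, dst_port: int, size: int, now: float) -> bool:
        # Refill the bucket for the elapsed time, capped at the burst size.
        self.tokens = min(self.burst, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if dst_port not in self.allowed_ports:  # packet filter
            return False
        if size > self.tokens:                  # rate limiter
            return False
        self.tokens -= size
        return True
```

For example, `VnicPolicy({6379}, rate_bytes_per_s=1000, burst_bytes=1500)` admits one 1024-byte packet to port 6379 immediately and rejects a second one until enough tokens have accumulated; the port number is only illustrative.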
[Figure: Redis, kernel and I/O devices. Kernel: naming, resource limits, access control. Redis and I/O devices: API, multiplexing, I/O scheduling, I/O processing, copying, protection.]
[Figure: the same layout with a library added. Control plane: kernel (naming, resource limits, access control). Data plane: Redis with library, and I/O devices (API, multiplexing, I/O scheduling, I/O processing, protection). Data path between Redis and the I/O devices.]
Virtual Storage Area
[Figure: files /tmp/lockfile, /var/lib/key_value.db, /etc/config.rc …; labels: Kernel VFS, emacs, Redis, fast HW ops, logical disk, indirect IPC interface.]
Benefits:
Redis: % of 1KB request time spent

GET       Arrakis (4 us)    Linux (9 us)
HW        33%               18%
libIO     35%               -
Kernel    -                 62%
App       32%               20%

SET       Arrakis (31 us)   Linux (ext4) (163 us)
HW        77%               13%
libIO     7%                -
Kernel    -                 84%
App       15%               3%
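The latency figures can be turned into speedups and into time spent outside the hardware. As a hedged sketch: the pairing of 9 us and 4 us with GET, and 163 us and 31 us with SET, is an assumption read off the chart layout, and the names `LATENCY_US`, `HW_SHARE`, `speedup` and `software_us` are hypothetical.

```python
# Assumption: (Linux, Arrakis) latencies are 9 us and 4 us for GET,
# 163 us and 31 us for SET.
LATENCY_US = {
    "GET": {"Linux": 9, "Arrakis": 4},
    "SET": {"Linux": 163, "Arrakis": 31},
}

# Hardware share of each request, in percent, from the same charts.
HW_SHARE = {
    "GET": {"Linux": 18, "Arrakis": 33},
    "SET": {"Linux": 13, "Arrakis": 77},
}


def speedup(op: str) -> float:
    """How many times faster the request completes on Arrakis than on Linux."""
    return LATENCY_US[op]["Linux"] / LATENCY_US[op]["Arrakis"]


def software_us(op: str, system: str) -> float:
    """Time per request spent outside the hardware (kernel or libIO, plus app)."""
    return LATENCY_US[op][system] * (100 - HW_SHARE[op][system]) / 100
```

Under these assumptions, `speedup("GET")` is 2.25 and `speedup("SET")` is about 5.3, and the software time for a SET falls from roughly 142 us on Linux to roughly 7 us on Arrakis.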
[Figure: throughput (k transactions/s, axis 200 to 1200) against number of CPU cores (1, 2, 4) for Linux and Arrakis, with the 10Gb/s interface limit marked. Annotated speedups: 1.8x, 2x, 3.1x.]
http://services.cs.utexas.edu/recruit/grad/frontmatter/announcement.html