Linux on Sun Logical Domains David S. Miller Red Hat Inc.



SLIDE 1

Background Userland Simulator Implementation Challenges/Futures Summary

Linux on Sun Logical Domains

David S. Miller

Red Hat Inc.

linux.conf.au, MEL8OURNE, 2008

David S. Miller Red Hat Inc. Linux on Sun LDOMs

SLIDE 2

Outline

1. Background
     • SUN4V and Niagara
     • Sun’s Logical Domains
2. Userland Simulator
3. Implementation
     • LDC: Logical Domain Channels
     • VIO: Virtual I/O
     • DS: Domain Services
     • VNET: Virtual Network
     • VDC: Virtual Disk Client
     • Console
4. Challenges/Futures

SLIDE 3

SUN4V and Niagara

Niagara: All Virtual, All the Time

  • The “V” in SUN4V stands for Virtualized.
  • Most of the hardware is accessible only to the hypervisor, even on a non-virtualized node.
  • The supervisor makes hypercalls using software traps.
  • The supervisor only sees real addresses.
  • I/O devices behind PCI, however, can be programmed directly.

SLIDE 4

SUN4V and Niagara

Niagara: 64-bit Sparc traps

  • Traps are vectored as an offset from the Trap Base Address Register.
  • Each trap slot is 8 instructions (32 bytes).
  • Extremely simple traps are handled inline.
  • More complicated work branches out to the rest of the handler.
  • “Very Important” traps are given multiple slots (e.g. TLB misses).
  • Half of the trap table is for hardware exceptions, half for SW traps.
  • SW traps are for system calls etc.
  • Special SW traps are used for hypercalls.
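
The slot arithmetic on this slide can be sketched in C. This is only an illustration, assuming the simple layout described above (slot index times 32 bytes from the trap base, with the upper half of the table used for software traps). The names `trap_slot_addr` and `is_sw_trap` and the 512-slot table size are assumptions made for the example, and trap-level details are ignored.

```c
#include <assert.h>
#include <stdint.h>

#define TRAP_SLOT_INSNS   8u     /* instructions per trap slot (from the slide) */
#define SPARC_INSN_BYTES  4u     /* every SPARC instruction is 4 bytes */
#define TRAP_SLOT_BYTES   (TRAP_SLOT_INSNS * SPARC_INSN_BYTES)   /* 32 */
#define TRAP_TABLE_SLOTS  512u   /* assumed table size for this sketch */

/* Address of the first instruction of a trap slot, relative to the trap base. */
uint64_t trap_slot_addr(uint64_t tba, unsigned int trap_type)
{
    assert(trap_type < TRAP_TABLE_SLOTS);
    return tba + (uint64_t)trap_type * TRAP_SLOT_BYTES;
}

/* Upper half of the table: software traps (system calls, hypercalls). */
int is_sw_trap(unsigned int trap_type)
{
    return trap_type >= TRAP_TABLE_SLOTS / 2;
}
```

With a 32-byte slot, a handler that needs more than 8 instructions has to branch out of the table, which is why the hot traps are given several consecutive slots.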

SLIDE 5

SUN4V and Niagara

Niagara: Hypercalls

  • Looks like a system call.
  • Arguments are passed in the outgoing argument registers (o0-o4).
  • The hypercall number is passed in o5.
  • Status is always returned in o0.
  • o1-o5 can provide other return value state.

    mov   cpuid, %o0
    mov   HV_FAST_CPU_STOP, %o5
    ta    HV_FAST_TRAP
    cmp   %o0, HV_EOK
    bne   cpu_stop_error
     nop
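
For readers more comfortable with C than SPARC assembly, the calling convention can be modeled in software. The sketch below is not the hypervisor's code: the register structure, the `sim_` names, and the numeric constants are assumptions chosen for illustration. It only shows the shape of the convention: arguments in %o0 onward, the call number in %o5, and status back in %o0.

```c
#include <assert.h>

/* Illustrative values; the real numbers come from the sun4v hypervisor API. */
#define HV_EOK            0UL
#define HV_EBADTRAP       7UL
#define HV_FAST_CPU_STOP  0x11UL

/* Outgoing argument registers %o0-%o5 as seen at the trap. */
struct hv_regs {
    unsigned long o[6];
};

/* Hypothetical record of the last stopped CPU, so the call has a visible effect. */
unsigned long sim_last_stopped_cpu = ~0UL;

/* Software stand-in for "ta HV_FAST_TRAP": dispatch on %o5, status in %o0. */
void sim_fast_trap(struct hv_regs *r)
{
    assert(r != 0);
    switch (r->o[5]) {
    case HV_FAST_CPU_STOP:
        sim_last_stopped_cpu = r->o[0];   /* argument arrived in %o0 */
        r->o[0] = HV_EOK;
        break;
    default:
        r->o[0] = HV_EBADTRAP;            /* unknown hypercall number */
        break;
    }
}

/* C-level wrapper mirroring the assembly sequence on the slide. */
unsigned long sim_cpu_stop(unsigned long cpuid)
{
    struct hv_regs r = { { cpuid, 0, 0, 0, 0, HV_FAST_CPU_STOP } };

    sim_fast_trap(&r);
    return r.o[0];
}
```

A caller compares the return value with `HV_EOK`, just as the assembly compares %o0 before branching to its error path.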

SLIDE 6

SUN4V and Niagara

Niagara: Fast Hypercalls

  • Dedicated SW trap vector.
  • No need to indicate the call in o5, so it is available for args.
  • Used for TLB load/flush and trap tracing.

    mov   vaddr, %o0
    mov   tlb_context, %o1
    mov   pte, %o2
    mov   HV_MMU_IMMU, %o3
    ta    HV_MMU_MAP_ADDR_TRAP
    cmp   %o0, HV_EOK
    bne   itlb_load_error
     nop

SLIDE 7

Sun’s Logical Domains

LDOM Node types

1. Control node: has full access to devices and primary console.
2. Service node: has access to some physical devices.
3. Guest node: has only virtualized devices.

SLIDE 8

Sun’s Logical Domains

MD: Machine Description

  • Complete logical description of the machine the node executes on.
  • Provided by the hypervisor as a compact data structure.
  • Stored on the ALOM/ILOM.
  • Dynamically updated.
  • The control node constructs MDs for service and guest nodes.

SLIDE 9

Sun’s Logical Domains

LDC: Logical Domain Channel

  • Communications link between nodes, via the hypervisor.
  • Bidirectional communications path: each end of the channel establishes a receive and a transmit queue.
  • Simple fixed-size, 64-byte packets.
  • An initial handshake establishes the protocol version and synchronizes the connection.
  • If the receive queue of either endpoint is unregistered, the channel is reset.

SLIDE 10

Sun’s Logical Domains

LDC: Packet format

1. type: indicates control, data, error
2. stype: indicates INFO, ACK, NACK
3. ctrl: indicates type of control packet
4. env: gives fragmentation state
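
The four header fields fit naturally into a C structure padded out to the fixed 64-byte packet size. The layout below is a sketch, not the wire format from the specification: only the field names `type`, `stype`, `ctrl`, and `env` come from the slide, while the numeric encodings, the `seqid` field, and the 56-byte payload split are assumptions for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define LDC_PACKET_SIZE   64u
#define LDC_PAYLOAD_SIZE  (LDC_PACKET_SIZE - 8u)

/* Illustrative encodings; only the field names come from the slide. */
enum { LDC_CTRL = 0x01, LDC_DATA = 0x02, LDC_ERR = 0x10 };     /* type  */
enum { LDC_INFO = 0x01, LDC_ACK = 0x02, LDC_NACK = 0x04 };     /* stype */
enum { LDC_LEN = 0x3f, LDC_START = 0x40, LDC_STOP = 0x80 };    /* env   */

struct ldc_packet {
    uint8_t  type;    /* control, data, or error */
    uint8_t  stype;   /* INFO, ACK, or NACK */
    uint8_t  ctrl;    /* kind of control packet, when type is LDC_CTRL */
    uint8_t  env;     /* fragmentation state plus payload length */
    uint32_t seqid;   /* assumed sequence number field */
    uint8_t  payload[LDC_PAYLOAD_SIZE];
};

static_assert(sizeof(struct ldc_packet) == LDC_PACKET_SIZE,
              "LDC packets are fixed at 64 bytes");

/* Build a single-fragment DATA+INFO packet; returns 0, or -1 if too long. */
int ldc_fill_data(struct ldc_packet *p, uint32_t seqid,
                  const void *buf, size_t len)
{
    if (len > LDC_PAYLOAD_SIZE)
        return -1;
    memset(p, 0, sizeof(*p));
    p->type  = LDC_DATA;
    p->stype = LDC_INFO;
    p->env   = (uint8_t)(LDC_START | LDC_STOP | len);  /* first and last fragment */
    p->seqid = seqid;
    memcpy(p->payload, buf, len);
    return 0;
}
```

A message longer than one payload would be split across several packets, with the start flag set only on the first and the stop flag only on the last.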

SLIDE 11

Sun’s Logical Domains

LDC: Map Table Entries

  • Allows memory transfers between nodes.
  • Similar to an MMU or IOMMU PTE.
  • Provides for transfer type protection.

1. COPY: read and write
2. IOMMU: read and write
3. MMU: exec, read, and write

LDC COPY operations have alignment restrictions.
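
One way to picture the per-transfer-type protection is as a set of permission bits checked before an access is allowed. The bit assignments, the structure, and the 8-byte alignment value in this sketch are all assumptions for illustration; the slide only states that the three transfer types exist and that COPY has alignment restrictions.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical permission bits, one per access kind named on the slide. */
enum ldc_mte_perm {
    MTE_COPY_R  = 1u << 0,
    MTE_COPY_W  = 1u << 1,
    MTE_IOMMU_R = 1u << 2,
    MTE_IOMMU_W = 1u << 3,
    MTE_MMU_R   = 1u << 4,
    MTE_MMU_W   = 1u << 5,
    MTE_MMU_X   = 1u << 6
};

#define LDC_COPY_ALIGN 8u   /* assumed alignment requirement for COPY */

struct ldc_mte {
    uint64_t real_addr;   /* exported page, as a real address */
    unsigned perms;       /* OR of enum ldc_mte_perm bits */
};

/* True when every requested permission bit is granted by the entry. */
int ldc_mte_allows(const struct ldc_mte *e, unsigned wanted)
{
    assert(e != 0);
    return (e->perms & wanted) == wanted;
}

/* COPY needs the permission plus an aligned offset and length. */
int ldc_copy_ok(const struct ldc_mte *e, unsigned dir,
                uint64_t offset, uint64_t len)
{
    if (!ldc_mte_allows(e, dir))
        return 0;
    return ((offset | len) & (LDC_COPY_ALIGN - 1)) == 0;
}
```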

SLIDE 12

Sun’s Logical Domains

VIO: Virtual I/O

  • I/O protocol built on top of channels.
  • Just like LDC, it has a handshake to synchronize, negotiate protocol versions, and negotiate I/O parameters.
  • Definitions exist for block, network, and console devices.

SLIDE 13

Sun’s Logical Domains

DS: Domain Services

  • Miscellaneous communications, again built on top of channels.
  • Remote reboot of guests.
  • CPU hotplug.
  • Machine description updates.
  • Setting persistent firmware variables such as the boot device.

SLIDE 14

Sun’s Logical Domains

LDC: Example System

SLIDE 15

Sun’s Logical Domains

LDC: Zooming In

SLIDE 16

Purpose

  • Userland is great for fast prototyping and debugging.
  • Userland “reboots” faster.
  • I had ethical issues with installing Solaris on my computers.
  • But I’m over that now...

SLIDE 17

Implementation

  • Software implementation of all LDC hypervisor calls.
  • Uses the same C interfaces as the kernel does.
  • The LDC protocol module can be compiled both in userland and in the kernel.
  • As a result, the VIO layer built on top can be just as flexible.
  • Problem: initially only compatible with itself.

SLIDE 18

TX Interfaces

    unsigned long sun4v_ldc_tx_qconf(unsigned long id,
                                     unsigned long ra,
                                     unsigned long num_entries);
    unsigned long sun4v_ldc_tx_qinfo(unsigned long id,
                                     unsigned long *ra,
                                     unsigned long *num_entries);
    unsigned long sun4v_ldc_tx_get_state(unsigned long id,
                                         unsigned long *head,
                                         unsigned long *tail,
                                         unsigned long *state);
    unsigned long sun4v_ldc_tx_set_qtail(unsigned long id,
                                         unsigned long tail);

SLIDE 19

RX Interfaces

    unsigned long sun4v_ldc_rx_qconf(unsigned long id,
                                     unsigned long ra,
                                     unsigned long num_entries);
    unsigned long sun4v_ldc_rx_qinfo(unsigned long id,
                                     unsigned long *ra,
                                     unsigned long *num_entries);
    unsigned long sun4v_ldc_rx_get_state(unsigned long id,
                                         unsigned long *head,
                                         unsigned long *tail,
                                         unsigned long *state);
    unsigned long sun4v_ldc_rx_set_qhead(unsigned long id,
                                         unsigned long head);
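
To make the simulator idea concrete, here is a rough sketch of how three of the TX calls could be backed by plain memory in userland while keeping the same C signatures. It is not the simulator's actual code. The status values, the `SIM_MAX_CHAN` limit, the power-of-two rule for the entry count, and the decision not to model link state are all assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>

#define HV_EOK          0UL      /* illustrative status values */
#define HV_EINVAL       6UL
#define LDC_PACKET_SIZE 64UL
#define SIM_MAX_CHAN    4UL      /* hypothetical simulator limit */

/* Per-channel TX queue state kept by the simulated hypervisor. */
struct sim_txq {
    unsigned long ra;           /* queue base (a plain address in userland) */
    unsigned long num_entries;  /* 0 means "not configured" */
    unsigned long head, tail;   /* byte offsets into the queue */
};

static struct sim_txq sim_tx[SIM_MAX_CHAN];

unsigned long sun4v_ldc_tx_qconf(unsigned long id, unsigned long ra,
                                 unsigned long num_entries)
{
    /* Assumed rule: entry count must be a power of two (0 unconfigures). */
    if (id >= SIM_MAX_CHAN || (num_entries & (num_entries - 1)))
        return HV_EINVAL;
    sim_tx[id] = (struct sim_txq){ ra, num_entries, 0, 0 };
    return HV_EOK;
}

unsigned long sun4v_ldc_tx_get_state(unsigned long id, unsigned long *head,
                                     unsigned long *tail, unsigned long *state)
{
    assert(head != NULL && tail != NULL && state != NULL);
    if (id >= SIM_MAX_CHAN || sim_tx[id].num_entries == 0)
        return HV_EINVAL;
    *head = sim_tx[id].head;
    *tail = sim_tx[id].tail;
    *state = 0;   /* link state is not modeled in this sketch */
    return HV_EOK;
}

unsigned long sun4v_ldc_tx_set_qtail(unsigned long id, unsigned long tail)
{
    if (id >= SIM_MAX_CHAN || sim_tx[id].num_entries == 0)
        return HV_EINVAL;
    /* The tail must land on a packet boundary inside the queue. */
    if (tail % LDC_PACKET_SIZE ||
        tail >= sim_tx[id].num_entries * LDC_PACKET_SIZE)
        return HV_EINVAL;
    sim_tx[id].tail = tail;
    return HV_EOK;
}
```

Because the signatures match the prototypes above, protocol code written against them does not need to know whether it is talking to a trap into the hypervisor or to a table in the same process.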

SLIDE 20

LDC: Logical Domain Channels

Client LDC Interfaces, Part 1

  • Clients work with an opaque “ldc channel” object.
  • Creation, destruction, and state management.

1. Allocate
2. Free
3. Bind
4. Connect
5. Disconnect
6. Get current state
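
The six operations suggest a small state machine behind the opaque object. The sketch below is a guess at that shape, not the kernel interface: the `chan_` names, the three states, and the choice that a disconnect falls back to the bound state are assumptions for illustration.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical states for the opaque channel object. */
enum chan_state { CHAN_INIT, CHAN_BOUND, CHAN_CONNECTED };

struct chan {
    unsigned long id;        /* channel ID */
    enum chan_state state;
};

struct chan *chan_alloc(unsigned long id)
{
    struct chan *c = malloc(sizeof(*c));

    if (c) {
        c->id = id;
        c->state = CHAN_INIT;
    }
    return c;
}

void chan_free(struct chan *c)
{
    free(c);
}

/* Each transition succeeds (0) only from the expected previous state. */
static int chan_move(struct chan *c, enum chan_state from, enum chan_state to)
{
    assert(c != NULL);
    if (c->state != from)
        return -1;
    c->state = to;
    return 0;
}

int chan_bind(struct chan *c)       { return chan_move(c, CHAN_INIT, CHAN_BOUND); }
int chan_connect(struct chan *c)    { return chan_move(c, CHAN_BOUND, CHAN_CONNECTED); }
int chan_disconnect(struct chan *c) { return chan_move(c, CHAN_CONNECTED, CHAN_BOUND); }

enum chan_state chan_state(const struct chan *c)
{
    return c->state;
}
```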

SLIDE 21

LDC: Logical Domain Channels

Client LDC Interfaces, Part 2

Data Transfer

1. Write
2. Read

Mapping Translation Management

1. Map SG, Map Single
2. Unmap
3. Copy
4. DRING Alloc and Free helpers (for VIO)

SLIDE 22

VIO: Virtual I/O

Virtual Device Layer

  • Tree of “struct vio_dev” nodes.
  • Dummy root, all virtual devices underneath.
  • Populated by a machine description notifier.

1. Notifier registration triggers MD add events.
2. All initial devices created.
3. Future hot-plug triggers MD add/remove.

Infrastructure closely mimics the powerpc VIO layer.

SLIDE 23

VIO: Virtual I/O

VIO Device Properties

Three properties in the MDESC node for a VIO device:

  • LDC channel ID
  • LDC RX interrupt
  • LDC TX interrupt

Device type specific properties:

1. Network MAC address, port type
2. Device Number, mainly for disks
3. Etc.

SLIDE 24

VIO: Virtual I/O

VIO Driver Helpers

  • Driver Init: validates config and sets up helper state
  • LDC Alloc: allocates LDC channel and records state
  • LDC Free: shuts down LDC channel and frees state (incl. DRINGs)
  • LDC Port Up: brings LDC port up, retrying periodically
  • Handshake Engine: runs handshake using driver callbacks
  • LDC Link State: bulk of link UP/DOWN work
  • LDC Send: looping LDC write retry with delay
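
The last helper, a looping write retry with delay, is easy to show in isolation. The following is a sketch of the pattern only, under the assumption that a full transmit queue is reported as `-EAGAIN`; the hook types, `send_with_retry`, and the fake channel used to exercise it are hypothetical names, not the driver's own.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical write hook: bytes written, or -EAGAIN when the TX queue is full. */
typedef int (*chan_write_fn)(void *chan, const void *buf, size_t len);
/* Hypothetical delay hook, e.g. a short sleep between attempts. */
typedef void (*delay_fn)(void);

/* Retry the write while the channel reports -EAGAIN, up to max_tries attempts. */
int send_with_retry(chan_write_fn write_fn, delay_fn delay, void *chan,
                    const void *buf, size_t len, int max_tries)
{
    int err = -EAGAIN;

    assert(write_fn != NULL && max_tries > 0);
    for (int tries = 0; tries < max_tries; tries++) {
        err = write_fn(chan, buf, len);
        if (err != -EAGAIN)
            break;              /* success or a hard error: stop retrying */
        if (delay)
            delay();
    }
    return err;
}

/* Fake channel for exercising the loop: busy for the first busy_left calls. */
struct fake_chan {
    int busy_left;
    int calls;
};

int fake_write(void *chan, const void *buf, size_t len)
{
    struct fake_chan *fc = chan;

    (void)buf;
    fc->calls++;
    if (fc->busy_left > 0) {
        fc->busy_left--;
        return -EAGAIN;
    }
    return (int)len;
}
```

Bounding the number of attempts keeps a dead peer from hanging the sender forever; the caller sees `-EAGAIN` and can decide whether to reset the link.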

SLIDE 25

VIO: Virtual I/O

VIO Driver Flow

1. vio_driver_init()
2. vio_ldc_alloc()
3. Allocate TX DRING and buffers if needed
4. Device UP: vio_port_up()
5. Port UP: Run handshake, obtain attributes
6. Send work on TX DRING using DATA+INFO
7. Process incoming TX DRING DATA+ACKs
8. Receive work on RX DRING as DATA+INFO
9. Send RX DRING work DATA+ACKs

SLIDE 26

DS: Domain Services

DS Basics

  • YAHS: Yet Another HandShake
  • Packet Classes: VER, REG, UNREG, DATA
  • Services:

1. md-update: Machine Description Update
2. domain-shutdown: Remote /sbin/shutdown
3. domain-panic: Remote panic()
4. dr-cpu: CPU Hot-Plug
5. pri: Physical Resource Inventory
6. var-config: Firmware variable handling
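
A table keyed by service name is one simple way to organize these services on the guest side. The sketch below is a simplification under stated assumptions: the service names and packet class names come from the slide, but the numeric class values, the `ds_find` and `ds_handle` functions, and the rule that DATA is only accepted after REG are illustrative, and the version handshake is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Packet classes named on the slide; the numeric values are illustrative. */
enum ds_class { DS_VER, DS_REG, DS_UNREG, DS_DATA };

struct ds_service {
    const char *name;   /* service identifier from the slide */
    int registered;     /* set once a REG exchange has completed */
};

struct ds_service ds_services[] = {
    { "md-update", 0 }, { "domain-shutdown", 0 }, { "domain-panic", 0 },
    { "dr-cpu", 0 },    { "pri", 0 },             { "var-config", 0 },
};

/* Look a service up by name; NULL if it is not in the table. */
struct ds_service *ds_find(const char *name)
{
    assert(name != NULL);
    for (size_t i = 0; i < sizeof(ds_services) / sizeof(ds_services[0]); i++)
        if (strcmp(ds_services[i].name, name) == 0)
            return &ds_services[i];
    return NULL;
}

/* Apply a REG or UNREG packet; DATA is accepted only for registered services. */
int ds_handle(enum ds_class cls, const char *name)
{
    struct ds_service *s = ds_find(name);

    if (!s)
        return -1;
    switch (cls) {
    case DS_REG:
        s->registered = 1;
        return 0;
    case DS_UNREG:
        s->registered = 0;
        return 0;
    case DS_DATA:
        return s->registered ? 0 : -1;
    default:
        return -1;   /* VER is part of the handshake, not modeled here */
    }
}
```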

SLIDE 27

DS: Domain Services

Salient DS Details

Two DS LDC channels:

1. Primary to Control Node
2. Backup to Service Processor

  • PRI arrives as an MDESC-like data block.
  • Firmware variables can be set and deleted.
  • Firmware variables are stored on the Service Processor with the MDESC.
  • DS work is processed in a kernel thread.

SLIDE 28

VNET: Virtual Network

VNET Attributes and Calls

  • Packet transfer mode
  • Remote MAC address
  • MTU
  • Multicast list upload

SLIDE 29

VNET: Virtual Network

VNET Switch

  • Control node implements a switch.
  • Switch connects to guests and the outside network.
  • Guests have links to the switch.
  • Guests may also have links to other guests.
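
Since a guest can hold both a link to the switch and direct links to other guests, a transmit path has to choose between them. One plausible policy, shown here purely as an assumption, is to use a direct link when its remote MAC matches the destination and fall back to the switch otherwise. The `vnet_port` structure and `vnet_pick_port` are hypothetical names for this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct vnet_port {
    uint8_t remote_mac[6];  /* peer MAC address learned for this link */
    int is_switch;          /* nonzero for the link to the control node's switch */
};

/* Prefer a direct guest link whose MAC matches; otherwise use the switch link. */
const struct vnet_port *vnet_pick_port(const struct vnet_port *ports, size_t n,
                                       const uint8_t dst_mac[6])
{
    const struct vnet_port *sw = NULL;

    assert(ports != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (ports[i].is_switch)
            sw = &ports[i];
        else if (memcmp(ports[i].remote_mac, dst_mac, 6) == 0)
            return &ports[i];
    }
    return sw;   /* NULL if no switch link exists */
}
```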

SLIDE 30

VDC: Virtual Disk Client

VDC Attributes

  • Packet transfer mode
  • Block size
  • Maximum transfer size
  • Bitmask of supported operations

SLIDE 31

VDC: Virtual Disk Client

VDC Specific Calls

  • Block read and write
  • Flush (I/O barrier)
  • Get/set write cache enable
  • Get/set VTOC (disk label)
  • Get/set EFI (disk label)
  • Get/set geometry
  • SCSI command submission
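
The negotiated attributes and the call list meet when the client decides whether a request may be sent at all. The sketch below assumes one bit per operation in the bitmask (bit `op - 1`) and a maximum transfer size counted in blocks; the operation numbers, field names, and both helper functions are illustrative assumptions rather than the protocol's definitions.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative operation numbers for the calls listed on the slide. */
enum vd_op {
    VD_OP_BREAD = 1, VD_OP_BWRITE, VD_OP_FLUSH, VD_OP_GET_WCE, VD_OP_SET_WCE,
    VD_OP_GET_VTOC, VD_OP_SET_VTOC, VD_OP_GET_GEOM, VD_OP_SET_GEOM,
    VD_OP_SCSICMD, VD_OP_GET_EFI, VD_OP_SET_EFI
};

/* Attributes negotiated during the handshake (field names are assumptions). */
struct vdc_attrs {
    uint32_t block_size;      /* bytes per block */
    uint64_t max_xfer_size;   /* assumed to be in blocks */
    uint64_t operations;      /* bit (op - 1) set when op is supported */
};

int vdc_op_supported(const struct vdc_attrs *a, enum vd_op op)
{
    assert(op >= VD_OP_BREAD && op <= VD_OP_SET_EFI);
    return (int)((a->operations >> (op - 1)) & 1);
}

/* A request is acceptable if the op is advertised and the size fits. */
int vdc_request_ok(const struct vdc_attrs *a, enum vd_op op, uint64_t nblocks)
{
    if (!vdc_op_supported(a, op))
        return 0;
    return nblocks > 0 && nblocks <= a->max_xfer_size;
}
```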

SLIDE 32

Console

Console: Guest And Service Node Side

  • Nothing to do.
  • Use normal hypervisor console write/read.
  • LDC endpoint exists internal to the hypervisor.
  • Hypervisor sends/receives LDC packets.

SLIDE 33

Console

Console: Control Node Side

  • Implements VCC, the Virtual Console Concentrator.
  • Console accessed by telnetting to various ports.
  • One port per guest or service node.

SLIDE 34

Implementation Challenges

  • Handshake initiation
  • VIO sequence number handling
  • VIO disk label ownership (fixed now)
  • VIO variable-sized packet data structures

SLIDE 35

Things TODO...

  • Fault tolerance across a control node crash
  • Fill in missing VDC stuff (SCSI I/O, EFI, etc.)
  • Infrastructure for Linux as control node:

1. LDC channel usage in userland
2. Control node userland daemon
3. Configuration framework
4. VCC console server

SLIDE 36

Summary

  • LDOMs is a framework for full virtualization on Niagara systems.
  • Userland prototyping of support can help enormously.
  • Linux works as a full guest node.
  • VDC, VNET, and DS are implemented.
  • Specification and implementation are two different things.