Ganeti, the New and Arcane

ganeti's best kept secrets, and exciting new developments

Ganeti Eng Team - Google LinuxCon Japan 2014 - 2 Feb 2014

SLIDE 3

Introduction to Ganeti

A cluster virtualization manager, in one slide

SLIDE 4

What is Ganeti?

• Manage clusters of 1-200 physical machines, divided into nodegroups
• Deploy Xen/KVM/LXC virtual machines on them
• Controlled via command line, REST, and web interfaces
• Live migration
• Resiliency to failure (DRBD, Ceph, SAN/NAS, ...)
• Cluster balancing
• Ease of repairs and hardware swaps

SLIDE 5

Newest features

Development status

SLIDE 6

2.10

The very stable release

• Improved upgrade procedure: "gnt-cluster upgrade"
• CPU load in hail/hbal (GSoC project)
• Hotplug support (KVM)
• RBD storage direct access (KVM)
• Better Open vSwitch support (GSoC project)

SLIDE 7

2.11

The latest stable release

• Faster instance moves
• GlusterFS support
• hsqueeze (achieve maximum cluster compaction)

SLIDE 8

2.12 and future

The next stable release(s)

• Jobs as processes
• New install model
• More secure master candidates
• Better container support (GSoC)
• Resource reservation/extra parallelization
• Generic conversion between disk templates (GSoC)

SLIDE 9

Monitoring daemon

What's going on in your cluster?

SLIDE 10

Monitoring a cluster

The old school way

(Diagram: the monitoring system and other systems query the cluster's components (master node, instances, storage, NICs) directly.)

SLIDE 11

Monitoring a cluster

Using the monitoring daemon

(Diagram: the monitoring system and other systems talk to monitoring daemons running on the cluster, instead of querying the components directly.)

SLIDE 12

What is the monitoring daemon?

• Provides information about the cluster state/health
  • live
  • read-only
• Design doc: design-monitoring-agent.rst

SLIDE 13

More details

• HTTP daemon
  • Replying to REST-like queries (actually, GET only)
  • Providing JSON replies (easy to parse in any language; already used in all the rest of Ganeti)
• Running on every node (not only master candidates or VM-enabled nodes)
• Additionally: mon-collector, a quick 'n dirty CLI tool

SLIDE 14

Data collectors

Provide data to the daemon

• One collector, one report
• One collector, one category: storage, hypervisor, daemon, instance
• Two kinds: performance reporting, status reporting
• New feature: stateful data collectors

SLIDE 15

Data collectors

What data can be retrieved right now?

Now:
• instance status (Xen only) (category: instance)
• diskstats information (storage)
• LVM logical volumes information (storage)
• DRBD status information (storage)
• Node OS CPU load average (no category, default)

Soon(-ish):
• instance status for KVM (instance)
• Ganeti daemons status (daemon)
• Hypervisor resources (hypervisor)
• Node OS resources report (default)

SLIDE 16

The report format

{
  "name" : "TheCollectorIdentifier",
  "version" : "1.2",
  "format_version" : 1,
  "timestamp" : 1351607182000000000,
  "category" : null,
  "kind" : 0,
  "data" : { "plugin_specific_data" : "go_here" }
}

• name: the name of the plugin. Unique string.
• version: the version of the plugin. A string.
• format_version: the version of the data format of the plugin. Incremental integer.
• timestamp: when the report was produced. Nanoseconds. Can be zero-padded.
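As a minimal sketch, a consumer of these reports only needs to decode the JSON and check the fields above. The report text below reuses the example from this slide; the validation policy is this sketch's own, not part of Ganeti:

```python
import json

def parse_report(raw):
    """Decode one data-collector report and check its mandatory fields."""
    report = json.loads(raw)
    # name, version, format_version, timestamp and data appear in every report.
    for field in ("name", "version", "format_version", "timestamp", "data"):
        if field not in report:
            raise ValueError("missing field: %s" % field)
    # The timestamp is in nanoseconds; integer-divide to get seconds.
    seconds = report["timestamp"] // 10**9
    return report["name"], seconds, report["data"]

raw = """{ "name": "TheCollectorIdentifier", "version": "1.2",
           "format_version": 1, "timestamp": 1351607182000000000,
           "category": null, "kind": 0,
           "data": { "plugin_specific_data": "go_here" } }"""
name, seconds, data = parse_report(raw)
```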

16/53

SLIDE 17

Status reporting collectors: report

They introduce a mandatory part inside the data section:

"data" : {
  ...
  "status" : {
    "code" : <value>,
    "message" : "some summary goes here"
  }
}

<value>, by increasing criticality level:

• 0: working as intended
• 1: temporarily wrong. Being auto-repaired
• 2: unknown. Potentially dangerous state
• 4: problems. External intervention required
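A monitoring consumer might map these codes to actions as sketched below; the threshold chosen here (code >= 2 pages a human) is this sketch's own policy, not something Ganeti prescribes:

```python
# Status codes of status-reporting collectors, by criticality (see above).
STATUS_MEANINGS = {
    0: "working as intended",
    1: "temporarily wrong, being auto-repaired",
    2: "unknown, potentially dangerous state",
    4: "problems, external intervention required",
}

def needs_operator(data):
    """Return True when the embedded status calls for a human.

    Treating code >= 2 as actionable is an illustrative policy:
    codes 0 and 1 either work or are being auto-repaired.
    """
    return data["status"]["code"] >= 2

report_data = {"status": {"code": 1, "message": "resyncing DRBD disk"}}
```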

17/53

SLIDE 18

How to use the daemon?

• Accepts HTTP connections on node.example.com:1815
• GET requests to specific addresses
  • Each address returns different info according to the API
• Not authenticated: read only
  • Just firewall it, or bind on a local address only

/ (returns the list of supported protocol versions)
/1/list/collectors
/1/report/all
/1/report/[category]/[collector_name]
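Since the daemon speaks plain HTTP and JSON, a client fits in a few lines. This is a sketch against the endpoints listed above; the node name is the slide's placeholder, and no authentication is needed (or possible):

```python
import json
from urllib.request import urlopen

# Placeholder node name from the slide; the port is the daemon's default.
BASE = "http://node.example.com:1815"

def report_url(base, category, collector):
    """Build the URL for a single collector's report."""
    return "%s/1/report/%s/%s" % (base, category, collector)

def fetch_json(url):
    """GET a URL and decode the JSON body (all replies are JSON)."""
    with urlopen(url) as resp:
        return json.load(resp)

# Usage: fetch_json(BASE + "/1/list/collectors") lists the collectors;
# fetch_json(report_url(BASE, "storage", "drbd")) would fetch one report.
```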

18/53

SLIDE 19

Configuration Daemon (confd)

How is your cluster supposed to look?

SLIDE 20

Before confd

• Configuration only available on master candidates
• Few selected values replicated with ssconf
  • Small pieces of config in text files on all the nodes
  • Doesn't scale
• Need for a way to access the config from other nodes
  • Scalable
  • No single point of failure (so, no RAPI)

SLIDE 21

What does confd do?

• Provides information from config.data
  • Read-only
  • Distributed
  • Optional
• Multiple daemons running on master candidates
• Accessible from all the nodes through the confd protocol
• Resilient to failures

SLIDE 22

What info does it provide?

Replies to simple queries:
• Ping
• Master IP
• Node role
• Node primary IP
• Master candidates primary IPs
• Instance IPs
• Node primary IP from instance primary IP
• Node DRBD minors
• Node instances

SLIDE 23

confd protocol

General description

• UDP (port 1814)
• keyed-Hash Message Authentication Code (HMAC) authentication
  • Pre-shared, cluster-wide key
  • Generated at cluster-init
  • Root-only readable
• Timestamp
  • Checked (± 2.5 mins) to prevent replay attacks
  • Used as HMAC salt
• Queries made to any subset of master candidates
• Timeout
• Maximum number of expected replies

SLIDE 24

Confd protocol

Request/Reply

(Diagram: the client sends the same request to several master candidates in parallel.)

SLIDE 25

Confd protocol

Request/Reply

(Diagram: replies arrive carrying config versions (v: 56, v: 57, ...); the client stops once it has enough replies or the timeout fires.)

SLIDE 26

confd protocol

Request

plj0{
  "msg": "{\"type\": 1, \"rsalt\": \"9aa6ce92-8336-11de-af38-001d093e835f\", \"protocol\": 1, \"query\": \"node1.example.com\"}\n",
  "salt": "1249637704",
  "hmac": "4a4139b2c3c5921f7e439469a0a45ad200aead0f"
}

• plj0: fourcc detailing the message content (PLain Json 0)
• hmac: HMAC signature of salt+msg with the cluster HMAC key

SLIDE 27

confd protocol

Request

plj0{
  "msg": "{\"type\": 1, \"rsalt\": \"9aa6ce92-8336-11de-af38-001d093e835f\", \"protocol\": 1, \"query\": \"node1.example.com\"}\n",
  "salt": "1249637704",
  "hmac": "4a4139b2c3c5921f7e439469a0a45ad200aead0f"
}

• msg: JSON-encoded query
  • protocol: confd protocol version (=1)
  • type: what to ask for (CONFD_REQ_* constants)
  • query: additional parameters
  • rsalt: response salt == UUID identifying the request
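The request layout above can be assembled in a few lines. This is an illustrative sketch, not Ganeti's client code: the key and query values are made up (the real cluster key lives in a root-only file created at cluster-init), and SHA-1 is assumed from the 40-hex-digit signatures in the examples:

```python
import hashlib
import hmac
import json
import time
import uuid

def build_request(key, req_type, query):
    """Assemble a signed confd request in the plj0 layout shown above."""
    msg = json.dumps({
        "type": req_type,            # one of the CONFD_REQ_* constants
        "rsalt": str(uuid.uuid4()),  # UUID identifying this request
        "protocol": 1,               # confd protocol version
        "query": query,              # query-specific parameters
    }) + "\n"
    salt = str(int(time.time()))     # timestamp, doubling as HMAC salt
    sig = hmac.new(key, (salt + msg).encode(), hashlib.sha1).hexdigest()
    payload = json.dumps({"msg": msg, "salt": salt, "hmac": sig})
    return "plj0" + payload          # fourcc: PLain Json 0

packet = build_request(b"cluster-hmac-key", 1, "node1.example.com")
```

Verifying a reply is symmetric: recompute the HMAC over salt+msg and check that the reply's salt equals the rsalt of the request.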


SLIDE 28

confd protocol

Reply

plj0{
  "msg": "{\"status\": 0, \"answer\": 0, \"serial\": 42, \"protocol\": 1}\n",
  "salt": "9aa6ce92-8336-11de-af38-001d093e835f",
  "hmac": "aaeccc0dff9328fdf7967cb600b6a80a6a9332af"
}

• salt: the rsalt of the query
• hmac: HMAC signature of salt+msg

SLIDE 29

confd protocol

Reply

plj0{
  "msg": "{\"status\": 0, \"answer\": 0, \"serial\": 42, \"protocol\": 1}\n",
  "salt": "9aa6ce92-8336-11de-af38-001d093e835f",
  "hmac": "aaeccc0dff9328fdf7967cb600b6a80a6a9332af"
}

• msg: JSON-encoded answer
  • protocol: protocol version (=1)
  • status: 0=ok; 1=error
  • answer: query-specific reply
  • serial: version of config.data

SLIDE 30

Ready-made clients

• The protocol is simple, but clients are simpler
• Ready-to-use confd clients, since Ganeti 2.7:
  • Python: lib/confd/client.py
  • Haskell: src/Ganeti/ConfD/Client.hs, src/Ganeti/ConfD/ClientFunctions.hs

SLIDE 31

Expanding confd capabilities

• Currently not so many queries are supported
• Easy to add new ones:
  • Just add a new query type to the constants list
  • ...and extend the buildResponse function (src/Ganeti/Confd/Server.hs) to reply to it in the appropriate way

SLIDE 32

Ganeti and Networks

How do your instances talk to the world?

Some slides contributed by Dimitris Aragiorgis <dimara@grnet.gr>

SLIDE 33

NIC configuration management

• Current NICs: MAC + IP + link + mode
  • mode=bridged uses brctl addif
• Hooks can deal with firewall rules, and more
• External systems needed for DHCP, IPv6, etc.
• Which VMs are on the same collision domain?
• Which IP is free for a new VM to use?

SLIDE 34

gnt-network overview

• Manage collision domains for your instances
• Easy way to assign IPs to instances
  • If resources are shared between multiple clusters, allocation must be done externally
• Keep existing per-NIC flexibility
• Hide underlying infrastructure
• Better networking overview

SLIDE 35

gnt-network: Who does what?

• masterd: config.data integrity
  • Abstract network infrastructure: network + netparams per nodegroup
  • IP uniqueness inside a network: IP pool management (bitarray, TemporaryReservationManager, locking)
  • Encapsulate network information in NIC objects: RPC
• External scripts and hooks:
  • Use the exported environment provided by noded
  • brctl, iptables, ebtables, ip rule, etc.
  • Update external DHCP/DNS server entries
  • Let the VM act unaware of the "situation" (dhclient, etc.)
  • ping vm1.ganeti.example.com

SLIDE 36

gnt-network + external scripts

• gnt-network alone is nothing more than a nice config.data
• snf-network: node-level scripts and hooks
• nfdhcpd: node-level DHCP server based on NFQUEUE

SLIDE 37

snf-network

Node-level scripts and hooks:

• Overrides Ganeti default scripts (kvm-ifup, vif-ganeti)
• Looks for specific tag types in the NIC's network
• Applies corresponding rules
• Creates nfdhcpd binding files
• Provides a hook to update DNS entries

SLIDE 38

nfdhcpd

Node-level DHCP server based on NFQUEUE:

• Listens on a specific NFQUEUE
• inotify on a specific directory for binding files
• Updates its leases DB
• Mangles DHCP requests and replies based on its DB
• Responds to RS and NS for IPv6 auto-configuration

SLIDE 39

gnt-network

Examples

Create and connect a new network, then create an instance inside this network:

gnt-network add --network 192.168.1.0/24 --gateway 192.168.1.1 --tags nfdhcpd net1
gnt-network connect net1 bridged prv0
gnt-instance add --net 0:ip=pool,network=net1 ... inst1
gnt-instance info inst1
gnt-network info net1

SLIDE 40

gnt-network + snf-*

Examples

Use snf-network and nfdhcpd, then test connectivity:

apt-get install snf-network nfdhcpd
iptables -t mangle -A PREROUTING -i prv+ -p udp -m udp --dport 67 \
  -j NFQUEUE --queue-num 42
ip addr add 192.168.1.1/24 dev prv0
gnt-instance reboot inst1
ping 192.168.1.2

SLIDE 41

References

• snf-network: http://code.grnet.gr/git/snf-network
• nfdhcpd: http://code.grnet.gr/git/snf-nfdhcpd

SLIDE 42

Ganeti ExtStorage Interface

More options for your data

Some slides contributed by Constantinos Venetsanopoulos <cven@grnet.gr>

SLIDE 43

State before the ExtStorage Interface

• Non-mirrored templates: plain, file
• Internally mirrored templates: drbd
• Externally mirrored templates: sharedfile, rbd, blockdev, diskless

SLIDE 44

Ganeti and external SAN/NAS appliances

• Instance disks residing inside an external SAN/NAS appliance visible by all Ganeti nodes (e.g. NetApp, EMC, IBM)
• Instances should be able to migrate/failover to any node that can access the appliance
• Ganeti should integrate with external SAN/NAS appliances in a generic way, independent of the appliance itself, in the easiest possible way from the admin's perspective

SLIDE 45

Introducing the 'ExtStorage Interface'

• A simple interface inspired by the Ganeti OS interface
• To plug an appliance into Ganeti, we need a corresponding 'ExtStorage provider': a set of scripts residing under a directory
• e.g. /usr/share/ganeti/extstorage/provider1/

SLIDE 46

ExtStorage provider methods

Every ExtStorage provider should provide the following methods:

• Create a disk on the appliance
• Remove a disk from the appliance
• Grow a disk on the appliance
• Attach a disk to a given Ganeti node
• Detach a disk from a given Ganeti node
• SetInfo on a disk (add metadata)
• Verify the provider's supported parameters

SLIDE 47

ExtStorage provider scripts

The methods are implemented in the corresponding 7 executable scripts, using appliance-specific tools:

• Input via environment variables, e.g. VOL_NAME, VOL_SIZE
• attach returns a block device path on success

# ls -l /usr/share/ganeti/extstorage/provider1
create remove grow attach detach setinfo verify
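To make the contract concrete, here is an illustrative skeleton of an attach script. The device-path scheme and the missing appliance calls are invented; a real provider would invoke its appliance's own tools. Only the environment-variable input (VOL_NAME) and the printed block device path follow the interface described above:

```python
#!/usr/bin/env python3
# Hypothetical ExtStorage "attach" script skeleton.
import os
import sys

def attach(vol_name):
    """Map the volume on this node and return its block device path.

    The /dev/mapper/<name> scheme is an assumption for illustration;
    appliance-specific mapping work would happen here.
    """
    device = "/dev/mapper/%s" % vol_name
    # ... call the appliance's tools to expose the volume ...
    return device

if __name__ == "__main__":
    # Ganeti passes the disk's parameters via environment variables.
    vol_name = os.environ.get("VOL_NAME")
    if not vol_name:
        sys.exit("VOL_NAME not set")
    # attach must print a block device path on success.
    print(attach(vol_name))
```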


SLIDE 48

The new 'ext' template

• Introduce a new externally mirrored disk template: ext
• Introduce a new disk option: provider

SLIDE 49

Using the interface

Example

Assuming two appliances visible by a Ganeti cluster, and their two ExtStorage providers installed on all Ganeti nodes:

/usr/share/ganeti/extstorage/emc/*
/usr/share/ganeti/extstorage/ibm/*

# gnt-instance add -t ext --disk=0:size=2G,provider=emc
# gnt-instance add -t ext --disk=0:size=2G,provider=emc \
    --disk=1:size=1G,provider=emc \
    --disk=2:size=10G,provider=ibm
# gnt-instance modify --disk 3:add,size=20G,provider=ibm
# gnt-instance migrate testvm1
# gnt-instance migrate -n nodeX.example.com testvm1

SLIDE 50

ExtStorage Interface dynamic parameters

Support for dynamic passing of arbitrary per-disk parameters to ExtStorage providers during instance creation/modification:

# gnt-instance add -t ext --disk=0:size=2G,provider=emc,param1=value1,param2=value2 \
    --disk=1:size=10G,provider=ibm,param3=value3,param4=value4
# gnt-instance modify --disk 2:add,size=3G,provider=emc,param5=value5

The above parameters will be exported to the ExtStorage provider's scripts as environment variables:

EXTP_PARAM1 = str(value1)
EXTP_PARAM2 = str(value2)
...

SLIDE 51

The new 'gnt-storage' client

Inspired by gnt-os:

# gnt-storage diagnose
# gnt-storage info

SLIDE 52

Some images borrowed / modified from Lance Albertson, Iustin Pop,