Ganeti, the New and Arcane
ganeti's best kept secrets, and exciting new developments
Ganeti Eng Team - Google LinuxCon Japan 2014 - 2 Feb 2014
Introduction to Ganeti
A cluster virtualization manager, in one slide
What is Ganeti?
· Manage clusters of 1-200 physical machines, divided in nodegroups
· Deploy Xen/KVM/LXC virtual machines on them
· Controlled via command line, REST, web interfaces
· Live migration
· Resiliency to failure (DRBD, Ceph, SAN/NAS, ...)
· Cluster balancing
· Ease of repairs and hardware swaps
4/53
Newest features
Development status
2.10
The very stable release
· Improved upgrade procedure: "gnt-cluster upgrade"
· CPU load in hail/hbal (GSoC project)
· Hotplug support (KVM)
· RBD storage direct access (KVM)
· Better Open vSwitch support (GSoC project)
6/53
2.11
The latest stable release
· Faster instance moves
· GlusterFS support
· hsqueeze (achieve maximum cluster compaction)
7/53
2.12 and future
The next stable release(s)
· Jobs as processes
· New install model
· More secure master candidates
· Better container support (GSoC)
· Resource reservation/extra parallelization
· Generic conversion between disk templates (GSoC)
8/53
Monitoring daemon
What's going on in your cluster?
Monitoring a cluster
The old school way
(Diagram: the monitoring system and other external systems query each cluster component directly: master node, instances, storage, NICs)
10/53
Monitoring a cluster
Using the monitoring daemon
(Diagram: the monitoring system and other external systems talk to the per-node monitoring daemons instead of querying the cluster components directly)
11/53
What is the monitoring daemon?
Provides information about the cluster state/health:
· live
· read-only
· design doc: design-monitoring-agent.rst
12/53
More details
· HTTP daemon
  · Replying to REST-like queries (actually, GET only)
  · Providing JSON replies
    · Easy to parse in any language
    · Already used in all the rest of Ganeti
· Running on every node (not only master-candidates or VM-enabled nodes)
· Additionally: mon-collector, a quick 'n dirty CLI tool
13/53
Data collectors
Provide data to the daemon
· One collector, one report
· One collector, one category: storage, hypervisor, daemon, instance
· Two kinds: performance reporting, status reporting
· New feature: stateful data collectors
14/53
Data collectors
What data can be retrieved right now?
Now:
· Instance status (Xen only) (category: instance)
· Diskstats information (storage)
· LVM logical volumes information (storage)
· DRBD status information (storage)
· Node OS CPU load average (no category, default)
Soon(-ish):
· Instance status for KVM (instance)
· Ganeti daemons status (daemon)
· Hypervisor resources (hypervisor)
· Node OS resources report (default)
15/53
The report format
{
  "name" : "TheCollectorIdentifier",
  "version" : "1.2",
  "format_version" : 1,
  "timestamp" : 1351607182000000000,
  "category" : null,
  "kind" : 0,
  "data" : { "plugin_specific_data" : "go_here" }
}
JSON
· name: the name of the plugin. Unique string.
· version: the version of the plugin. A string.
· format_version: the version of the data format of the plugin. Incremental integer.
· timestamp: when the report was produced. Nanoseconds. Can be zero-padded.
16/53
Status reporting collectors: report
They introduce a mandatory part inside the data section.
"data" : {
  ...
  "status" : {
    "code" : <value>,
    "message" : "some summary goes here"
  }
}
JSON
<value>, by increasing criticality level:
· 0: working as intended
· 1: temporarily wrong. Being auto-repaired
· 2: unknown. Potentially dangerous state
· 4: problems. External intervention required
17/53
How to use the daemon?
· Accepts HTTP connections on node.example.com:1815
· GET requests to specific addresses
  · Each address returns different info according to the API
· Not authenticated: read only
  · Just firewall, or bind on local address only
/ (returns the list of supported protocol versions)
/1/list/collectors
/1/report/all
/1/report/[category]/[collector_name]
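These endpoints are plain HTTP plus JSON, so any language can consume them. A minimal Python sketch (the node name is a placeholder, and `failing_reports` is a helper invented here, not part of Ganeti):

```python
import json
import urllib.request

def mond_get(node, path):
    """GET a path from the monitoring daemon on the given node (port 1815)."""
    with urllib.request.urlopen("http://%s:1815%s" % (node, path)) as resp:
        return json.load(resp)

def failing_reports(reports):
    """Pick out status reports whose code is non-zero.

    Codes, by increasing criticality: 0 working as intended,
    1 temporarily wrong (being auto-repaired), 2 unknown,
    4 external intervention required."""
    bad = []
    for report in reports:
        status = report.get("data", {}).get("status")
        if status is not None and status["code"] != 0:
            bad.append((report["name"], status["code"], status["message"]))
    return bad

# Against a live cluster one would run, e.g.:
#   for name, code, msg in failing_reports(
#           mond_get("node1.example.com", "/1/report/all")):
#       print("%s: [%d] %s" % (name, code, msg))
```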
18/53
Configuration Daemon (confd)
What's your cluster supposed to look like?
Before confd
· Configuration only available on master candidates
· Few selected values replicated with ssconf
  · Small pieces of config in text files on all the nodes
  · Doesn't scale
· Need for a way to access config from other nodes
  · Scalable
  · No single point of failure (so, no RAPI)
20/53
What does confd do?
· Provides information from config.data
· Read-only
· Distributed
· Optional
· Multiple daemons running on master candidates
· Accessible from all the nodes through the confd protocol
· Resilient to failures
21/53
What info does it provide?
Replies to simple queries:
· Ping
· Master IP
· Node role
· Node primary IP
· Master candidates primary IPs
· Instance IPs
· Node primary IP from instance primary IP
· Node DRBD minors
· Node instances
22/53
confd protocol
General description
· UDP (port 1814)
· Keyed-Hash Message Authentication Code (HMAC) authentication
  · Pre-shared, cluster-wide key
  · Generated at cluster-init
  · Root-only readable
· Timestamp
  · Checked (± 2.5 mins) to prevent replay attacks
  · Used as HMAC salt
· Queries made to any subset of master candidates
· Timeout
· Maximum number of expected replies
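Putting these pieces together, a hedged Python sketch of building a signed request (`build_request` is a name invented here; SHA1 is an assumption, inferred from the 40-hex-digit signatures in the example messages that follow):

```python
import hashlib
import hmac
import json
import time
import uuid

FOURCC = "plj0"  # PLain Json 0

def build_request(key, req_type, query):
    """Serialize and sign a confd request.

    key: the pre-shared, cluster-wide HMAC key (root-only readable).
    req_type: one of the CONFD_REQ_* constants.
    The rsalt UUID identifies the request; the matching reply
    echoes it back as its own salt."""
    msg = json.dumps({
        "type": req_type,
        "rsalt": str(uuid.uuid4()),
        "protocol": 1,
        "query": query,
    }) + "\n"
    # The salt is a timestamp, checked within ~2.5 minutes on the
    # server side to prevent replay attacks.
    salt = str(int(time.time()))
    sig = hmac.new(key, (salt + msg).encode(), hashlib.sha1).hexdigest()
    payload = json.dumps({"msg": msg, "salt": salt, "hmac": sig})
    return (FOURCC + payload).encode()  # sent over UDP to port 1814
```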
23/53
Confd protocol
Request/Reply
(Diagram: the client sends the request in parallel to several master candidates)
24/53
Confd protocol
Request/Reply
(Diagram: replies arrive carrying config versions 56 and 57; the client stops once enough replies arrive or the timeout fires, preferring the highest version)
25/53
confd protocol
Request
plj0{
  "msg": "{\"type\": 1, \"rsalt\": \"9aa6ce92-8336-11de-af38-001d093e835f\", \"protocol\": 1, \"query\": \"node1.example.com\"}\n",
  "salt": "1249637704",
  "hmac": "4a4139b2c3c5921f7e439469a0a45ad200aead0f"
}
CONFD
· plj0: fourcc detailing the message content (PLain Json 0)
· hmac: HMAC signature of salt+msg with the cluster hmac key
26/53
confd protocol
Request
plj0{
  "msg": "{\"type\": 1, \"rsalt\": \"9aa6ce92-8336-11de-af38-001d093e835f\", \"protocol\": 1, \"query\": \"node1.example.com\"}\n",
  "salt": "1249637704",
  "hmac": "4a4139b2c3c5921f7e439469a0a45ad200aead0f"
}
CONFD
· msg: JSON-encoded query
  · protocol: confd protocol version (=1)
  · type: what to ask for (CONFD_REQ_* constants)
  · query: additional parameters
  · rsalt: response salt == UUID identifying the request
27/53
confd protocol
Reply
plj0{
  "msg": "{\"status\": 0, \"answer\": 0, \"serial\": 42, \"protocol\": 1}\n",
  "salt": "9aa6ce92-8336-11de-af38-001d093e835f",
  "hmac": "aaeccc0dff9328fdf7967cb600b6a80a6a9332af"
}
CONFD
· salt: the rsalt of the query
· hmac: HMAC signature of salt+msg
28/53
confd protocol
Reply
plj0{
  "msg": "{\"status\": 0, \"answer\": 0, \"serial\": 42, \"protocol\": 1}\n",
  "salt": "9aa6ce92-8336-11de-af38-001d093e835f",
  "hmac": "aaeccc0dff9328fdf7967cb600b6a80a6a9332af"
}
CONFD
· msg: JSON-encoded answer
  · protocol: protocol version (=1)
  · status: 0=ok; 1=error
  · answer: query-specific reply
  · serial: version of config.data
29/53
Ready-made clients
· The protocol is simple, but clients are simpler
· Ready-to-use confd clients (since Ganeti 2.7):
  · Python: lib/confd/client.py
  · Haskell: src/Ganeti/ConfD/Client.hs, src/Ganeti/ConfD/ClientFunctions.hs
30/53
Expanding confd capabilities
· Currently not so many queries are supported
· Easy to add new ones:
  · Just add a new query type in the constants list
  · ...and extend the buildResponse function (src/Ganeti/Confd/Server.hs) to reply to it in the appropriate way
31/53
Ganeti and Networks
How do your instances talk to the world?
Some slides contributed by Dimitris Aragiorgis <dimara@grnet.gr>
NIC Configuration Management
· Current NICs: MAC + IP + link + mode
  · mode=bridged uses brctl addif
· Hooks can deal with firewall rules, and more
· External systems needed for DHCP, IPv6, etc.
· Which VMs are on the same collision domain?
· Which IP is free for a new VM to use?
33/53
gnt-network overview
· Manage collision domains for your instances
· Easy way to assign IPs to instances
· Keep existing per-NIC flexibility
· Hide underlying infrastructure
· Better networking overview
· If resources are shared in multiple clusters, allocation must be done externally
34/53
gnt-network: Who does what?
· masterd: config.data integrity
  · encapsulate network information in NIC objects: RPC
· IP uniqueness inside network: IP pool management
  · bitarray, TemporaryReservationManager, locking
· abstract network infrastructure: network + netparams per nodegroup
  · brctl, iptables, ebtables, ip rule, etc.
· external scripts and hooks: ping vm1.ganeti.example.com
  · use exported environment provided by noded
  · update external DHCP/DNS server entries
  · let the VM act unaware of the "situation" (dhclient, etc.)
35/53
gnt-network + external scripts
· gnt-network alone is nothing more than a nice config.data
· snf-network: node-level scripts and hooks
· nfdhcpd: node-level DHCP server based on NFQUEUE
36/53
snf-network
node level scripts and hooks
· Overrides Ganeti default scripts (kvm-ifup, vif-ganeti)
· Looks for specific tag types in the NIC's network
· Applies corresponding rules
· Creates nfdhcpd binding files
· Provides hook to update DNS entries
37/53
nfdhcpd
node level DHCP server based on NFQUEUE
· Listens on a specific NFQUEUE
· inotify on a specific directory for binding files
· Updates its leases DB
· Mangles DHCP requests and replies based on its DB
· Responds to RS and NS for IPv6 auto-configuration
38/53
gnt-network
Examples
Create and connect a new network Create an instance inside this network
gnt-network add --network 192.168.1.0/24 --gateway 192.168.1.1 --tags nfdhcpd net1
gnt-network connect net1 bridged prv0
gnt-instance add --net 0:ip=pool,network=net1 ... inst1
gnt-instance info inst1
gnt-network info net1
39/53
gnt-network + snf-*
Examples
Use snf-network and nfdhcpd Test connectivity
apt-get install snf-network nfdhcpd
iptables -t mangle -A PREROUTING -i prv+ -p udp -m udp --dport 67 \
  -j NFQUEUE --queue-num 42
ip addr add 192.168.1.1/24 dev prv0
gnt-instance reboot inst1
ping 192.168.1.2
40/53
References
· snf-network: http://code.grnet.gr/git/snf-network
· nfdhcpd: http://code.grnet.gr/git/snf-nfdhcpd
41/53
Ganeti ExtStorage Interface
More options for your data
Some slides contributed by Constantinos Venetsanopoulos <cven@grnet.gr>
State before the ExtStorage Interface
· Non-mirrored templates: plain, file
· Internally mirrored templates: drbd
· Externally mirrored templates: sharedfile, rbd, blockdev, diskless
43/53
Ganeti and external SAN/NAS appliances
· Instance disks residing inside an external SAN/NAS appliance visible by all Ganeti nodes (e.g. NetApp, EMC, IBM)
· Instances should be able to migrate/failover to any node that can access the appliance
· Ganeti should integrate with external SAN/NAS appliances in a generic way, independent of the appliance itself, in the easiest possible way from the admin's perspective
44/53
Introducing the 'ExtStorage Interface'
· A simple interface inspired by the Ganeti OS interface
· To plug an appliance into Ganeti, we need a corresponding 'ExtStorage provider': a set of scripts residing under a directory
· e.g. /usr/share/ganeti/extstorage/provider1/
45/53
ExtStorage provider methods
Every ExtStorage provider should provide the following methods:
· Create a disk on the appliance
· Remove a disk from the appliance
· Grow a disk on the appliance
· Attach a disk to a given Ganeti node
· Detach a disk from a given Ganeti node
· SetInfo on a disk (add metadata)
· Verify the provider's supported parameters
46/53
ExtStorage provider scripts
The methods are implemented by the 7 corresponding executable scripts, using appliance-specific tools:
· Input via environment variables, e.g. VOL_NAME, VOL_SIZE
· attach returns a block device path on success

# ls -l /usr/share/ganeti/extstorage/provider1
create  remove  grow  attach  detach  setinfo  verify
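As an illustration only, a skeletal `attach` script in Python. The device-path naming scheme is entirely hypothetical; a real provider would invoke its appliance's own tools (vendor CLI, iSCSI login, etc.):

```python
import os

def attach(vol_name):
    """Map a Ganeti volume name to the appliance's block device path.

    Hypothetical naming scheme: the appliance exposes each volume
    under /dev/mapper/<provider>-<volume>."""
    return "/dev/mapper/provider1-%s" % vol_name

def main(environ):
    """Entry point: the ExtStorage interface passes its input via
    environment variables such as VOL_NAME and VOL_SIZE."""
    vol_name = environ.get("VOL_NAME")
    if not vol_name:
        return 1  # non-zero exit tells Ganeti the attach failed
    # On success, attach must print the block device path.
    print(attach(vol_name))
    return 0

# A real script would end with: sys.exit(main(os.environ))
```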
47/53
The new 'ext' template
· Introduce a new externally mirrored disk template: ext
· Introduce a new disk option: provider
48/53
Using the interface
Example
Assuming two appliances visible by a Ganeti cluster and their two ExtStorage providers installed on all Ganeti nodes:
/usr/share/ganeti/extstorage/emc/*
/usr/share/ganeti/extstorage/ibm/*

# gnt-instance add -t ext --disk=0:size=2G,provider=emc
# gnt-instance add -t ext --disk=0:size=2G,provider=emc \
    --disk=1:size=1G,provider=emc \
    --disk=2:size=10G,provider=ibm
# gnt-instance modify --disk 3:add,size=20G,provider=ibm
# gnt-instance migrate testvm1
# gnt-instance migrate -n nodeX.example.com testvm1
49/53
ExtStorage Interface dynamic parameters
Support for dynamic passing of arbitrary parameters to ExtStorage providers during instance creation/modification, per-disk:

# gnt-instance add -t ext --disk=0:size=2G,provider=emc,param1=value1,param2=value2 \
    --disk=1:size=10G,provider=ibm,param3=value3,param4=value4
# gnt-instance modify --disk 2:add,size=3G,provider=emc,param5=value5

The above parameters will be exported to the ExtStorage provider's scripts as environment variables:

EXTP_PARAM1 = str(value1)
EXTP_PARAM2 = str(value2)
...
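Inside a provider script, these variables might be collected like so (a sketch; `extp_params` is a helper name invented here):

```python
import os

def extp_params(environ=None):
    """Collect the EXTP_* variables that Ganeti exports to a
    provider script, e.g. EXTP_PARAM1=value1 -> {"param1": "value1"}."""
    if environ is None:
        environ = os.environ
    prefix = "EXTP_"
    return {name[len(prefix):].lower(): value
            for name, value in environ.items()
            if name.startswith(prefix)}
```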
50/53
The new 'gnt-storage' client
Inspired by gnt-os:
# gnt-storage diagnose # gnt-storage info
51/53
Some images borrowed / modified from Lance Albertson, Iustin Pop,