Rahti container cloud service
Aim of this afternoon:

    $ aragorn GCF_000002945.1_ASM294v2_genomic.fna
    ARAGORN v1.2.38 Dean Laslett

    Please reference the following paper if you use this program as part of any published research.
    Laslett, D. and Canback, B. (2004) ARAGORN, a program for the detection of transfer RNA and transfer-messenger RNA genes in nucleotide sequences. Nucleic Acids Research, 32;11-16.

    Searching for tRNA genes with no introns
    Searching for tmRNA genes
    Assuming circular topology, search wraps around ends
    Searching both strands
    Using standard genetic code

    NC_003424.3 Schizosaccharomyces pombe chromosome I, complete sequence
    5579133 nucleotides in sequence
    Mean G+C content = 36.1%

    1.
Part 1: Background
Rahti is a
- container cloud
- Platform as a Service (PaaS)
- based on OpenShift, Red Hat's distribution of Kubernetes
Allows
- provisioning servers based on container technology with a JSON API or web console
Containers
A container is a mechanism that encapsulates a vanilla collection of Linux resources for an application to use:
Containers
Own network, filesystem, process ids, user ids
    / $ ifconfig
    eth0      Link encap:Ethernet  HWaddr 0A:58:0A:80:06:72
              inet addr:10.128.6.114  Bcast:10.128.7.255  Mask:255.255.254.0
              inet6 addr: fe80::d4d4:38ff:fe5e:6e2b/64 Scope:Link
              UP BROADCAST RUNNING MULTICAST  MTU:1450  Metric:1
              RX packets:8 errors:0 dropped:0 overruns:0 frame:0
              TX packets:8 errors:0 dropped:0 overruns:0 carrier:0
              collisions:0 txqueuelen:0
              RX bytes:656 (656.0 B)  TX bytes:656 (656.0 B)

    lo        Link encap:Local Loopback
              inet addr:127.0.0.1  Mask:255.0.0.0
              inet6 addr: ::1/128 Scope:Host
              UP LOOPBACK RUNNING  MTU:65536  Metric:1
              RX packets:0 errors:0 dropped:0 overruns:0 frame:0
              TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
              collisions:0 txqueuelen:1000
              RX bytes:0 (0.0 B)  TX bytes:0 (0.0 B)
Containers
Own network, filesystem, process ids, user ids
    sh-4.2$ ls
    anaconda-post.log  bin  data  dev  etc  home  lib  lib64  media  mnt  opt  proc  root  run  s
Containers
Own network, filesystem, process ids and user ids, ...
    sh-4.2$ ps axu
    USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
    1016530+     1  1.2  0.0  11680  1168 ?        Ss   10:49   0:00 sh -c (tail -f /dev/null)
    1016530+     7  0.0  0.0   4396   356 ?        S    10:49   0:00 tail -f /dev/null
    1016530+     8  0.3  0.0  11816  1700 ?        Ss   10:49   0:00 /bin/sh
    1016530+    15  0.0  0.0  51740  1732 ?        R+   10:49   0:00 ps axu
Rahti does not allow running containers as root. It always assigns a varying, non-root user ID (such as the 1016530+ user in the process listing). This is to prevent security issues.
Containers
- They have the look and feel of a lightweight virtual machine, but they are not virtual machines
- Rely on Linux kernel features
- Standardized container images
  - Build once, run everywhere
  - Only Linux-based images
- Standards: Docker, rkt, LXC, Singularity, Kata Containers, Intel Clear Containers
- Rahti supports Docker images
Containers enable
- Running software with conflicting requirements on the same server
  - Run an "Ubuntu" software stack on a CentOS host
- Security hardening
  - Expose a minimal amount of data to the container
  - Smaller container image → smaller attack surface → easier to maintain
Demo: Docker CLI shell
Rahti
- OpenShift "community edition": OKD, The Origin Community Distribution of Kubernetes that powers Red Hat OpenShift
- A Kubernetes implementation
  - Kubernetes was originally developed at Google
  - Now maintained by the Cloud Native Computing Foundation
- OpenShift skills translate to Kubernetes skills and vice versa
- The terms OpenShift and Kubernetes are often used interchangeably, but OpenShift has some additional features that Kubernetes lacks
- Is a container orchestration platform that allows running Docker container images
Rahti use cases
- Databases
- Web services
- Computation
- Weird software stacks
- High Availability services
- Anything that runs as a container
- One-shot runs (← today's use case)
- Anything that runs in a container and requires a modest amount of CPU/RAM/disk:

    #(cpu)    RAM GB    Disk GB
    ⪅ 2       ⪅ 8       ⪅ 100 … 1000
Part 2: Running workloads in Rahti
Running containers in Kubernetes: Pods
[Figure: a Pod (IP: 10.0.0.1) running on a physical compute node, with containers containera and containerb. Each container has a root volume holding the application binary and dependencies. The Pod's volumes volumea and volumeb are mounted into the containers (mount points shown: /tmp, /outputdata/, /input, /interm), and persistent volume claim pvca is backed by the storage cluster.]

- Pod manages multiple containers
- Announces mountable volumes from persistent storage claims
- They all run physically near each other
- Containers in a pod share IP and memory
- Data in containers is ephemeral: the container is reset when it is killed and restarted
- Root volume is located on the compute node: SSD disk, no redundancy
- Persistent disk using volume mounts
Object definitions in Kubernetes
- Objects are defined as key-value maps
- Representation in the YAML language
- Indentation matters: no tabs, 2 spaces suggested

[Figure: Pod (IP: 10.0.0.1) with containera (root volume with application binary and dependencies) and volume volumea, backed by persistent volume claim pvca and mounted at /data.]

    apiVersion: v1
    kind: Pod
    metadata:
      name: simple
      labels:
        job: analyze
    spec:
      volumes:
      - name: volumea
        persistentVolumeClaim:
          claimName: pvca
      containers:
      - name: containera
        image: centos:7
        volumeMounts:
        - mountPath: /data
          name: volumea
Brief intro to YAML files
YAML is an intermediate data language based on key-value pairs and lists:
- Just a value is a valid YAML file:

    "this is a valid yaml file"

- Key and value are separated by a colon ":" (a value on its own line must be indented!):

    key:
      value

  ⇔

    key: value

- Lists are written with "[" and "]" or with "-" symbols:

    list:
    - value 1
    - value 2

  ⇔

    list:
      - value 1
      - value 2

  ⇔

    list: [value 1, value 2]
Brief intro to YAML files
Combining these we get hierarchical structures:
    key:
      subkey: value of subkey
      subkey2: value of subkey2
      subkey3:
      - this
      - is
      - a
      - list
    key2: value for key2
Object definitions in Kubernetes: Pods

    apiVersion: v1            # Header: which version of the API?
    kind: Pod                 # Kind of the object
    metadata:                 # Assign it a name and some labels
      name: simple
      labels:
        job: analyze
    spec:                     # Specification of the Pod
      volumes:                # Define volumes to be brought to the Pod
      - name: volumea
        persistentVolumeClaim:
          claimName: pvca
      containers:             # Define containers in the Pod; there can be multiple, this is a list!
      - name: containera
        image: centos:7
        volumeMounts:         # Define where the volume is mounted in the container
        - mountPath: /data
          name: volumea
How to submit a Pod to Rahti?
- Use the oc command line tool
- Write the YAML file
- Submit with:

    oc create -f pod.yaml
Demo: Submitting Pod to Rahti
Did it work?
- Web console
- Command line:

    oc describe pod simple
Persistent volume claims - How to claim storage from the storage cluster?
- Web console
- Using a YAML specification file:

    apiVersion: v1
    kind: PersistentVolumeClaim
    metadata:
      name: pvca
    spec:
      accessModes:
      - ReadWriteOnce
      resources:
        requests:
          storage: 1Gi
Back to the Pod demo
- Does it work now?
- OpenShift will run the container over and over again, but there's nothing to execute.
- We can specify a command to run in the container.

    $ oc describe pod simple
    ...
    Events:
      Type     Reason            Age               From                     Message
      Warning  FailedScheduling  1m (x15 over 4m)  default-scheduler        persistentvolumeclaim
      Normal   Scheduled         27s               default-scheduler        Successfully assigned
      Normal   Pulling           4s (x3 over 24s)  kubelet, rahticompios55  pulling image "centos
      Normal   Pulled            2s (x3 over 21s)  kubelet, rahticompios55  Successfully pulled i
      Normal   Created           2s (x3 over 21s)  kubelet, rahticompios55  Created container
      Normal   Started           1s (x3 over 21s)  kubelet, rahticompios55  Started container
      Warning  BackOff           1s (x3 over 17s)  kubelet, rahticompios55  Back-off restarting f
The command specification
- Defines the ENTRYPOINT of the container (overriding the one set in the image)
- Each entry of the command array is the n-th argument. Here:
  - "sh" is the 0th argument (the command to execute)
  - "-c" is the 1st argument
  - "(tail -f /dev/null)" is the 2nd argument
- Equivalent to:

    sh -c '(tail -f /dev/null)'

- Parentheses start a sub-shell so tail won't run with process ID 1
- Edit the Pod definition and replace it. Does it work now?

    apiVersion: v1
    kind: Pod
    metadata:
      name: simple
      labels:
        job: analyze
    spec:
      volumes:
      - name: volumea
        persistentVolumeClaim:
          claimName: pvca
      containers:
      - name: containera
        image: centos:7
        volumeMounts:
        - mountPath: /data
          name: volumea
        command:
        - sh
        - -c
        - (tail -f /dev/null)
Debug and file transfer
See if the Pod is up and running:

    $ oc get pod
    NAME      READY     STATUS    RESTARTS   AGE
    simple    1/1       Running   0          1m

Open a remote shell connection to the container and create a file in the persistent volume mount:

    $ oc rsh simple
    sh-4.2$ echo test > /data/testfile
    sh-4.2$ exit

Copy testfile to the local machine:

    $ oc rsync simple:/data/testfile ./
    WARNING: cannot use rsync: rsync not available in container
    testfile
A typical pattern in the command specification
    command:
    - sh
    - -c
    - >
      echo first command &&
      echo second command &&
      echo third command

is equivalent to

    sh -c 'echo first command && echo second command && echo third command'
Run-once-Pod
- OpenShift will try to run the Pod again if it exits
- Use "restartPolicy: Never" in the Pod spec for run-once containers
- Output of the Pod:

    $ oc logs simple
    Short payload

    apiVersion: v1
    kind: Pod
    metadata:
      name: simple
      labels:
        job: analyze
    spec:
      restartPolicy: Never
      volumes:
      - name: volumea
        persistentVolumeClaim:
          claimName: pvca
      containers:
      - name: containera
        image: centos:7
        volumeMounts:
        - mountPath: /data
          name: volumea
        command:
        - sh
        - -c
        - echo Short payload
The oc tool
    Syntax                                        Meaning
    oc create -f <fileName>                       Create object from file
    oc replace [--force] -f <fileName>            Replace object with <fileName>. Use --force to delete old first.
    oc delete <objectType> <objectName>           Delete object from cluster
    oc describe <objectType> <objectName>         Display object status
    oc logs <podName> [-c <containerName>]        Output stdout of pod <podName>, optionally that of container <containerName>

Get help on <command>: oc <command> --help

    Syntax                                        Meaning
    oc status                                     Display current OpenShift top level project status
    oc explain <objectType>.<field>.<subField>    Print out documentation
    oc rsync <pod>:<filePath> <localPath>         Copy data from/to pod to/from local filesystem
    oc rsh <pod>                                  Remote shell to container
    oc projects                                   Show projects
    oc project <name>                             Switch to project <name>
    oc new-project <name>                         Create new OpenShift project <name>
    oc get <objectType> [<objectName>]            Get object information
OpenShift Project concept
- OpenShift "projects" are namespaces for the Kubernetes objects
- Within a single namespace there can be only one object with a given type and name combination
- By default objects are not visible across projects
- Controlled with the following commands:

    Syntax                    Meaning
    oc projects               Show projects
    oc project <name>         Switch to project <name>
    oc new-project <name>     Create new OpenShift project <name>
Summary
- Rahti runs Docker containers
- Containers are ephemeral
  - Persistent storage must be requested and mounted to containers
- Pod is the OpenShift object that manages containers
- OpenShift objects are created from YAML files with oc create -f <filename>
- Define command in the container specification to run a specific binary
  - Otherwise the image's default command is executed
- Projects/namespaces isolate applications
Exercises
Exercises are located at github.com/CSCfi/rahti-bioweek-2019.
Part 3: More OpenShift
Short security guide
- Always review your container images
  - Use curated image sources (e.g. biocontainers.pro)
- For servers, use IP whitelisting when possible
- Keep web services short-lived if possible
- Remember that you are the webmaster
Resource requests and limits
- Use spec.containers.resources to specify how much resources to reserve
  - Fractions of CPU are possible: "200 millicores"
  - Memory: M for megabytes, G for gigabytes
- Define only a limit → Kubernetes assigns an equal request
- Billing is done according to requests
- If no request/limit is in place, default ones are chosen (request / limit):
  - CPU: 100 millicores / 2 cores
  - Memory: 200 MiB / 8 GiB

    apiVersion: v1
    kind: Pod
    metadata:
      name: simple
      labels:
        job: analyze
    spec:
      restartPolicy: Never
      containers:
      - name: containera
        image: centos:7
        command: ["sh", "-c", "echo Hello"]
        resources:
          requests:
            memory: "200M"
            cpu: "200m"
          limits:
            memory: "200M"
            cpu: "200m"
Running server jobs in Rahti
Problem
1. The Pod IP addresses are subject to change
2. Pods are shut down if the node running them fails
3. The Pod IPs are not visible to internet
4. Load balancing
5. Updating pods causes service downtime
6. Passwords or private keys in container images
7. Container image per configuration

Solution
1. Service or StatefulSet
2. Controllers
3. Route (OpenShift)
4. Service
5. DeploymentConfig (OpenShift) or Deployments
6. Secret
7. ConfigMap
Service
Solves problems 1 and 4: Pod IPs need to be tracked and traffic needs load-balancing
Route
Works around problem 3, "The Pod IPs are not visible to internet", by routing HTTP traffic to a Service object in the cluster.
- Automatic route with hostnames <hostname>.rahtiapp.fi
  - TLS encryption provided via the rahtiapp.fi domain
- Any hostname is possible, but the user then needs to take care of DNS and CA certificates
- Supports IP whitelisting

[Figure: Pods 1, 2 and 3 (each with Container 1 and label app: "foo") at 10.0.0.1:8080, 10.0.0.2:8080 and 10.0.0.3:8080. A Service (name: serve, selector: app: "foo") at IP 10.10.0.1:8081 with local DNS name serve.namespace... sits in front of them. A Route (hostname: myserve.rahtiapp.fi) connects the internet to the Service inside Rahti.]
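The Service and Route from the diagram could be written roughly as follows. This is a minimal sketch, not taken from the original slides: the object names, ports and hostname follow the diagram, while the Route name `myserve`, the edge TLS termination and the whitelisted address range `192.0.2.0/24` are assumptions chosen for illustration.

```yaml
# Service: tracks all pods labelled app=foo and load-balances between them
apiVersion: v1
kind: Service
metadata:
  name: serve
spec:
  selector:
    app: foo            # matches the pod labels in the diagram
  ports:
  - port: 8081          # port exposed by the Service
    targetPort: 8080    # port the containers listen on
---
# Route (OpenShift): exposes the Service to the internet over HTTP(S)
apiVersion: route.openshift.io/v1
kind: Route
metadata:
  name: myserve         # hypothetical name
  annotations:
    # Optional IP whitelisting; the range below is a placeholder
    haproxy.router.openshift.io/ip_whitelist: 192.0.2.0/24
spec:
  host: myserve.rahtiapp.fi
  to:
    kind: Service
    name: serve
  port:
    targetPort: 8080    # target port on the pods behind the Service
  tls:
    termination: edge   # assumption: TLS is terminated at the router
```

Both objects can be kept in one file separated by `---` and submitted with `oc create -f`.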
Controllers
- Controllers are a class of objects that start pods according to specific rules
- Pod definitions are in the controllers' spec as a template
- ReplicaSet, ReplicationController, Deployments, DeploymentConfig, StatefulSet, CronJob, ...
- They control pods
- They all solve problem 2: "Pods are shut down if the node running them fails"
ReplicationController & ReplicaSet
Controllers that keep matching pods alive.

[Figure: a ReplicationController (name: web, replicas: n, selector: {"app": "foo"}) keeping alive the pods web-v83k2, web-mf42f and web-3mlg2, each with Container 1 and label app: "foo".]
StatefulSet
Controller that keeps unique pods alive and always gives them stable, unique names. Hostnames: web-0..n.

[Figure: a StatefulSet (name: web, replicas: n) keeping alive the pods web-0, web-1, ..., web-n, each with Container 1 and label app: "foo".]
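A StatefulSet like the one in the diagram might look like the sketch below. The names `web` and `app: foo` come from the slide. The replica count, the port, the container name, the image and its command are assumptions for illustration. A StatefulSet refers to a governing (headless) Service through `serviceName`, so one is included.

```yaml
# Headless Service that gives the StatefulSet pods stable DNS names
apiVersion: v1
kind: Service
metadata:
  name: web
spec:
  clusterIP: None       # headless: no load-balanced cluster IP
  selector:
    app: foo
  ports:
  - port: 8080          # assumed port
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: web
spec:
  serviceName: web      # the headless Service above
  replicas: 3           # assumed; pods are named web-0, web-1, web-2
  selector:
    matchLabels:
      app: foo
  template:             # Pod definition used for every replica
    metadata:
      labels:
        app: foo
    spec:
      containers:
      - name: container1
        image: centos:7
        command: ["sh", "-c", "(tail -f /dev/null)"]   # placeholder workload
```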
Deployments / DeploymentConfig
- Create ReplicaSets (Deployments) and ReplicationControllers (DeploymentConfig)
- Automate application upgrades
- DeploymentConfigs can watch for new container images appearing in the internal image registry
- Deployment is a generic Kubernetes object and DeploymentConfig is an OpenShift extension
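As an illustration of the generic Kubernetes variant, a Deployment could be sketched as below. All concrete values (name, labels, replica count, image and command) are assumptions that reuse the examples from the earlier slides. The Deployment creates a ReplicaSet from the Pod template, and changing the template triggers a rolling upgrade.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web
spec:
  replicas: 3                 # assumed replica count
  strategy:
    type: RollingUpdate       # replace pods gradually on upgrade
  selector:
    matchLabels:
      app: foo
  template:                   # Pod definition, as in the Pod examples
    metadata:
      labels:
        app: foo
    spec:
      containers:
      - name: container1
        image: centos:7
        command: ["sh", "-c", "(tail -f /dev/null)"]   # placeholder workload
```

Editing the image or command in the template and running `oc replace -f` with the updated file should roll the pods over to the new version.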
Secrets and ConfigMaps
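Secret and ConfigMap contents can be brought into a container as
- Environment variables
- Volume mounts: files are created according to the keys, and their contents are populated according to the values

[Figure: a Pod with containera and the volumes confmapvol (mounted at /etc/config) and secretvol (mounted at /etc/secrets). Entries of a ConfigMap and a Secret, such as foo: "bar" and foz: "baz", appear as files in the mounts. A ConfigMap entry FOB: "bar" becomes the environment variable FOB=bar.]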
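The diagram could translate into objects along these lines. This is a sketch under assumptions: the volume names, mount paths and key-value pairs follow the slide, whereas the object names `exampleconfig`, `examplesecret` and `configured`, and the container command, are hypothetical.

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: exampleconfig         # hypothetical name
data:
  foo: "bar"                  # becomes the file /etc/config/foo
  FOB: "bar"                  # also used as an environment variable below
---
apiVersion: v1
kind: Secret
metadata:
  name: examplesecret         # hypothetical name
stringData:                   # plain-text input, stored base64-encoded
  foz: "baz"                  # becomes the file /etc/secrets/foz
---
apiVersion: v1
kind: Pod
metadata:
  name: configured            # hypothetical name
spec:
  restartPolicy: Never
  volumes:
  - name: confmapvol
    configMap:
      name: exampleconfig
  - name: secretvol
    secret:
      secretName: examplesecret
  containers:
  - name: containera
    image: centos:7
    # Print the variable, show the ConfigMap file, list the secret file names
    command: ["sh", "-c", "echo $FOB && cat /etc/config/foo && ls /etc/secrets"]
    env:
    - name: FOB
      valueFrom:
        configMapKeyRef:
          name: exampleconfig
          key: FOB
    volumeMounts:
    - name: confmapvol
      mountPath: /etc/config
    - name: secretvol
      mountPath: /etc/secrets
```

Keeping configuration and credentials in these objects means the same container image can be reused across configurations, which addresses problems 6 and 7.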
And much more!
- ImageStream (OpenShift): watching image updates in the internal registry
- BuildConfig (OpenShift): building images in Rahti
- CronJob: running a Pod at specific times
- initContainer: run-to-completion container executed before the other containers in a Pod
- HorizontalPodAutoscaler: starts up new pods according to server load
- Template: collects a number of objects into parametrizable lists
Rahti links
- The Rahti main page for end user documentation
- You need a CSC computing project to access Rahti: Instructions for getting access
- No cost for you when you use Rahti for open research and education
- rahti-support@csc.fi for support
- Public roadmap
- External documentation: Kubernetes documentation, OpenShift documentation
Presentation links
- This presentation: https://object.pouta.csc.fi/rahti-bioweek/slides.pdf
- Exercises: https://rahti-bioweek-2019.rahtiapp.fi/
- Github: https://github.com/CSCfi/rahti-bioweek-2019
- exercises.pdf: https://object.pouta.csc.fi/rahti-bioweek/exercises.pdf
- exercises.tar.gz: https://object.pouta.csc.fi/rahti-bioweek/exercises.tar.gz