Virtualization (instructor: Peter Baumann)



SLIDE 1

340151 Big Databases & Cloud Services (P. Baumann)

Virtualization

instructor: Peter Baumann
email: p.baumann@jacobs-university.de
tel: -3178
office: Research 1, room 88

“It was much nicer before people started storing all their data in the Cloud.”

SLIDE 2

Hardware Scalability

  • Vertical scaling: expand the machine
  • Horizontal scaling: add more (smaller) machines

...

SLIDE 3

Vertical Scaling: Supercomputer

[computerhistory.org]

SLIDE 4

Horizontal Scaling: Cluster

  • Goal: more compute power and fault tolerance at low cost
  • Commodity hardware
  • Approach: horizontal scalability
  • cluster = (loosely or tightly) connected computers working together, appearing as a single system
  • each node performs the same task
  • clustering middleware = software controlling & scheduling the nodes
  • Related
  • Amdahl’s Law: predicts the theoretical speedup when using multiple processors
  • more recently: PlayStation clusters, Xbox clusters
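
Amdahl's Law can be made concrete with a small calculation. The sketch below assumes the usual formulation, speedup = 1 / ((1 - p) + p / n), where p is the fraction of the work that can be parallelized and n is the number of processors. The function name and the sample values are illustrative and not taken from the slides.

```python
def amdahl_speedup(parallel_fraction: float, processors: int) -> float:
    """Theoretical speedup under Amdahl's Law (illustrative helper)."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be in [0, 1]")
    if processors < 1:
        raise ValueError("processors must be >= 1")
    # The serial part cannot be sped up; only the parallel part is divided
    # across the available processors.
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / processors)


if __name__ == "__main__":
    # Assumed workload: 95% of the runtime is parallelizable.
    for n in (1, 10, 100, 1000):
        print(n, round(amdahl_speedup(0.95, n), 2))
```

With 95% of the work parallelizable, the speedup approaches but never exceeds 1 / (1 - 0.95) = 20, no matter how many nodes are added to the cluster.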

SLIDE 5

Horizontal Scaling: Beowulf Cluster

[Hoffman & Hargrove, ORNL]

SLIDE 6

Horizontal Scaling: Supercomputers Today

Sunway TaihuLight: 10,649,600 cores in 40,960 nodes; 1.31 PB RAM; 93 PFlop/s

[top500.org / Natl Supercomputing Center, Wuxi, China]

SLIDE 7

Virtualization

  • Problem: just-in-time resource provisioning
  • Approach:
  • Outsourcing to a service provider
  • Virtual Machine (VM) to share computer resources on demand
  • IaaS, PaaS, SaaS, ...
  • Many commercial providers
  • including Amazon AWS, Microsoft Azure, T-Systems, ... down to local providers

[rackspace.com]

SLIDE 8

Virtual Machines

  • Virtual Machine (VM) = computer application resembling a complete “computer”
  • Host system running 1..* guest systems
  • Technically:
  • application invokes guest OS services
  • guest OS calls are intercepted and forwarded to the host OS
  • host OS fulfills the request
  • Hypervisor = virtual machine monitor
  • resource orchestration (VM start, operation, stop)
  • running on the host
  • Data can be local or mounted from remote storage (ex: SAN)
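
The call path described above (application, guest OS, interception, host OS) can be sketched as a toy model. This is only an illustration of the control flow, not how a real hypervisor is built: the class names, the `syscall`/`intercept`/`fulfill` methods, and the in-memory "file store" are all hypothetical.

```python
class HostOS:
    """Toy host: owns the real resources (here, one file store per VM)."""

    def __init__(self):
        self.storage = {}  # vm_name -> {path: data}

    def fulfill(self, vm_name, call, *args):
        # Each VM gets its own namespace, so guests stay isolated.
        files = self.storage.setdefault(vm_name, {})
        if call == "write":
            path, data = args
            files[path] = data
            return len(data)
        if call == "read":
            (path,) = args
            return files[path]
        raise ValueError(f"unsupported call: {call}")


class Hypervisor:
    """Toy virtual machine monitor: starts/stops VMs, forwards their calls."""

    def __init__(self, host):
        self.host = host
        self.running = set()

    def start_vm(self, name):
        self.running.add(name)
        return GuestOS(name, self)

    def stop_vm(self, name):
        self.running.discard(name)

    def intercept(self, vm_name, call, *args):
        # Guest calls never reach the hardware directly; they are
        # intercepted here and forwarded to the host OS.
        if vm_name not in self.running:
            raise RuntimeError(f"VM {vm_name!r} is not running")
        return self.host.fulfill(vm_name, call, *args)


class GuestOS:
    """Toy guest: applications call syscall(), which the hypervisor traps."""

    def __init__(self, name, hypervisor):
        self.name = name
        self.hypervisor = hypervisor

    def syscall(self, call, *args):
        return self.hypervisor.intercept(self.name, call, *args)


host = HostOS()
hypervisor = Hypervisor(host)
vm_a = hypervisor.start_vm("vm-a")
vm_b = hypervisor.start_vm("vm-b")
vm_a.syscall("write", "/tmp/note", "hello from A")
vm_b.syscall("write", "/tmp/note", "hello from B")
note_a = vm_a.syscall("read", "/tmp/note")
```

Both guests write to the same path, yet each reads back only its own data, which mirrors the idea of one host running several isolated guest systems.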

[Dataveneta]

SLIDE 9

Virtual Machines & Containers

  • Problem: large overhead of Virtual Machines
  • Launch time ~1 min
  • Oversized: most libraries, tools, etc. are never needed
  • Costly updates
  • Approach: Containerization = operating-system-level virtualization = OS feature where the kernel allows multiple isolated user-space instances
  • called containers, partitions, virtualization engines (VEs), chroot jails, …
  • Ex: Docker
  • high-level API providing lightweight containers that run processes in isolation
  • [Solomon Hykes, Andrea Luzzardi, Francois-Xavier Bourlet et al]
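
As a rough illustration of running a process in an isolated container, the sketch below builds a `docker run` command line from Python. It assumes the Docker CLI is installed on the machine; the helper names `docker_run_argv` and `run_in_container`, as well as the image and resource limits shown in the usage note, are hypothetical choices for this example.

```python
import shutil
import subprocess


def docker_run_argv(image, command, memory=None, cpus=None):
    """Build the argument list for a one-off, auto-removed container."""
    argv = ["docker", "run", "--rm"]  # --rm deletes the container on exit
    if memory is not None:
        argv += ["--memory", memory]  # cap on container memory, e.g. "64m"
    if cpus is not None:
        argv += ["--cpus", str(cpus)]  # cap on CPU share, e.g. 0.5
    return argv + [image] + list(command)


def run_in_container(image, command, **limits):
    """Run the command in a container; return stdout, or None without Docker."""
    if shutil.which("docker") is None:
        return None  # assumption: silently skip when the CLI is missing
    argv = docker_run_argv(image, command, **limits)
    completed = subprocess.run(argv, capture_output=True, text=True, check=True)
    return completed.stdout
```

For instance, `run_in_container("alpine", ["echo", "hello"], memory="64m", cpus=0.5)` would start a small container, run one isolated process in it, and discard the container afterwards, provided a Docker daemon is reachable.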

SLIDE 10

Kubernetes

  • automates deployment, scaling, and management of containerized applications
  • groups containers that make up an application into logical units
  • easy management & discovery
  • Open source by Google: kubernetes.io
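
To give an idea of how an application is described to Kubernetes, the sketch below assembles a minimal Deployment manifest as a Python dictionary and serializes it to JSON. The helper name `deployment_manifest`, the application name `web`, and the image `nginx:1.25` are assumptions made for this example only.

```python
import json


def deployment_manifest(name, image, replicas=3):
    """Build a minimal apps/v1 Deployment manifest (illustrative helper)."""
    labels = {"app": name}
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name, "labels": labels},
        "spec": {
            # Kubernetes keeps this many copies of the pod running.
            "replicas": replicas,
            # The selector must match the labels of the pod template.
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {"containers": [{"name": name, "image": image}]},
            },
        },
    }


manifest = deployment_manifest("web", "nginx:1.25", replicas=3)
manifest_json = json.dumps(manifest, indent=2)
```

Saved to a file, such a manifest could be submitted with `kubectl apply -f`; scaling the application then amounts to changing `replicas`.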

[blog.newsrelic.com]

SLIDE 11

Dask

  • parallelism for Python analytics, enabling performance at scale
  • Dynamic task scheduling
  • “Big Data” collections for larger-than-memory / distributed environments
  • Open source: dask.org
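
Dynamic task scheduling can be illustrated with `dask.delayed`, which records function calls as a task graph and executes them only when `compute()` is called. This is a minimal sketch assuming the `dask` package is installed; the `square` function and the input range are made up for the example.

```python
from dask import delayed


def square(x):
    return x * x


# Nothing runs yet: each call only adds a node to the task graph.
tasks = [delayed(square)(i) for i in range(5)]
total = delayed(sum)(tasks)

# compute() hands the graph to the scheduler, which may run the
# independent square() tasks in parallel before summing the results.
result = total.compute()
```

The same graph can be executed on a single laptop or on a distributed cluster, which is what lets Dask scale Python analytics without rewriting the computation.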