CS 744: Big Data Systems Shivaram Venkataraman Fall 2018 - - PowerPoint PPT Presentation




SLIDE 1

CS 744: Big Data Systems

Shivaram Venkataraman Fall 2018

SLIDE 2

ADMINISTRIVIA

  • Assignment 1: Due Oct 1
  • Sign up for Project meetings
  • Group updates
SLIDE 3

MapReduce GFS BigTable

SLIDE 4

BORG: WORKLOAD

  • Long-running services (should “never” go down)
  • Batch jobs: a few seconds to a few days

SLIDE 5

BORG CONCEPTS

  • Users submit jobs
  • Each job is one or more tasks
  • All tasks in a job run the same program (binary)
  • Each job runs in one Borg cell
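
The job/task/cell relationship above can be sketched as a small data model. This is a minimal illustration, not Borg's actual API: the class names, field names, and the example job are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Task:
    job_name: str
    index: int    # position of this task within its job
    binary: str   # every task of a job runs the same program


@dataclass
class Job:
    name: str
    cell: str         # a job runs in exactly one cell
    binary: str
    task_count: int
    tasks: list = field(init=False)

    def __post_init__(self):
        if self.task_count < 1:
            raise ValueError("a job needs at least one task")
        # All tasks share the job's binary; only the index differs.
        self.tasks = [Task(self.name, i, self.binary)
                      for i in range(self.task_count)]


# Hypothetical job: three identical tasks in one cell.
job = Job(name="web-frontend", cell="cell-a",
          binary="frontend_server", task_count=3)
```

Keeping the binary on the job rather than on each task makes the "same program" rule hold by construction.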

SLIDE 6

JOB DESCRIPTION

SLIDE 7

JOB PROPERTIES

Name, Constraints, Properties

  • Resource requirements
  • No slots!
  • Static Binaries
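
One way to see why "no slots" matters: with fine-grained resource requirements, small tasks pack by their actual size, whereas a slot-based scheduler rounds each task up to a whole slot. The sketch below is an assumption-laden toy; the units (milli-cores, MiB), the slot size, and all names are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Resources:
    milli_cpu: int   # thousandths of a core (assumed unit)
    ram_mib: int     # mebibytes (assumed unit)


def fits_fine_grained(requests, machine):
    """True if the summed requests fit on the machine in every dimension."""
    cpu = sum(r.milli_cpu for r in requests)
    ram = sum(r.ram_mib for r in requests)
    return cpu <= machine.milli_cpu and ram <= machine.ram_mib


def slots_needed(request, slot):
    """Slots a request would occupy if rounded up to whole slots."""
    return max(math.ceil(request.milli_cpu / slot.milli_cpu),
               math.ceil(request.ram_mib / slot.ram_mib))


machine = Resources(milli_cpu=4000, ram_mib=16384)
slot = Resources(milli_cpu=1000, ram_mib=4096)   # machine split into 4 slots
requests = [Resources(milli_cpu=500, ram_mib=1024)] * 6

# Six half-core tasks need 3000 milli-cores and 6144 MiB in total.
fine_grained_ok = fits_fine_grained(requests, machine)
# With slots, each task is rounded up to one full slot.
total_slots = sum(slots_needed(r, slot) for r in requests)
```

Here the six tasks fit on the machine when sized exactly, but would need six slots on a machine that only has four.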
SLIDE 8

JOB LIFECYCLE

SLIDE 9

QUOTAS, PRIORITIES, BNS

Priority
  • High priority can preempt lower priority

Quotas
  • Used for admission control
  • Infinite quota at priority zero

Service discovery using BNS
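
The interplay of quota and priority can be sketched as two separate checks: quota decides whether a job is admitted at all, and priority decides whose tasks give way once it is. This is a simplified illustration, not Borg's implementation: a single CPU dimension, per-(user, priority) quota keys, and the "larger number means higher priority" convention are all assumptions.

```python
from dataclasses import dataclass


@dataclass
class JobRequest:
    user: str
    priority: int   # larger number = higher priority (assumed convention)
    cpu: int


def admit(job, quota, used):
    """Admission control: reject a job that would exceed the user's quota.

    quota maps (user, priority) to the CPU allowed; used maps the same key
    to CPU already granted. Priority zero is treated as infinite quota.
    """
    if job.priority == 0:
        return True
    key = (job.user, job.priority)
    return used.get(key, 0) + job.cpu <= quota.get(key, 0)


def pick_victims(running, incoming, free_cpu):
    """Choose lower-priority jobs to preempt so the incoming job fits.

    Returns the victims, lowest priority first, or None if preempting every
    lower-priority job would still not free enough CPU.
    """
    victims = []
    for job in sorted(running, key=lambda j: j.priority):
        if free_cpu >= incoming.cpu:
            break
        if job.priority < incoming.priority:
            victims.append(job)
            free_cpu += job.cpu
    return victims if free_cpu >= incoming.cpu else None
```

Note that admission does not guarantee placement: a priority-zero job is always admitted under this sketch, yet it can never preempt anything and is the first candidate for preemption itself.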

SLIDE 10

ARCHITECTURE

SLIDE 11

MASTER, BORGLET

Borgmaster
  • Single leader, five-way replicated
  • Paxos group, using Chubby locks

Borglet
  • Daemon on each machine
  • Borgmaster pulls updates from Borglets
  • Health checks used to detect failures
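
The pull model can be sketched as a polling loop in which the master asks each per-machine daemon for its state and marks a machine down after several consecutive missed polls. The class names, the `report` method, and the missed-poll threshold are hypothetical stand-ins, not Borg's real interfaces.

```python
class BorgletStub:
    """Hypothetical stand-in for the per-machine daemon."""

    def __init__(self, machine, healthy=True):
        self.machine = machine
        self.healthy = healthy

    def report(self):
        # A real daemon would answer over RPC; here an unreachable
        # machine is simulated by raising ConnectionError.
        if not self.healthy:
            raise ConnectionError(f"{self.machine} did not respond")
        return {"machine": self.machine, "tasks": []}


class MasterStub:
    """Hypothetical master that pulls state and counts missed polls."""

    def __init__(self, borglets, max_missed=3):
        self.borglets = borglets
        self.max_missed = max_missed   # assumed threshold
        self.missed = {b.machine: 0 for b in borglets}
        self.down = set()

    def poll_once(self):
        for b in self.borglets:
            try:
                b.report()
                self.missed[b.machine] = 0
                self.down.discard(b.machine)
            except ConnectionError:
                self.missed[b.machine] += 1
                if self.missed[b.machine] >= self.max_missed:
                    # Tasks on this machine would be rescheduled elsewhere.
                    self.down.add(b.machine)
```

Because the master initiates every exchange, it controls the communication rate; a push model would let thousands of daemons set the load on the master instead.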

SLIDE 12

SCHEDULER

  • Feasibility checking pass, Scoring pass
  • Task cache (static binaries)
  • Scalability
      • Split master into multiple processes
      • Use replicas for communication
      • Randomize machines used for scoring
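
The two scheduling passes and the randomization can be sketched together: examine machines in random order, keep those that pass the feasibility check, stop once enough candidates have been found, and pick the best-scoring one. The scoring rule here (best fit on leftover resources) and all names are assumptions chosen for illustration, not the scoring Borg actually uses.

```python
import random
from dataclasses import dataclass


@dataclass
class Machine:
    name: str
    free_cpu: int
    free_ram: int
    attrs: frozenset = frozenset()


@dataclass
class TaskRequest:
    cpu: int
    ram: int
    required_attrs: frozenset = frozenset()


def feasible(machine, task):
    """Feasibility pass: enough free resources and all constraints met."""
    return (machine.free_cpu >= task.cpu
            and machine.free_ram >= task.ram
            and task.required_attrs <= machine.attrs)


def score(machine, task):
    # Assumed toy rule: prefer the tightest fit (least leftover resources).
    return -((machine.free_cpu - task.cpu) + (machine.free_ram - task.ram))


def schedule(task, machines, sample_size, rng=random):
    """Scoring pass over a random sample of feasible machines.

    Stops examining machines once sample_size feasible ones are found,
    so the cost does not grow with the size of the cell.
    """
    candidates = []
    for m in rng.sample(machines, len(machines)):   # random order
        if feasible(m, task):
            candidates.append(m)
            if len(candidates) >= sample_size:
                break
    return max(candidates, key=lambda m: score(m, task), default=None)
```

A small `sample_size` trades placement quality for speed: the chosen machine is the best of the sample, not necessarily the best in the cell.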

SLIDE 13

UTILIZATION: CELL COMPACTION
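
The slide gives only the title. Cell compaction, as commonly described, asks how small a cell a given workload could fit into: remove machines and re-pack the workload from scratch until it no longer fits. The sketch below is a toy under loud assumptions: a single resource dimension, a first-fit-decreasing packer standing in for the real scheduler, and machines removed from the end of the list rather than at random.

```python
def first_fit(tasks, machines):
    """Toy packer: place each task, largest first, on the first machine
    with enough free capacity. Returns True if every task is placed."""
    free = list(machines)
    for need in sorted(tasks, reverse=True):
        for i, cap in enumerate(free):
            if cap >= need:
                free[i] = cap - need
                break
        else:
            return False   # no machine had room for this task
    return True


def compact(tasks, machines):
    """Remove machines one at a time, re-packing from scratch each time,
    until the workload no longer fits. Returns the smallest machine count
    that still held the whole workload."""
    machines = list(machines)
    if not first_fit(tasks, machines):
        raise ValueError("workload does not fit in the full cell")
    while len(machines) > 1 and first_fit(tasks, machines[:-1]):
        machines.pop()
    return len(machines)
```

The resulting machine count is a utilization metric: a policy that lets the same workload compact into fewer machines is using the cell more efficiently.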

SLIDE 14

REQUEST SIZE: NO SWEET SPOT

SLIDE 15

RECLAMATION

SLIDE 16

LESSONS, DISCUSSION

  • Jobs are restrictive, Allocs are useful
  • IP address per container
  • Kernel of distributed operating system
SLIDE 17

QUESTIONS / DISCUSSION?