Potpourri Doug Woos Logistics notes Piazza!!! - - PowerPoint PPT Presentation



slide-1
SLIDE 1

Potpourri

Doug Woos

slide-2
SLIDE 2

Logistics notes

Piazza!!!

https://piazza.com/washington/spring2017/cse452

In-class questions

slide-3
SLIDE 3

Outline

  • More Go
  • Remote procedure calls
  • MapReduce discussion
slide-4
SLIDE 4

More Go

Hopefully you got the basics from section
Today:

  • Doug’s Go tips
  • Synchronization
  • Remote procedure calls
slide-5
SLIDE 5

Goroutines

Lightweight (“green”) threads
Multiplexed onto $GOMAXPROCS OS threads
If a goroutine blocks its OS thread, the runtime starts another OS thread
Convenient syntax: if you realize you want to do something async, just add “go”
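To make the "just add go" point concrete, here is a minimal sketch. The function name squareAll is made up for illustration. It starts one goroutine per input and uses sync.WaitGroup to wait for all of them, since a bare "go" statement gives no way to know when the work has finished.

```go
package main

import (
	"fmt"
	"sync"
)

// squareAll (a hypothetical helper) squares each input in its own goroutine.
func squareAll(inputs []int) []int {
	results := make([]int, len(inputs))
	var wg sync.WaitGroup
	for i, n := range inputs {
		wg.Add(1)
		go func(i, n int) {
			defer wg.Done()
			results[i] = n * n // each goroutine writes only its own slot
		}(i, n)
	}
	wg.Wait() // block until every goroutine has called Done
	return results
}

func main() {
	fmt.Println(squareAll([]int{1, 2, 3, 4}))
}
```

Each goroutine writes a distinct slice element, so no lock is needed here. Shared data that several goroutines update would need a mutex or a channel.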

slide-6
SLIDE 6

If/else

This is wrong:

if x > 0 {
    // something
}
else { // syntax error: else must be on the same line as the closing brace
    // something else
}

This is right:

if x > 0 {
    // something
} else {
    // something else
}

slide-7
SLIDE 7

Anonymous functions

Handy when using goroutines

go func() {
    // do some work
}()

But: careful with arguments
What does this do?

for _, val := range values {
    go func() {
        fmt.Println(val)
    }()
}

slide-8
SLIDE 8

Anonymous functions

Handy when using goroutines

go func() {
    // do some work
}()

But: careful with arguments
What does this do?

// assuming values is a []string
for _, val := range values {
    go func(val string) {
        fmt.Println(val)
    }(val)
}
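The difference between the two loops can be sketched as a runnable program. Before Go 1.22, the closure that takes no argument shares a single loop variable across iterations, so the goroutines may all observe a later value. From Go 1.22 on (for modules that declare that version), each iteration gets its own variable. Passing the value as an argument behaves the same on every version. The helper name collect and the []string element type are assumptions made for this example.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// collect (hypothetical) hands each value to its goroutine as an argument,
// so every goroutine works on its own copy.
func collect(values []string) []string {
	var (
		mu  sync.Mutex
		out []string
		wg  sync.WaitGroup
	)
	for _, val := range values {
		wg.Add(1)
		go func(v string) {
			defer wg.Done()
			mu.Lock()
			defer mu.Unlock()
			out = append(out, v)
		}(val)
	}
	wg.Wait()
	sort.Strings(out) // goroutine order is not deterministic, so sort
	return out
}

func main() {
	fmt.Println(collect([]string{"c", "a", "b"}))
}
```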

slide-9
SLIDE 9

Communicating Sequential Processes

Hoare’s model for concurrency
Locks (monitors): multiple threads access data, making sure to acquire the lock
CSP: one thread accesses data, other threads communicate via channels
Use either, but not both for the same data
For this lab, just use channels
Subsequent labs are built around locks
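A minimal sketch of the CSP style: one goroutine owns the running total, and every other goroutine only sends to it over a channel. The names runCounter and sumConcurrently are made up for this example. The WaitGroup is used only to know when all senders are finished, not to protect the data.

```go
package main

import (
	"fmt"
	"sync"
)

// runCounter is the only goroutine that touches sum.
func runCounter(increments <-chan int, total chan<- int) {
	sum := 0
	for n := range increments {
		sum += n
	}
	total <- sum // report once the increments channel is closed
}

// sumConcurrently (hypothetical) sends each part from its own goroutine.
func sumConcurrently(parts []int) int {
	increments := make(chan int)
	total := make(chan int)
	go runCounter(increments, total)

	var wg sync.WaitGroup
	for _, p := range parts {
		wg.Add(1)
		go func(p int) {
			defer wg.Done()
			increments <- p
		}(p)
	}
	wg.Wait()
	close(increments) // no more senders; lets runCounter finish
	return <-total
}

func main() {
	fmt.Println(sumConcurrently([]int{1, 2, 3, 4}))
}
```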

slide-10
SLIDE 10

Locking

Mutexes in the “sync” library: sync.Mutex

import "sync"

type Data struct {
    mu sync.Mutex
}

func (d *Data) accessData(…) {
    d.mu.Lock()
    defer d.mu.Unlock()
}

Advice: develop and follow a coherent system
Lock at top level, require subroutines to be called with lock held (and add comments to that effect)
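One way to follow that advice, sketched under assumptions: the Worker fields and the method names FinishJob and recordJobLocked are invented for illustration. The exported entry point takes the lock once, and the helper documents that it expects the lock to be held already.

```go
package main

import (
	"fmt"
	"sync"
)

type Worker struct {
	mu   sync.Mutex
	jobs int // guarded by mu
}

// FinishJob is a top-level entry point: it acquires the lock exactly once.
func (wk *Worker) FinishJob() {
	wk.mu.Lock()
	defer wk.mu.Unlock()
	wk.recordJobLocked()
}

// recordJobLocked must be called with wk.mu held.
func (wk *Worker) recordJobLocked() {
	wk.jobs++
}

// runJobs (hypothetical) calls FinishJob from n goroutines and returns the count.
func runJobs(n int) int {
	wk := &Worker{}
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			wk.FinishJob()
		}()
	}
	wg.Wait()
	wk.mu.Lock()
	defer wk.mu.Unlock()
	return wk.jobs
}

func main() {
	fmt.Println(runJobs(100))
}
```

sync.Mutex is not reentrant, so a helper that tried to lock again would deadlock. That is why the convention of "caller holds the lock" is worth writing down in comments.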

slide-11
SLIDE 11

Remote procedure calls

Request from a client to execute a function on a server
Basic communication technique
Today: basic concepts, usage in lab 1
Next time: RPC semantics in detail

slide-12
SLIDE 12

Remote procedure calls

Differences between RPC and local call

  • Need to bind to server (like linking)
  • Performance
  • Failures: message drop, client crash, server crash, slowness

slide-13
SLIDE 13

RPC implementation

Client: ok := call(address, "Worker.DoJob", args, &reply)

  RPC library: serialize args, open connection, write data, read data, deserialize reply
  Transport (OS): TCP/IP write, TCP/IP read

Server: func (wk *Worker) DoJob(args *DJArgs, reply *DJReply)

  RPC library: read data, deserialize args, serialize reply, write data
  Transport (OS): TCP/IP read, TCP/IP write

Transport: CSE 461

slide-14
SLIDE 14

RPC in Labs

Go “rpc” library
We wrap it in some convenience functions
You won’t have to manually register RPCs
Important later: interface{} works fine
Capitalization weirdly important

  • Capitalized fields on structs sent
  • Capitalized methods registered as RPCs
slide-15
SLIDE 15

Go RPCs: Server-side

RPCs take two args and return an error (or nil)

func Funcname(arg *FuncArgs, reply *FuncReply) error

(You can’t get the error, so just return nil)
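For reference, a minimal sketch of what a handler looks like when registered directly with Go's standard net/rpc package, without the lab's convenience wrappers. The Echo method and its argument types are invented for illustration. The connection is an in-memory net.Pipe so the example runs without a network. Note that net/rpc expects an exported method on a registered receiver, with exported fields on the argument and reply structs.

```go
package main

import (
	"fmt"
	"net"
	"net/rpc"
)

// Exported fields are required so the gob encoder can send them.
type EchoArgs struct{ Text string }
type EchoReply struct{ Text string }

type Worker struct{}

// Echo has the shape net/rpc requires: two args, second a pointer, returns error.
func (wk *Worker) Echo(args *EchoArgs, reply *EchoReply) error {
	reply.Text = "echo: " + args.Text
	return nil
}

// roundTrip (hypothetical) serves one connection and makes one call over it.
func roundTrip(text string) (string, error) {
	server := rpc.NewServer()
	if err := server.Register(&Worker{}); err != nil {
		return "", err
	}
	clientConn, serverConn := net.Pipe()
	go server.ServeConn(serverConn)

	client := rpc.NewClient(clientConn)
	defer client.Close()

	var reply EchoReply
	err := client.Call("Worker.Echo", &EchoArgs{Text: text}, &reply)
	return reply.Text, err
}

func main() {
	fmt.Println(roundTrip("hello"))
}
```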

slide-16
SLIDE 16

Go RPCs: Client-side

call function

ok := call(address, "Type.Method", args, &reply)

Returns a bool
If ok is false, did the call happen?

  • For this lab, assume no
  • In future labs, ???
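A rough sketch of what such a wrapper might look like, built on the standard net/rpc client. This is an assumption for illustration, not the lab's actual code. In particular, dialing a Unix socket per call is a guess about the transport. It shows why the bool is ambiguous: false can mean the connection never opened, or that the request went out and the reply was lost.

```go
package main

import (
	"fmt"
	"net/rpc"
)

// call is a hypothetical stand-in for the lab's wrapper. It collapses every
// failure into false, so the caller cannot tell whether the server ran the method.
func call(address, method string, args, reply interface{}) bool {
	client, err := rpc.Dial("unix", address) // assumed transport
	if err != nil {
		return false // never connected: the call did not happen
	}
	defer client.Close()
	// A failure here may occur before or after the server executed the method.
	return client.Call(method, args, reply) == nil
}

func main() {
	var reply int
	ok := call("/nonexistent-dir-for-demo/worker.sock", "Worker.DoJob", 1, &reply)
	fmt.Println("ok =", ok)
}
```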
slide-17
SLIDE 17

RPCs in Lab 1

Worker and master communicate with each other

Worker->master: registration

func (mr *MapReduce) Register(args *RegisterArgs,
    res *RegisterReply) error

Master->worker: DoJob (map or reduce), Shutdown

func (wk *Worker) DoJob(arg *DoJobArgs,
    res *DoJobReply) error

func (wk *Worker) Shutdown(args *ShutdownArgs,
    res *ShutdownReply) error

slide-18
SLIDE 18

RPCs and Concurrency

Blocking on the client

  • MapReduce master has multiple outstanding jobs

Need thread per worker or thread per RPC
Keep track of which jobs have been done
Only start Reduce tasks once Map tasks done
For part 3: put tasks back on queue if they fail

slide-19
SLIDE 19

RPCs and Concurrency

Concurrent on the server
Not an issue in lab 1
In subsequent labs, need to lock

slide-20
SLIDE 20

MapReduce Discussion

What’s the deal with master failure?
Why is atomic rename important?
Why not store intermediate results in RAM?

  • Apache Spark

Aren’t some Reduce jobs much larger?
What about infinite loops?
Why does novelty matter?

slide-21
SLIDE 21

Since we have some time

I claimed that a Two Generals protocol is impossible
Why?