Remote Procedure Call Arvind Krishnamurthy

Course Logistics • Everyone should have a gitlab account • Let us know if you don’t have one • Make sure you have signed up for Piazza • Lab 1 due next Thursday • Submission through Canvas • Blog post for Friday’s reading • Submission through Canvas

Muddy Foreheads • n children, k get mud on their foreheads • Children sit in circle. • Teacher announces, "Someone has mud on their forehead." • Someone == 1 or more • No one can see their own forehead • k is not “common knowledge”

Muddy Foreheads • n children, k get mud on their foreheads • Children sit in circle. • Teacher announces, "Someone has mud on their forehead." • Teacher repeatedly asks, "Raise your hand if you know you have mud on your forehead." • What happens?

Muddy Foreheads (contd.) • Claim: – The first k-1 times the teacher asks, all children will reply “No” – The k-th time all dirty children will reply “Yes” • Reasoning by considering cases and using induction: – k=1: the child with a muddy forehead will say yes – k=2: let X and Y have muddy foreheads • Each sees exactly one other person with muddy forehead • In round 1, X noticed Y didn’t say “Yes” – Possible only because Y must have seen a child with a muddy forehead ==> X must have mud
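The claim can be checked by brute force over possible worlds. The sketch below is an illustration added here, not part of the original slides; the function names `muddy_rounds`, `knows_muddy`, and `answers` are hypothetical. It assumes every child reasons perfectly and hears every answer: a child knows they are muddy when every world still considered possible, and consistent with what they see, has them muddy.

```python
from itertools import product

def knows_muddy(i, w, worlds):
    # Child i cannot see position i. They know they are muddy if every
    # still-possible world that matches what they see has position i muddy.
    n = len(w)
    candidates = [v for v in worlds
                  if all(v[j] == w[j] for j in range(n) if j != i)]
    return all(v[i] for v in candidates)

def answers(w, worlds):
    # The answer each child would give if w were the real world.
    return tuple(knows_muddy(i, w, worlds) for i in range(len(w)))

def muddy_rounds(muddy):
    """muddy: tuple of bools (True = mud), with at least one True.
    Returns (round in which someone first says yes, set of children saying yes)."""
    n = len(muddy)
    # The teacher's announcement removes the all-clean world for everyone.
    worlds = {w for w in product([False, True], repeat=n) if any(w)}
    for rnd in range(1, n + 1):
        actual = answers(muddy, worlds)
        if any(actual):
            return rnd, {i for i, a in enumerate(actual) if a}
        # Everyone heard "No" from all children: drop every world in which
        # some child would have said "Yes" this round.
        worlds = {w for w in worlds if answers(w, worlds) == actual}
    raise ValueError("expected at least one muddy child")

# Three children, the first two muddy (k = 2).
print(muddy_rounds((True, True, False)))
```

With k = 2 the first round produces only "No" answers, and in the second round exactly the two muddy children answer "Yes", matching the induction argument.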

The Muddy Forehead "Paradox" If k > 1, the teacher didn't say anything anyone didn't already know!

Why Are Distributed Systems Hard? • Asynchrony – Different nodes run at different speeds – Messages can be unpredictably, arbitrarily delayed • Failures (partial and ambiguous) – Parts of the system can crash – Can’t tell crash from slowness • Concurrency and consistency – Replicated state, cached on multiple nodes – How to keep many copies of data consistent?

Why Are Distributed Systems Hard? • Performance – Have to efficiently coordinate many machines – Performance is variable and unpredictable – Tail latency: only as fast as slowest machine • Testing and verification – Almost impossible to test all failure cases – Proofs (emerging field) are really hard • Security – Need to assume adversarial nodes

MapReduce Computational Model • For each key k with value v, compute a new set of key-value pairs: map (k,v) → list(k’,v’) • For each key k’ and list of values v’, compute a new (hopefully smaller) list of values: reduce (k’,list(v’)) → list(v’’) • User writes map and reduce functions. Framework takes care of parallelism, distribution, and fault tolerance.
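Word count is a common way to illustrate the two signatures above. The sketch below is an added illustration, not from the slides: `run_mapreduce` is a hypothetical single-process stand-in for the framework (no parallelism, distribution, or fault tolerance), and `map_fn` and `reduce_fn` play the role of the user-written functions.

```python
from collections import defaultdict

def map_fn(doc_name, text):
    # map (k, v) -> list(k', v'): emit (word, 1) for every word in the document.
    return [(word, 1) for word in text.split()]

def reduce_fn(word, counts):
    # reduce (k', list(v')) -> list(v''): collapse the counts into one total.
    return [sum(counts)]

def run_mapreduce(inputs, map_fn, reduce_fn):
    # Hypothetical sequential driver: run every map, group values by
    # intermediate key, then run reduce once per key.
    groups = defaultdict(list)
    for k, v in inputs:
        for k2, v2 in map_fn(k, v):
            groups[k2].append(v2)
    return {k2: reduce_fn(k2, vs) for k2, vs in groups.items()}

result = run_mapreduce(
    [("a.txt", "to be or not to be"), ("b.txt", "to do")],
    map_fn,
    reduce_fn,
)
print(result["to"])
```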

MapReduce (or ML or …) Architecture • Scheduler accepts MapReduce jobs – finds a MapReduce master and set of available workers • For each job, MapReduce master <array> – farms tasks to workers; restarts failed jobs; syncs task completion • Worker <array> – executes Map and Reduce tasks • Storage <array> – stores initial data set, intermediate files, end results

Remote Procedure Call (RPC) A request from the client to execute a function on the server. – To the client, looks like a procedure call – To the server, looks like an implementation of a procedure call

Remote Procedure Call (RPC) A request from the client to execute a function on the server. • On client – Ex: result = DoMap(worker, i) – Parameters marshalled into a message (can be arbitrary types) – Message sent to server (can be multiple pkts) – Wait for reply • On server – Message is parsed – Operation DoMap(i) invoked – Result marshalled into a message (can be multiple pkts) – Message sent to client
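The client and server steps above can be sketched in one process. This is an added illustration under stated assumptions: JSON is used as the wire format only for convenience, the transport is a plain function call instead of a network connection, and the names `marshal_request`, `server_handle`, `call`, and `HANDLERS` are hypothetical.

```python
import json

def do_map(i):
    # Stand-in for the server-side procedure DoMap(i).
    return f"map task {i} done"

# Server-side table from operation name to implementation (hypothetical).
HANDLERS = {"DoMap": do_map}

def marshal_request(op, *args):
    # Client: parameters marshalled into a message (bytes).
    return json.dumps({"op": op, "args": list(args)}).encode("utf-8")

def server_handle(message):
    # Server: parse the message, invoke the operation, marshal the result.
    request = json.loads(message.decode("utf-8"))
    result = HANDLERS[request["op"]](*request["args"])
    return json.dumps({"result": result}).encode("utf-8")

def call(transport, op, *args):
    # Client stub: send the request, wait for the reply, unmarshal the result.
    reply = transport(marshal_request(op, *args))
    return json.loads(reply.decode("utf-8"))["result"]

# Here the "network" is just a direct call to the server function.
result = call(server_handle, "DoMap", 7)
print(result)
```

To the caller, `call(...)` looks like an ordinary function call; everything about messages is hidden in the stub, which is the point of the abstraction.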

RPC implementation [Figure: the client calls DoMap(worker, i); the client RPC library serializes args, opens a connection, writes data, reads data, and deserializes the reply; the server RPC library reads data, deserializes args, serializes the reply, and writes data. Both sides sit on OS TCP/IP read and write calls above the transport (CSE 461).]

RPC vs. Procedure Call • What is equivalent of: – The name of the procedure? – The calling convention? – The return value? – The return address?

RPC vs. Procedure Call • Binding – Client needs a connection to server – Server must implement the required function – What if the server is running a different version of the code? • Performance – Procedure call: maybe 10 cycles = ~3 ns – RPC in data center: 10 microseconds => thousands of times slower (10 µs / 3 ns ≈ 3,000x) – RPC in the wide area: millions of times slower

RPC vs. Procedure Call Failures – What happens if messages get dropped? – What if client crashes? – What if server crashes? – What if server crashes after performing op but before replying? – What if server appears to crash but is slow? – What if network partitions?

Semantics • Semantics = meaning • reply == ok => ??? • reply != ok => ???

Semantics • At least once (NFS, DNS) – reply == ok: executed at least once – reply != ok: maybe executed, maybe multiple times • At most once – reply == ok: executed once – reply != ok: maybe executed, but never more than once • Exactly once – reply == ok: executed once – reply != ok: never happens

At Least Once • RPC library waits for response for a while • If none arrives, re-send the request • Do this a few times • Still no response: return an error to the application
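The retry loop described above might look like the following. This is a minimal sketch, not the course's RPC library: `send` is assumed to return `None` when the wait times out, and `make_lossy_send`, `RPCError`, and the `handler` function are hypothetical helpers that simulate a network dropping the first few replies.

```python
class RPCError(Exception):
    """Raised when no reply arrives after all attempts."""

def call_at_least_once(send, request, max_attempts=3):
    # Re-send the same request until a reply arrives or attempts run out.
    for _attempt in range(max_attempts):
        reply = send(request)  # assumed to return None on timeout
        if reply is not None:
            return reply
    raise RPCError(f"no reply after {max_attempts} attempts")

def make_lossy_send(handler, drop_first):
    # Simulated network: the server always executes the request,
    # but the first `drop_first` replies are lost on the way back.
    calls = 0
    def send(request):
        nonlocal calls
        calls += 1
        reply = handler(request)
        return None if calls <= drop_first else reply
    return send

executed = []

def handler(request):
    # Server-side handler that records every execution.
    executed.append(request)
    return "ok"

reply = call_at_least_once(make_lossy_send(handler, drop_first=2), "Put k v")
print(reply, len(executed))
```

The caller sees a single successful reply, yet the server ran the handler three times, which is exactly the "at least once" guarantee.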

Non-replicated key/value server • Client sends Put k v • Server gets request, but network drops reply • Client sends Put k v again – should server respond "yes"? – or "no"? • What if op is “append”?
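The difference between Put and append under re-execution is easy to see with a toy store. The sketch below is an added illustration; the `KVStore` class and its method names are hypothetical, and calling each method twice stands in for the server processing a re-sent request.

```python
class KVStore:
    # Toy in-memory key/value server with no duplicate detection.
    def __init__(self):
        self.data = {}

    def put(self, key, value):
        self.data[key] = value
        return "ok"

    def append(self, key, value):
        self.data[key] = self.data.get(key, "") + value
        return "ok"

store = KVStore()

# The reply to the first Put is lost, so the client re-sends it.
store.put("k", "v")
store.put("k", "v")

# Same story for an append.
store.append("log", "x")
store.append("log", "x")

print(store.data)
```

Running Put twice leaves the same state as running it once, so the duplicate is harmless. Running append twice leaves "xx" where the client intended "x".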

Does TCP Fix This? • TCP: reliable bi-directional byte stream between two endpoints – Retransmission of lost packets – Duplicate detection • But what if TCP times out and client reconnects? – Browser connects to Amazon – RPC to purchase book – Wifi times out during RPC – Browser reconnects

When does at-least-once work? • If no side effects – read-only operations (or idempotent ops) • Example: MapReduce • Example: NFS – readFileBlock – writeFileBlock

At Most Once • Client includes unique ID (UID) with each request – use same UID for re-send • Server RPC code detects duplicate requests – return previous reply instead of re-running handler
    if seen[uid] {
        r = old[uid]
    } else {
        r = handler()
        old[uid] = r
        seen[uid] = true
    }
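The duplicate-detection logic above translates into a few lines of runnable code. This is a sketch under assumptions, not the labs' implementation: the class name `AtMostOnceServer` is hypothetical, replies are kept in an unbounded in-memory dictionary, and a single dictionary plays the role of both `seen` and `old`.

```python
class AtMostOnceServer:
    def __init__(self, handler):
        self.handler = handler
        self.old = {}  # uid -> reply already sent for that request

    def handle(self, uid, request):
        if uid in self.old:
            # Duplicate: return the previous reply, do not re-run the handler.
            return self.old[uid]
        reply = self.handler(request)
        self.old[uid] = reply
        return reply

log = []

def append_handler(value):
    # Non-idempotent operation: every execution grows the log.
    log.append(value)
    return len(log)

server = AtMostOnceServer(append_handler)
first = server.handle("client1-1", "x")
retry = server.handle("client1-1", "x")  # re-send with the same UID
other = server.handle("client1-2", "y")  # new request, new UID
print(first, retry, other, log)
```

The retry gets the same reply as the first attempt and the log grows only once per distinct UID.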

Some At-Most-Once Issues How do we ensure UID is unique? – Big random number? – Combine unique client ID (IP address?) with seq #? – What if client crashes and restarts? Can it reuse the same UID? – In labs, nodes never restart – Equivalent to: every node gets new ID on start

When Can Server Discard Old RPCs? • Option 1: Never? • Option 2: unique client IDs – per-client RPC sequence numbers – client includes "seen all replies <= X" with every RPC • Option 3: only allow client one outstanding RPC at a time – arrival of seq+1 allows server to discard all <= seq • Labs use Option 3
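Option 3 means the server needs to remember only the most recent reply per client. The sketch below is an added illustration, not the labs' code: the class name `OneOutstandingServer` is hypothetical, and returning `None` for a request older than the latest one is an assumption (the client has already moved on, so that reply is no longer needed).

```python
class OneOutstandingServer:
    def __init__(self, handler):
        self.handler = handler
        self.last = {}  # client_id -> (seq, reply) for the latest request only

    def handle(self, client_id, seq, request):
        entry = self.last.get(client_id)
        if entry is not None:
            last_seq, last_reply = entry
            if seq == last_seq:
                return last_reply  # duplicate of the latest request
            if seq < last_seq:
                return None        # stale duplicate (assumed: ignore it)
        # A newer seq implies the client saw the previous reply,
        # so overwriting the stored entry discards everything <= seq - 1.
        reply = self.handler(request)
        self.last[client_id] = (seq, reply)
        return reply

executions = []

def handler(request):
    executions.append(request)
    return f"done:{request}"

server = OneOutstandingServer(handler)
r1 = server.handle("c1", 1, "a")
r1_dup = server.handle("c1", 1, "a")  # re-send of seq 1
r2 = server.handle("c1", 2, "b")      # seq 2 replaces the stored seq 1 entry
stale = server.handle("c1", 1, "a")   # delayed duplicate of seq 1
print(r1, r1_dup, r2, stale, len(server.last))
```

Server memory stays at one entry per client no matter how many RPCs each client issues.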

What if Server Crashes? If at-most-once list of recent RPC results is stored in memory, server will forget and accept duplicate requests when it reboots – Does server need to write the recent RPC results to disk? – If replicated, does replica also need to store recent RPC results? In Labs, server gets new address on restart – Client messages aren’t delivered to restarted server
