Scaling up Partial Evaluation for Optimizing the Sun Commercial RPC Protocol



SLIDE 1

Scaling up Partial Evaluation for Optimizing The Sun Commercial RPC Protocol

Gilles Muller, Nic Volanschi, Renaud Marlet
COMPOSE group, IRISA / INRIA - University of Rennes

A realistic experiment
– It drove the development of Tempo, a specializer for C
– It highlights key features of partial evaluators

SLIDE 2

Overview of the talk

Sun RPC
Opportunities for specialization
Application of Tempo
Discussion on important features

SLIDE 3

Motivation: What is RPC?

RPC makes a remote procedure look like a local one

[Figure: path of an RPC call. Labels: RPC Call, Marshal, Send arguments, Receive arguments, Unmarshal, RPC execution, Marshal, Send results, Receive results, Unmarshal; Client Stub, Server Stub; Application space, System space]

SLIDE 4

Motivation: Why optimize the Sun RPC?

Well-recognized standard

– distributed system services: NFS, NIS
– distributed computing environments: PVM, Stardust

Performance is critical; options for optimizing:

– manual optimization: error prone, does not scale up
– re-implementation: not compatible with the standard

⇒ partial evaluation: reuse of existing code

SLIDE 5

Sun RPC Architecture

A set of micro-layers
Highly generic procedures
IDL and rpcgen:

    typedef struct { int int1, int2; } pair;

    program RMIN_PROG {
        version RMIN_VERS {
            int RMIN(pair) = 0;
        } = 1;
    } = 0x20000007;

⇒ invariant for specialization

SLIDE 6

Sun RPC Example: minimum of two integers (client encoding)

    arg.int1 = ...
    arg.int2 = ...
    rmin_1(&arg)
      clnt_call()                    // Transport protocol switch
        clntudp_call()               // UDP generic procedure call
          xdr_pair()                 // Encode 2 integers (rpcgen)
            xdr_int()                // Integer size switch
              xdr_long()             // Encoding/decoding
                XDR_PUTLONG()        // Output protocol switch
                  xdrmem_putlong()   // Write buffer/check overflow
                    htonl()          // Big/little endian

SLIDE 7

Opportunities for Specialization: Propagation of Exit Status

    bool_t xdr_pair(XDR *xdrs, pair *objp)
    {
        if (!xdr_int(xdrs, &objp->int1)) {
            return (FALSE);
        }
        if (!xdr_int(xdrs, &objp->int2)) {
            return (FALSE);
        }
        return (TRUE);
    }

Binding-time legend: STAT (static), DYN (dynamic)
Needed: interprocedural analysis, partially static structures, return sensitivity
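The effect of return sensitivity on this function can be sketched with a toy model. The names `toy_xdr`, `toy_xdr_int`, `toy_xdr_pair`, and `toy_xdr_pair_spec` are hypothetical stand-ins, not the Sun RPC or Tempo API, and the residual is written by hand to illustrate what a specializer could produce once the inner call is known to always return true.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the Sun RPC types (not the real headers). */
typedef struct { int32_t *x_pos; } toy_xdr;      /* write cursor only */
typedef struct { int32_t int1, int2; } pair;

/* Assumed state after the inner layers are specialized: the side effect
 * (writing the value) is dynamic, but the returned status is always true,
 * so the status is static. */
bool toy_xdr_int(toy_xdr *xdrs, const int32_t *v)
{
    *xdrs->x_pos++ = *v;
    return true;
}

/* Generic version: tests and propagates every exit status. */
bool toy_xdr_pair(toy_xdr *xdrs, const pair *objp)
{
    if (!toy_xdr_int(xdrs, &objp->int1))
        return false;
    if (!toy_xdr_int(xdrs, &objp->int2))
        return false;
    return true;
}

/* Hand-written residual: the calls stay because their effects are dynamic,
 * but the tests on the static return values are gone. */
void toy_xdr_pair_spec(toy_xdr *xdrs, const pair *objp)
{
    (void)toy_xdr_int(xdrs, &objp->int1);
    (void)toy_xdr_int(xdrs, &objp->int2);
}
```

Both versions write the same two values; only the status tests differ.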

SLIDE 8

Opportunities for Specialization: Elimination of Encoding/Decoding Dispatch

    bool_t xdr_long(XDR *xdrs, long *lp)
    {
        if (xdrs->x_op == XDR_ENCODE)
            return XDR_PUTLONG(xdrs, lp);
        if (xdrs->x_op == XDR_DECODE)
            return XDR_GETLONG(xdrs, lp);
        if (xdrs->x_op == XDR_FREE)
            return TRUE;
        return FALSE;
    }

Binding-time legend: STAT (static), DYN (dynamic)
Needed: interprocedural analysis, partially static structures, return sensitivity
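As a rough illustration of why a partially static structure matters here, the sketch below models a stream whose operation field is fixed at creation time while its cursor changes at run time. All `toy_` names are hypothetical, and the residual function is hand-written to show the shape of the result, not actual Tempo output.

```c
#include <stdbool.h>
#include <stdint.h>

enum toy_op { TOY_ENCODE, TOY_DECODE, TOY_FREE };

/* Partially static structure: one static field, one dynamic field. */
typedef struct {
    enum toy_op x_op;   /* static: fixed when the stream is created */
    int32_t    *x_pos;  /* dynamic: cursor into the buffer */
} toy_xdr;

bool toy_putlong(toy_xdr *xdrs, const int32_t *lp)
{
    *xdrs->x_pos++ = *lp;
    return true;
}

bool toy_getlong(toy_xdr *xdrs, int32_t *lp)
{
    *lp = *xdrs->x_pos++;
    return true;
}

/* Generic version: same three-way dispatch as xdr_long on the slide. */
bool toy_xdr_long(toy_xdr *xdrs, int32_t *lp)
{
    if (xdrs->x_op == TOY_ENCODE)
        return toy_putlong(xdrs, lp);
    if (xdrs->x_op == TOY_DECODE)
        return toy_getlong(xdrs, lp);
    if (xdrs->x_op == TOY_FREE)
        return true;
    return false;
}

/* Hand-written residual for x_op == TOY_ENCODE: the tests on the static
 * field are decided at specialization time and disappear. */
bool toy_xdr_long_encode(toy_xdr *xdrs, const int32_t *lp)
{
    return toy_putlong(xdrs, lp);
}
```

On a client that only encodes arguments, every call through the generic function takes the first branch, which is why the dispatch can be removed for that context.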

SLIDE 9

Opportunities for Specialization: Elimination of Buffer Overflow Checking

    bool_t xdrmem_putlong(XDR *xdrs, long *lp)
    {
        if ((xdrs->x_handy -= sizeof(long)) < 0)
            return FALSE;
        *(xdrs->x_private) = htonl(*lp);
        xdrs->x_private += sizeof(long);
        return TRUE;
    }

Binding-time legend: STAT (static), DYN (dynamic)
Needed: interprocedural analysis, partially static structures, return sensitivity, use sensitivity (different uses may have different binding times)
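A minimal sketch of the overflow-check elimination, assuming the buffer size and the number of values to encode are both known at specialization time. The names `toy_xdr`, `toy_putlong`, `all_checks_pass`, and `toy_putlong_spec` are hypothetical, 4-byte values are used, and byte-order conversion is left out to keep the focus on the check.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Toy memory stream; field names follow the slide. */
typedef struct {
    int   x_handy;    /* bytes left: static when the buffer size is known */
    char *x_private;  /* write cursor: dynamic */
} toy_xdr;

/* Generic version: checks for overflow on every write. */
bool toy_putlong(toy_xdr *xdrs, const int32_t *lp)
{
    if ((xdrs->x_handy -= (int)sizeof(int32_t)) < 0)
        return false;
    memcpy(xdrs->x_private, lp, sizeof(int32_t));
    xdrs->x_private += sizeof(int32_t);
    return true;
}

/* What can be computed ahead of time: replay the x_handy arithmetic for
 * n writes and report whether every overflow check would pass. */
bool all_checks_pass(int handy, int n)
{
    for (int i = 0; i < n; i++)
        if ((handy -= (int)sizeof(int32_t)) < 0)
            return false;
    return true;
}

/* Hand-written residual for a write whose check is known to pass:
 * only the dynamic uses of the stream remain. */
void toy_putlong_spec(toy_xdr *xdrs, const int32_t *lp)
{
    memcpy(xdrs->x_private, lp, sizeof(int32_t));
    xdrs->x_private += sizeof(int32_t);
}
```

The same `xdrs` pointer is used both for the static field and for the dynamic cursor, which is the situation the slide's use sensitivity requirement refers to.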

SLIDE 10

Application of Tempo: Specialized Arguments Encoding (sugared)

    void xdr_pair(xdrs, objp)
    {
        *(xdrs->x_private) = objp->int1;
        xdrs->x_private += 4u;
        *(xdrs->x_private) = objp->int2;
        xdrs->x_private += 4u;
    }
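The sugared listing hides the byte-order conversion. A standalone approximation of the same straight-line encoder, with the conversion made explicit, might look as follows. The function name `encode_pair_spec` and its cursor-returning signature are assumptions made for this sketch; `htonl` is the POSIX call from `<arpa/inet.h>`, and XDR represents an integer as 4 bytes in big-endian order.

```c
#include <arpa/inet.h>  /* htonl (POSIX) */
#include <stdint.h>
#include <string.h>

typedef struct { int32_t int1, int2; } pair;

/* Hypothetical straight-line encoder: no dispatch, no status tests,
 * no overflow checks. Returns the advanced write cursor. */
unsigned char *encode_pair_spec(unsigned char *cursor, const pair *objp)
{
    uint32_t w;

    w = htonl((uint32_t)objp->int1);   /* host to big-endian */
    memcpy(cursor, &w, 4);
    cursor += 4;

    w = htonl((uint32_t)objp->int2);
    memcpy(cursor, &w, 4);
    cursor += 4;

    return cursor;
}
```

On a big-endian host `htonl` does nothing, so the code reduces to two stores and two pointer increments, as in the listing above.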

SLIDE 11

Application of Tempo: Manual Interventions

Control flow for exceptions
Exposing specialization opportunities:

    inlen = dyn;
    code(inlen);

⇒

    if (inlen == expected) {
        inlen = expected;
        code(inlen);
    } else
        code(inlen);

Needed: flow sensitivity
Binding-time legend: STAT (static), DYN (dynamic)
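The rewrite above can be illustrated with a small example in which `code` sums a buffer. The names `sum_generic`, `sum_len4`, `sum_dispatch`, and `EXPECTED_LEN` are hypothetical; the unrolled function stands for what a specializer could generate once the length is static in the guarded branch.

```c
enum { EXPECTED_LEN = 4 };  /* assumed value of the expected length */

/* Generic code: the loop bound is only known at run time. */
int sum_generic(const int *buf, int inlen)
{
    int s = 0;
    for (int i = 0; i < inlen; i++)
        s += buf[i];
    return s;
}

/* Hand-written residual for inlen == 4: the loop is fully unrolled. */
int sum_len4(const int *buf)
{
    return buf[0] + buf[1] + buf[2] + buf[3];
}

/* Rewritten caller: the equality test makes inlen static in the first
 * branch, so the specialized code can be used there; any other length
 * falls back to the generic code. */
int sum_dispatch(const int *buf, int inlen)
{
    if (inlen == EXPECTED_LEN)
        return sum_len4(buf);
    else
        return sum_generic(buf, inlen);
}
```

The fallback branch keeps the program correct for unexpected lengths, so the manual change only adds a fast path.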

SLIDE 12

Application of Tempo: Speedups

[Figure: Speedup Ratio for Client Marshalling. x-axis: Array Size (4-Byte Integers), 20 to 2000; y-axis: 1 to 4; curves: PC/Linux, IPX/SunOS]

[Figure: Speedup Ratio for RPC Round Trip Time. x-axis: Array Size (4-Byte Integers), 20 to 2000; y-axis: 1.0 to 1.6; curves: PC/Linux - Ethernet 100 Mbit/s, IPX/SunOS - ATM 100 Mbit/s]

SLIDE 13

Discussion: Key Features in Binding-Time Analyses

Essential for specializing the Sun RPC:

– Interprocedural analyses
– Partially static structures
– Use sensitivity
– Return sensitivity
– Flow sensitivity

Useful but not essential:

– Context sensitivity

SLIDE 14

Discussion: User Interface

Crucial to the “tuning” phase:

– Colors for binding times and program transformations
– Aliases
– Polyvariance
– Use of global variables

Useful features:

– Analysis and specialization context

Drawback:

– Code transformation (SUIF, Tempo)

SLIDE 15

Discussion: Module-Oriented Specialization

Analysis context prior to the code to specialize
Analysis context after the code to specialize
Evaluation of external calls
Abstract description of external functions

– binding times
– aliases

SLIDE 16

Conclusion

Large-scale experiment on existing, mature, commercial code
Significant speedups
Key features in BTA
Key features in user interface
PE for eliminating modularity and genericity overhead
