Inlining Java Native Calls at Runtime (CASCON 2005, 4th Workshop on Compiler Driven Performance)


slide-1
SLIDE 1

Inlining Java Native Calls at Runtime

(CASCON 2005 – 4th Workshop on Compiler Driven Performance)

Levon Stepanian, Angela Demke Brown
Computer Systems Group, Department of Computer Science, University of Toronto

Allan Kielstra, Gita Koblents, Kevin Stoodley
IBM Toronto Software Lab

slide-2
SLIDE 2

In a nutshell

  • Runtime native function inlining into Java
  • Optimizing transformations on inlined JNI calls
  • Opaque and binary-compatible while boosting performance

[Diagram: Java Code → Native Function Call → Native Code]

slide-3
SLIDE 3

In a nutshell

  • Runtime native function inlining into Java
  • Optimizing transformations on inlined JNI calls
  • Opaque and binary-compatible while boosting performance

[Diagram: Java Code with Native Code inlined]

slide-4
SLIDE 4

In a nutshell

  • Runtime native function inlining into Java
  • Optimizing transformations on inlined JNI calls
  • Opaque and binary-compatible while boosting performance

[Diagram: Java Code with Native Code inlined + optimized]
slide-5
SLIDE 5

Motivation

  • The JNI
  • Java’s interoperability API
  • Callouts and callbacks
  • Opaque
  • Binary-compatible

[Diagram: a Java app on the JVM + JIT and a native app/library in the native (host) environment, connected through the JNI by callouts and callbacks]

slide-6
SLIDE 6

Motivation

  • The JNI
  • Pervasive
  • Legacy codes
  • Performance-critical, architecture-dependent
  • Features unavailable in Java (files, sockets, etc.)

slide-7
SLIDE 7

Motivation

  • Callouts run 2 to 3x slower than Java calls
  • Callback overheads are an order of magnitude larger
  • JVM handshaking requirements for threads leaving and re-entering JVM context
  • i.e. stack switching, reference collection, exception handling
  • JIT compiler can’t predict side-effects of native function call

slide-8
SLIDE 8

Our Solution

  • JIT compiler based optimization that inlines native code into Java
  • JIT compiler transforms inlined JNI function calls to constants, cheaper operations
  • Inlined code exposed to JIT compiler optimizations
slide-9
SLIDE 9

Infrastructure

  • IBM TR JIT Compiler + IBM J9 VM
  • Native IL to JIT IL conversion mechanism
  • Exploit Native IL stored in native libraries
  • W-Code to TR-IL at runtime

[Diagram labels: TR JIT; machine code + static compiler IL]

slide-10
SLIDE 10

Outline

  • Background Information ➼
  • Method
  • Results
  • Future Work
slide-11
SLIDE 11

Sample Java Class

    class SetFieldXToFive {
        public int x;
        public native void foo();
        static { System.loadLibrary(…); }
    }

slide-13
SLIDE 13

Sample Native Code

    JNIEXPORT void JNICALL Java_SetFieldXToFive_foo(JNIEnv *env, jobject obj) {
        jclass cls = (*env)->GetObjectClass(env, obj);
        jfieldID fid = (*env)->GetFieldID(env, cls, "x", "I");
        if (fid == NULL) return;
        (*env)->SetIntField(env, obj, fid, 5);
    }

GOAL: obj.x = 5
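
For readers more at home in Java than in JNI, the three JNI calls in the native function have a close analogue in Java reflection. The following is only an illustrative sketch and not part of the talk: the class name `ReflectionAnalogy` and the nested `Target` class (a stand-in for `SetFieldXToFive` without the native method or library load) are assumptions made so the example runs without a native library.

```java
import java.lang.reflect.Field;

public class ReflectionAnalogy {
    // Hypothetical stand-in for SetFieldXToFive, minus the native method.
    public static class Target {
        public int x;
    }

    // Mirrors the native foo(): look up the class, look up field "x", store 5.
    public static void setViaLookup(Object obj) {
        Class<?> cls = obj.getClass();       // analogue of GetObjectClass
        Field fid;
        try {
            fid = cls.getField("x");         // analogue of GetFieldID(..., "x", "I")
        } catch (NoSuchFieldException e) {
            return;                          // analogue of: if (fid == NULL) return;
        }
        try {
            fid.setInt(obj, 5);              // analogue of SetIntField(..., 5)
        } catch (IllegalAccessException e) {
            throw new IllegalStateException(e);
        }
    }

    // What the GOAL line asks for: a plain field store.
    public static void setDirect(Target obj) {
        obj.x = 5;
    }

    public static void main(String[] args) {
        Target a = new Target();
        Target b = new Target();
        setViaLookup(a);
        setDirect(b);
        System.out.println(a.x + " " + b.x); // prints "5 5"
    }
}
```

Both paths leave the field at 5; the lookup-based path simply does more work per call, which is the kind of overhead the talk aims to remove.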

slide-18
SLIDE 18

Native Inlining Overview

  • 1. Inliner detects a native callsite
  • 2. Extracts and converts Native IL to JIT IL
  • 3. Identifies inlined JNI calls
  • 4. Transforms inlined JNI calls
  • 5. Finishes inlining
slide-19
SLIDE 19

Method – Step 1

  • 1. Inliner detects a native callsite

[Diagram: the inliner in the TR JIT finds a call to obj.foo() in the Java code; foo(){…} is native code]

slide-20
SLIDE 20

Method – Step 2

  • 1. Inliner detects a native callsite
  • 2. Extracts and converts Native IL to JIT IL

[Diagram: Native IL → JIT IL]

slide-21
SLIDE 21

Method – Step 3

  • 1. Inliner detects a native callsite
  • 2. Extracts and converts Native IL to JIT IL
  • 3. Identifies inlined JNI calls

JIT IL:

    /* call to GetObjectClass */ …
    /* call to GetFieldID */ …
    /* call to SetIntField */ …

Pre-constructed IL shapes
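
Step 3 can be pictured as tree-pattern matching: a JNI call such as `(*env)->GetFieldID(env, …)` is an indirect call whose target is loaded from the function table that `env` points to. The sketch below is a toy model only. The `Node` class, the opcode names (`icall`, `loadSlot`, `deref`, `param`), and the class name `JniShapeMatcher` are all hypothetical and do not describe the TR JIT's actual IL.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class JniShapeMatcher {
    // Hypothetical toy IL node: an opcode, an optional symbol, and children.
    public static final class Node {
        final String op;
        final String symbol;
        final List<Node> kids;

        public Node(String op, String symbol, Node... kids) {
            this.op = op;
            this.symbol = symbol;
            this.kids = Arrays.asList(kids);
        }
    }

    // Shape for "(*env)->F(env, ...)": an indirect call whose target is
    // loaded from a named slot of the table reached through the env parameter.
    // Returns the slot name on a match, or null otherwise.
    public static String matchJniCall(Node n) {
        if (!n.op.equals("icall") || n.kids.isEmpty()) return null;
        Node slotLoad = n.kids.get(0);
        if (!slotLoad.op.equals("loadSlot") || slotLoad.kids.size() != 1) return null;
        Node tableLoad = slotLoad.kids.get(0);
        if (!tableLoad.op.equals("deref") || tableLoad.kids.size() != 1) return null;
        Node env = tableLoad.kids.get(0);
        if (!env.op.equals("param") || !"env".equals(env.symbol)) return null;
        return slotLoad.symbol;
    }

    // Builds the toy IL tree for a call through the env function table.
    public static Node jniCall(String function) {
        Node env = new Node("param", "env");
        Node table = new Node("deref", null, env);
        return new Node("icall", null, new Node("loadSlot", function, table));
    }

    // Collects the names of all JNI calls recognized in a method body.
    public static List<String> identify(List<Node> body) {
        List<String> found = new ArrayList<>();
        for (Node n : body) {
            String name = matchJniCall(n);
            if (name != null) found.add(name);
        }
        return found;
    }

    public static void main(String[] args) {
        List<Node> body = Arrays.asList(
            jniCall("GetObjectClass"),
            jniCall("GetFieldID"),
            new Node("compare", "fid == NULL"),
            jniCall("SetIntField"));
        // prints [GetObjectClass, GetFieldID, SetIntField]
        System.out.println(identify(body));
    }
}
```

Matching on shape rather than on a symbol name fits the setting, since the inlined native code reaches JNI functions only through pointers loaded at run time.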

slide-22
SLIDE 22

Method – Step 4

  • 1. Inliner detects a native callsite
  • 2. Extracts and converts Native IL to JIT IL
  • 3. Identifies inlined JNI calls
  • 4. Transforms inlined JNI calls

    jclass cls = (*env)->GetObjectClass(env, obj);
    jfieldID fid = (*env)->GetFieldID(env, cls, "x", "I");
    if (fid == NULL) return;
    (*env)->SetIntField(env, obj, fid, 5);

slide-23
SLIDE 23

Method – Step 4

  • 1. Inliner detects a native callsite
  • 2. Extracts and converts Native IL to JIT IL
  • 3. Identifies inlined JNI calls
  • 4. Transforms inlined JNI calls

    Constant: SetFieldXToFive class data structure
    jfieldID fid = (*env)->GetFieldID(env, cls, "x", "I");
    if (fid == NULL) return;
    (*env)->SetIntField(env, obj, fid, 5);

slide-24
SLIDE 24

Method – Step 4

  • 1. Inliner detects a native callsite
  • 2. Extracts and converts Native IL to JIT IL
  • 3. Identifies inlined JNI calls
  • 4. Transforms inlined JNI calls

    Constant: SetFieldXToFive class data structure
    Constant: Offset of field “x”
    (*env)->SetIntField(env, obj, fid, 5);

slide-25
SLIDE 25

Method – Step 4

  • 1. Inliner detects a native callsite
  • 2. Extracts and converts Native IL to JIT IL
  • 3. Identifies inlined JNI calls
  • 4. Transforms inlined JNI calls

    Constant: SetFieldXToFive class data structure
    Constant: Offset of field “x”
    JIT IL: obj.x = 5
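
The progression across the Step 4 slides (class lookup becomes a constant, field lookup becomes a constant offset, the setter becomes a plain store) can be mimicked with a small rewriting pass. This is a hedged toy sketch, not the TR JIT's algorithm: `JniCallTransformer`, `ClassInfo`, the string-based "IL", and the offset value 12 are all hypothetical and chosen only for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class JniCallTransformer {
    // Hypothetical view of what the JIT knows about the receiver's class.
    public static final class ClassInfo {
        final String name;
        final Map<String, Integer> fieldOffsets = new HashMap<>();

        public ClassInfo(String name) {
            this.name = name;
        }

        public ClassInfo field(String fieldName, int offset) {
            fieldOffsets.put(fieldName, offset);
            return this;
        }
    }

    // Each call is {functionName, argument...}; the result is a toy IL listing.
    public static List<String> transform(List<String[]> calls, ClassInfo receiver) {
        List<String> out = new ArrayList<>();
        String resolvedField = null; // set once GetFieldID folds to a constant
        for (String[] call : calls) {
            switch (call[0]) {
                case "GetObjectClass":
                    // Receiver class is known at compile time: fold to a constant.
                    out.add("const class " + receiver.name);
                    break;
                case "GetFieldID": {
                    Integer offset = receiver.fieldOffsets.get(call[1]);
                    if (offset == null) {
                        out.add("call GetFieldID " + call[1]); // left as a call
                    } else {
                        out.add("const offset " + offset);
                        resolvedField = call[1];
                    }
                    break;
                }
                case "SetIntField":
                    if (resolvedField != null) {
                        out.add("store obj." + resolvedField + " = " + call[1]);
                    } else {
                        out.add("call SetIntField " + call[1]); // left as a call
                    }
                    break;
                default:
                    out.add("call " + call[0]);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        ClassInfo info = new ClassInfo("SetFieldXToFive").field("x", 12);
        List<String[]> calls = Arrays.asList(
            new String[] {"GetObjectClass"},
            new String[] {"GetFieldID", "x"},
            new String[] {"SetIntField", "5"});
        // prints [const class SetFieldXToFive, const offset 12, store obj.x = 5]
        System.out.println(transform(calls, info));
    }
}
```

When a field cannot be resolved, the sketch leaves the calls untouched, so the original JNI behavior is preserved for that case.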

slide-26
SLIDE 26

The Big Picture

Before Native Inlining & Callback Transformations

  • 1. Inliner detects a native callsite
  • 2. Extracts and converts Native IL to JIT IL
  • 3. Identifies inlined JNI calls
  • 4. Transforms inlined JNI calls
  • 5. Finishes inlining

[Diagram: Java code seen by the TR JIT inliner contains a call to obj.foo(); foo(){…} is native code]

slide-27
SLIDE 27

The Big Picture

After Native Inlining & Callback Transformations

  • 1. Inliner detects a native callsite
  • 2. Extracts and converts Native IL to JIT IL
  • 3. Identifies inlined JNI calls
  • 4. Transforms inlined JNI calls
  • 5. Finishes inlining

[Diagram: Java code seen by the TR JIT inliner now contains obj.x = 5; foo(){…} (native code) is still shown]

slide-28
SLIDE 28

The Big Picture

After Native Inlining & Callback Transformations

  • 1. Inliner detects a native callsite
  • 2. Extracts and converts Native IL to JIT IL
  • 3. Identifies inlined JNI calls
  • 4. Transforms inlined JNI calls
  • 5. Finishes inlining

[Diagram: Java code seen by the TR JIT inliner contains obj.x = 5]

slide-29
SLIDE 29

Outline

  • Background Information ➼
  • Method ➼
  • Results
  • Future Work
slide-30
SLIDE 30

Experimental Setup

  • Native function microbenchmarks
  • Average of 300 million runs
  • 1.4 GHz Power4 setup
  • Prototype implementation
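
A measurement of this kind ("average of N runs") is typically taken with a simple timing loop. The sketch below shows one way to structure such a harness in plain Java. It is an assumption about the general approach, not the authors' benchmark code, and the names `CallTimer`, `averageNanos`, and `nullMethod` are hypothetical.

```java
public class CallTimer {
    // Returns the average time in nanoseconds per invocation of body.
    public static double averageNanos(Runnable body, long runs) {
        // Warm-up so the JIT has a chance to compile the code before timing.
        long warmup = Math.min(runs, 10_000L);
        for (long i = 0; i < warmup; i++) {
            body.run();
        }
        long start = System.nanoTime();
        for (long i = 0; i < runs; i++) {
            body.run();
        }
        long elapsed = System.nanoTime() - start;
        return (double) elapsed / runs;
    }

    // Stand-in for a null method; a real experiment would call a native here.
    static void nullMethod() {
    }

    public static void main(String[] args) {
        double perCall = averageNanos(CallTimer::nullMethod, 1_000_000L);
        System.out.println("average ns per call: " + perCall);
    }
}
```

Because a JIT may inline or eliminate an empty Java method entirely, numbers from this loop are only meaningful as a baseline against which a native callout is compared.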
slide-31
SLIDE 31

Cost of IL Conversion

  • 5.3 microseconds per W-Code opcode

[Bar chart: time per opcode (microseconds, axis ticks 1 to 7) for the SPEC CINT2000 benchmarks bzip2, crafty, gap, gcc, gzip, mcf, parser, perlbmk, twolf, vortex, vpr]

slide-32
SLIDE 32

Inlining Null Callouts

  • Null native method microbenchmarks
  • Varying numbers of args (0, 1, 3, 5)
  • Complete removal of call/return overhead
  • Gains back the 2 to 3x callout slowdown
  • Confirmed our expectations
slide-33
SLIDE 33

Inlining Non-Null Callouts

[Bar chart: speedup (X) on the hash microbenchmark test, static and instance variants; bar labels 1.8 and 5.5]

  • Smaller speedups for natives performing work
  • Instance vs. static speedup
slide-34
SLIDE 34

Inlining & Transforming Callbacks

  • Reclaim order of magnitude overhead

[Bar chart: speedup (X) on the CallVoidMethod microbenchmark test, static and instance variants; bar labels 11.8 and 12.9]

slide-35
SLIDE 35

Data-Copy Speedups

  • Transformed GetIntArrayRegion

[Chart: speedup (X) versus array length]
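
The slide does not say what the transformed `GetIntArrayRegion` call becomes. As a rough mental model only, the JNI function's behavior (copy `len` elements starting at `start` into a caller-supplied buffer, raising `ArrayIndexOutOfBoundsException` for an invalid region) can be written directly in Java, which is the kind of code a JIT can then optimize like any other array copy. The class name `ArrayRegionCopy` is hypothetical.

```java
import java.util.Arrays;

public class ArrayRegionCopy {
    // Java rendering of GetIntArrayRegion(array, start, len, buf):
    // copy len elements beginning at index start into buf[0..len).
    public static void getIntArrayRegion(int[] array, int start, int len, int[] buf) {
        if (start < 0 || len < 0 || start > array.length - len) {
            throw new ArrayIndexOutOfBoundsException("region out of bounds");
        }
        System.arraycopy(array, start, buf, 0, len);
    }

    public static void main(String[] args) {
        int[] src = {10, 20, 30, 40, 50};
        int[] buf = new int[3];
        getIntArrayRegion(src, 1, 3, buf);
        System.out.println(Arrays.toString(buf)); // prints [20, 30, 40]
    }
}
```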

slide-36
SLIDE 36

Exposing Inlined Code To JIT Optimizations

[Bar chart: speedup (X) per microbenchmark test; the GetArrayLength test is labeled 93.4. JNI functions named on the slide: FindClass, GetMethodID, NewCharArray, GetArrayLength]

slide-37
SLIDE 37

Conclusion

  • Runtime native function inlining into Java code
  • Optimizing transformations on inlined Java Native Interface (JNI) calls
  • JIT optimizes inlined native code
  • Opaque and binary-compatible while boosting performance

  • Future Work
  • Engineering issues
  • Heuristics
  • Larger interoperability framework
slide-38
SLIDE 38

Fin