High-Level Language VM

SLIDE 1

EECS 768 Virtual Machines

High-Level Language VM – Outline

  • Introduction
  • Virtualizing conventional ISA vs. HLL VM ISA
  • Pascal P-code virtual machine
  • OO HLL virtual machines
      – properties, architecture, terms
  • Implementation of HLL virtual machine
      – class loading, security, GC, JNI

SLIDE 2

Introduction

  • HLL PVM similar to a conventional PVM
      – V-ISA not designed for a real hardware processor

[Figure: two toolchains compared.
 Traditional: HLL program → compiler front-end → intermediate code → compiler back-end → object code (ISA) → loader → memory image.
 HLL VM: HLL program → compiler → portable code (virtual ISA) → VM loader → virtual memory image → VM interpreter/translator → host instructions.]

SLIDE 3

Virtualizing Conventional ISA vs. High-Level-Language VM ISA

  • Drawbacks of virtualizing a conventional ISA
      – not developed for being virtualized!
      – operating system dependencies
      – issues with fixed-size address space, page size
      – memory address formation
      – maintaining precise exceptions
      – instruction set features
      – instruction discovery during indirect jumps
      – self-modifying and self-referencing code

SLIDE 4

C-ISA Not for Being Virtualized

  • Conventional ISA
      – after-the-fact solution for portability
      – no built-in ISA support for virtualization
  • High-level language V-ISA
      – VM-based portability is a primary design goal
      – generous use of metadata
      – metadata allows better type-safe code verification, interoperability, and performance

SLIDE 5

Operating System Dependencies

  • Conventional ISA
      – OS dependencies are the most difficult aspect to emulate
      – exact emulation may be impossible (different OS)
  • High-level language V-ISA
      – find a least-common-denominator set of functions
      – programs interact with the library API
      – library interface is higher level than conventional OS interface

SLIDE 6

Memory Architecture

  • Conventional ISA
      – fixed-size address spaces
      – specific addresses visible to user programs
  • High-level language V-ISA
      – abstract memory model of indefinite size
      – memory regions allocated based on need
      – actual memory addresses are never visible
      – out-of-memory error reported if a process requests more than is available on the platform

SLIDE 7

Memory Address Formation

  • Conventional ISA
      – unrestricted address computation
      – difficult to protect runtime from unauthorized guest program accesses
  • High-level language V-ISA
      – pointer arithmetic not permitted
      – memory access only through explicit memory pointers
      – static/dynamic type checking employed

SLIDE 8

Precise Exceptions

  • Conventional ISA
      – many instructions trap, precise state needed
      – global flags enable/disable exceptions
  • High-level language V-ISA
      – few instructions trap
      – test for exception encoded in the program
      – requirements for precise exceptions are relaxed

SLIDE 9

Instruction Set Features

  • Conventional ISA
      – a guest ISA with more registers than the host is a problem
      – ISAs with condition codes are difficult to emulate
  • High-level language V-ISA
      – stack-oriented
      – condition codes are avoided

SLIDE 10

Instruction Discovery

  • Conventional ISA
      – indirect jumps to potentially arbitrary locations
      – variable-length instructions, embedded data, padding
  • High-level language V-ISA
      – restricted indirect jumps
      – no mixing of code and data
      – variable-length instructions permitted

SLIDE 11

Self-Modifying/Referencing Code

  • Conventional ISA
      – self-modifying and self-referencing code pose problems for translated code
  • High-level language V-ISA
      – self-modifying and self-referencing code not permitted

SLIDE 12

Pascal P-code

  • Popularized the Pascal language
      – simplified porting of a Pascal compiler
  • Introduced several concepts used in HLL VMs
      – stack-based instruction set
      – memory architecture is implementation independent
      – undefined stack and heap sizes
      – standard libraries used to interface with the OS
  • Objective was compiler portability (and application portability)

SLIDE 13

Pascal P-Code (2)

  • Protection via trusted interpreter
  • Advantages
      – porting is simplified
          • don't have to develop compilers for all platforms
      – VM implementation is smaller/simpler than a compiler
      – VM provides concise definition of semantics
  • Disadvantages
      – achieving OS independence reduces API functionality to least common denominator
      – tendency to add platform-specific API extensions

SLIDE 14

Object-Oriented HLL Virtual Machines

  • Used in a networked computing environment
  • Important features of HLL VMs
      – security and protection
          • protect remote resources, local files, VM runtime
      – robustness
          • OOP model provides component-based programming, strong type-checking, and garbage collection
      – networking
          • incremental loading and small code size
      – performance
          • easy code discovery allows entire method compilation

SLIDE 15

Terminology

  • Java Virtual Machine Architecture ↔ CLI
      – analogous to an ISA
  • Java Virtual Machine Implementation ↔ CLR
      – analogous to a computer implementation
  • Java bytecodes ↔ Microsoft Intermediate Language (MSIL), CIL, IL
      – the instruction part of the ISA
  • Java Platform ↔ .NET framework
      – ISA + libraries; a higher-level ABI

SLIDE 16

Modern HLL VM

  • Compiler front-end produces binary files
      – standard format common to all architectures
  • Binary files contain both code and metadata

[Figure: a machine-independent program file (metadata + code) passes through the loader into the virtual machine implementation, which contains internal data structures, an interpreter, and a translator that produces native code.]

SLIDE 17

Security

  • A key aspect of modern network-oriented VMs
      – “protection sandbox”
  • Must protect:
      – remote resources (files)
      – local files
      – runtime
  • Java's first-generation security method
      – still the default

[Figure: sandbox boundary. Elements shown: user process (application on VMM) on the local system, accessible local file, other local file, network, and a remote system holding a public file and other files.]

SLIDE 18

Protection Sandbox

  • Remote resources
      – protected by remote system
  • Local resources
      – protected by security manager
  • VM software
      – protected via static/dynamic checking

[Figure: protection sandbox. Components shown: class files, loader, emulation engine, loaded methods, library methods, native methods, standard libraries, security agent, local files, network and file system; several components are labeled trusted.]

SLIDE 19

Java 1.1 Security: Signing

  • Identifies source of the input program
      – can implement different security policies for programs from different vendors

[Figure: signing flow. Sender: hash the binary class, encrypt the hash with the private key, transmit the binary class plus signed hash. Receiver: hash the received class, decrypt the signed hash with the public key, compare; match => signature OK.]
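
The hash, encrypt, decrypt, compare flow in the figure can be sketched as follows. This is a toy under loud assumptions: the RSA key uses tiny textbook primes and is not secure, the names `digest`, `sign`, and `verify` are hypothetical, and nothing here reflects the actual Java 1.1 signing implementation.

```python
import hashlib

# Toy RSA key built from tiny primes (p=61, q=53). Insecure; illustration only.
N, E, D = 3233, 17, 2753   # modulus, public exponent, private exponent

def digest(data: bytes) -> int:
    # Reduce the SHA-256 hash modulo N so it fits the toy key (an assumption
    # of this sketch; real schemes pad the full hash instead).
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % N

def sign(data: bytes) -> int:
    # Sender: "encrypt" the hash with the private key.
    return pow(digest(data), D, N)

def verify(data: bytes, signature: int) -> bool:
    # Receiver: "decrypt" with the public key and compare with a fresh hash.
    return pow(signature, E, N) == digest(data)

class_bytes = b"\xca\xfe\xba\xbe example class body"
signature = sign(class_bytes)
ok = verify(class_bytes, signature)
```

A receiver could then select a security policy based on which vendor's public key verified the class.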

SLIDE 20

Java 2 Security: Stack Walking

  • Inspect privileges of all methods on stack
      – append method permissions
      – method 4 attempts to write file B via io.method5
      – call fails since method 3 does not have privileges

  Method                  Principal   Permissions
  Method 1                System      Full
  Method 2                System      Full
  Method 3                Untrusted   Write A only
  Method 4                Untrusted   Write B only
  Method 5 (in io API)    System      Full
  Check Method            System      Full

[Figure: the check method inspects the stack; the operation is prohibited (X).]
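
A minimal sketch of the stack inspection described above, assuming the frame layout read from the figure (methods 1 and 2 fully privileged, method 3 allowed to write only A, method 4 allowed to write only B). The `Frame` class, the permission strings, and `check_permission` are hypothetical names, not the Java 2 API.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Frame:
    method: str
    principal: str
    permissions: frozenset   # e.g. {"write:A"}; "*" stands for full permission

def check_permission(call_stack, needed):
    """Walk every frame; allow the operation only if all frames hold the permission."""
    for frame in reversed(call_stack):          # innermost caller first
        if "*" not in frame.permissions and needed not in frame.permissions:
            raise PermissionError(
                f"{frame.method} ({frame.principal}) lacks {needed}")

stack = [
    Frame("method1", "System", frozenset({"*"})),
    Frame("method2", "System", frozenset({"*"})),
    Frame("method3", "Untrusted", frozenset({"write:A"})),
    Frame("method4", "Untrusted", frozenset({"write:B"})),
    Frame("io.method5", "System", frozenset({"*"})),
]

try:
    check_permission(stack, "write:B")
    outcome = "allowed"
except PermissionError as err:
    outcome = str(err)
```

Requiring the permission in every frame keeps a less-privileged caller from borrowing the privileges of trusted code further down the stack.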

SLIDE 21

Garbage Collection

  • Issues with traditional malloc/free, new/delete
      – explicit memory allocation places burden on programmer
      – dangling pointer, double free errors
  • Garbage collection
      – objects with no references are garbage
      – must be collected to free up memory
          • for future object allocation
          • OS limits memory use by a process
      – eliminates programmer pointer errors

SLIDE 22

Network Friendliness

  • Support dynamic class loading on demand
      – load classes only when needed
      – spread loading over time
  • Compact instruction encoding
      – zero-address stack-based bytecode to reduce code size
      – contain significant metadata
          • maybe a slight code size win over RISC fixed-width ISAs

SLIDE 23

Java ISA

  • Formalized in class file specification
  • Includes instruction definitions (bytecodes)
  • Includes data definitions and interrelationships (metadata)

SLIDE 24

Java Architected State

  • Implied registers
      – program counter, local variable pointer, operand stack pointer, current frame pointer, constant pool base
  • Stack
      – arguments, locals, and operands
  • Heap
      – objects and arrays
      – implementation-dependent object representation
  • Class file content
      – constant pool holds immediates (and other constant information)

SLIDE 25

Data Items

  • Types are defined in specification
      – implementation free to choose representation
      – reference (pointers) and primitive (byte, int, etc.) types
  • Range of values that can be held is given
      – e.g., byte is between -128 and +127
      – data is located via
          • references; as fields of objects in heap
          • offsets using constant pool pointer, stack pointer

SLIDE 26

Data Accessing

[Figure: an instruction stream of opcodes and operands accessing the stack frame (locals, operands), the constant pool, and heap objects and arrays; accesses are labeled index or implied.]

SLIDE 27

Instruction Set

  • Bytecodes
      – single-byte opcode
      – zero or more operands
  • Can access operands from
      – instruction
      – current constant pool
      – current frame local variables
      – values on operand stack

Instruction formats:

  opcode
  opcode  index
  opcode  index1  index2
  opcode  data
  opcode  data1  data2
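
The four operand sources listed above can be illustrated with a toy interpreter. This is a sketch only: the opcode names and numeric values are invented for illustration and are not the real JVM encodings.

```python
# Toy stack machine; opcode numbers are hypothetical, not real JVM opcodes.
PUSH_IMM, LOAD_CONST, LOAD_LOCAL, STORE_LOCAL, ADD, HALT = range(6)

def run(code, constant_pool, num_locals):
    """Interpret a byte sequence: one opcode byte, then zero or more operand bytes."""
    local_vars = [0] * num_locals    # current frame's local variables
    stack = []                       # operand stack
    pc = 0
    while True:
        op = code[pc]
        if op == PUSH_IMM:           # data operand carried in the instruction
            stack.append(code[pc + 1]); pc += 2
        elif op == LOAD_CONST:       # index operand into the constant pool
            stack.append(constant_pool[code[pc + 1]]); pc += 2
        elif op == LOAD_LOCAL:       # index operand into the local variables
            stack.append(local_vars[code[pc + 1]]); pc += 2
        elif op == STORE_LOCAL:
            local_vars[code[pc + 1]] = stack.pop(); pc += 2
        elif op == ADD:              # implied operands: top two stack values
            b, a = stack.pop(), stack.pop()
            stack.append(a + b); pc += 1
        elif op == HALT:
            return local_vars
        else:
            raise ValueError(f"invalid opcode {op} at pc={pc}")

program = bytes([PUSH_IMM, 5, LOAD_CONST, 0, ADD, STORE_LOCAL, 1, HALT])
result = run(program, constant_pool=[1000], num_locals=2)
```

Values too large for a one-byte immediate (here 1000) go through the constant pool, which is one reason the pool holds immediates.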

SLIDE 28

Instruction Types

  • Pushing constants onto the stack
  • Moving local variable contents to and from the stack
  • Managing arrays
  • Generic stack instructions (dup, swap, pop & nop)
  • Arithmetic and logical instructions
  • Conversion instructions
  • Control transfer and function return
  • Manipulating object fields
  • Method invocation
  • Miscellaneous operations
  • Monitors

SLIDE 29

Stack Tracking

  • At any point in program, operand stack has
      – same number of operands
      – of same types
      – and in same order
      – regardless of the control path getting there!
  • Helps with static type checking

SLIDE 30

Stack Tracking – Example

  • Valid bytecode sequence:

            iload A             // push int. A from local mem.
            iload B             // push int. B from local mem.
            if_cmpne 0 else     // branch if B ne 0
            iload C             // push int. C from local mem.
            goto endelse
  else:     iload F             // push F
  endelse:  add                 // add from stack; result to stack
            istore D            // pop sum to D

SLIDE 31

Stack Tracking – Example

  • Invalid bytecode sequence
      – stack at skip1 depends on control-flow path

          iload B             // push int. B from local mem.
          if_cmpne 0 skip1    // branch if B ne 0
          iload C             // push int. C from local mem.
  skip1:  iload D             // push D
          iload E             // push E
          if_cmpne 0 skip2    // branch if E ne 0
          add                 // add stack; result to stack
  skip2:  istore F            // pop to F
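
The stack-tracking property can be checked mechanically. The sketch below is a simplification under stated assumptions: it tracks only stack depth (not operand types), encodes each instruction as a hypothetical `(op, branch_target_index)` pair, and models `if_cmpne 0` as popping one value. It is not the JVM's verifier.

```python
from collections import deque

# (pops, pushes) for each toy instruction; names mirror the slide examples.
EFFECT = {"iload": (0, 1), "istore": (1, 0), "add": (2, 1),
          "if_cmpne0": (1, 0), "goto": (0, 0)}

def verify_stack_depths(code):
    """Return the stack depth on entry to each reachable instruction, or raise
    ValueError if two control paths reach an instruction with different depths."""
    depth_at = {0: 0}
    worklist = deque([0])
    while worklist:
        pc = worklist.popleft()
        op, target = code[pc]
        pops, pushes = EFFECT[op]
        if depth_at[pc] < pops:
            raise ValueError(f"stack underflow at {pc}")
        out = depth_at[pc] - pops + pushes
        successors = []
        if op != "goto" and pc + 1 < len(code):
            successors.append(pc + 1)            # fall-through path
        if target is not None:
            successors.append(target)            # branch path
        for nxt in successors:
            if nxt not in depth_at:
                depth_at[nxt] = out
                worklist.append(nxt)
            elif depth_at[nxt] != out:
                raise ValueError(
                    f"inconsistent stack depth at {nxt}: {depth_at[nxt]} vs {out}")
    return depth_at

valid = [("iload", None), ("iload", None), ("if_cmpne0", 5), ("iload", None),
         ("goto", 6), ("iload", None), ("add", None), ("istore", None)]
invalid = [("iload", None), ("if_cmpne0", 3), ("iload", None), ("iload", None),
           ("iload", None), ("if_cmpne0", 7), ("add", None), ("istore", None)]

depths = verify_stack_depths(valid)
try:
    verify_stack_depths(invalid)
    invalid_error = None
except ValueError as err:
    invalid_error = str(err)
```

In the invalid sequence, index 3 corresponds to `skip1`: the branch arrives with an empty stack while the fall-through path arrives with C already pushed.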

SLIDE 32

Exception Table

  • Exceptions identified by table in class file
      – address range where checking is in effect
      – target if exception is thrown
          • operand stack is emptied
  • If no table entry in current method
      – pop stack frame and check calling method
      – default handlers at main

  From   To   Target   Type
  8      12   96       Arithmetic Exception
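
The lookup-then-unwind behavior described above might look like the following sketch. The `Entry` and `dispatch` names are hypothetical, the range is treated as half-open (From inclusive, To exclusive) as an assumption, and exception types are matched by exact name only, ignoring subclassing.

```python
from typing import NamedTuple, Optional

class Entry(NamedTuple):
    start: int      # "From": first covered address
    end: int        # "To": treated as exclusive in this sketch
    target: int     # handler address
    exc_type: str

def find_handler(table, pc, exc_type) -> Optional[int]:
    for entry in table:
        if entry.start <= pc < entry.end and entry.exc_type == exc_type:
            return entry.target
    return None

def dispatch(frames, exc_type):
    """frames: list of (exception_table, pc), innermost frame last.
    Pops frames until one has a matching entry; returns (frame_index, target)."""
    while frames:
        table, pc = frames[-1]
        target = find_handler(table, pc, exc_type)
        if target is not None:
            return len(frames) - 1, target   # operand stack would be emptied here
        frames.pop()                         # no entry: unwind to the caller
    return None                              # a default handler at main would run

caller_table = [Entry(8, 12, 96, "ArithmeticException")]
frames = [(caller_table, 10), ([], 3)]       # callee has no exception table
handled = dispatch(frames, "ArithmeticException")
```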

SLIDE 33

Binary Class Format

  • Magic number and header
  • Regions preceded by counts
      – constant pool
      – interfaces
      – field information
      – methods
      – attributes

[Figure: class file layout, in order: Magic Number, Version Information, Const. Pool Size, Constant Pool, Access Flags, This Class, Super Class, Interface Count, Interfaces, Field Count, Field Information, Methods Count, Methods, Attributes Count, Attributes.]
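
As a small illustration of the layout above, the first ten bytes of a class file can be decoded with Python's `struct` module: a 4-byte magic number (0xCAFEBABE), 2-byte minor and major versions, and the 2-byte constant pool count, all big-endian. The function name and the sample bytes are made up for this sketch.

```python
import struct

def read_class_header(data: bytes):
    # ">IHHH": big-endian u4 magic, u2 minor, u2 major, u2 constant_pool_count
    magic, minor, major, cp_count = struct.unpack(">IHHH", data[:10])
    if magic != 0xCAFEBABE:
        raise ValueError("not a class file: bad magic number")
    return {"minor": minor, "major": major, "constant_pool_count": cp_count}

# Hand-built sample header, not taken from a real class file.
sample = bytes.fromhex("cafebabe" "0000" "0034" "0010")
header = read_class_header(sample)
```

Because each later region is also preceded by its count, a loader can walk the file sequentially and detect truncation or trailing bytes.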

SLIDE 34

Java Virtual Machine

  • Abstract entity that gives meaning to class files
  • Has many concrete implementations
      – hardware
      – interpreter
      – JIT compiler
  • Persistence
      – an instance is created when an application starts
      – terminates when the application finishes

SLIDE 35

JVM Implementation

  • A typical JVM implementation consists of
      – class loader subsystem, memory subsystem, emulation/execution engine, garbage collector

[Figure: JVM implementation. Components shown: class files feeding the class loader subsystem; memory (method area, heap, Java stacks, native method stacks); garbage collector; execution engine with PCs and implied registers, exchanging addresses, data, and instructions with memory; native method interface to native method libraries.]

SLIDE 36

Class Loader

  • Functions
      – find the binary class
      – convert class data into implementation-dependent memory image
      – verify correctness and consistency of the loaded classes
  • Security checks
      – checks class magic number
      – component sizes are as indicated in class file
      – checks number/types of arguments
      – verify integrity of the bytecode program

SLIDE 37

Protection Sandbox

[Figure: type tracking across storage classes.]

  • Global memory: objects with statically defined (fixed) types
  • Local storage: declared (fixed) types
  • Operand storage: tracked types
  • Load: type determined from reference/field type
  • Store: must be to reference and field with correct types
  • Move to local storage: must be to a location with correct type
  • Move to operand storage: type determined from local storage type
  • ALU: tracked types
  • Array loads are range checked
  • Array stores are range checked

SLIDE 38

Protection Sandbox: Security Manager

  • A trusted class containing check methods
      – attached when Java program starts
      – cannot be removed or changed
  • User specifies checks to be made
      – files, types of access, etc.
  • Operation
      – native methods that involve resource accesses (e.g. I/O) first call check method(s)
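
The "check method first" pattern can be sketched as below. This is not the `java.lang.SecurityManager` API: the class, the `(action, path)` policy format, and `read_file` are hypothetical stand-ins chosen for illustration.

```python
class SecurityManager:
    """Trusted policy object; the allowed set is fixed at construction."""
    def __init__(self, allowed):
        self._allowed = frozenset(allowed)    # e.g. {("read", "/tmp/public.txt")}

    def check(self, action, path):
        if (action, path) not in self._allowed:
            raise PermissionError(f"{action} {path} denied")

# Attached once at startup; user code gets no way to replace it in this sketch.
_manager = SecurityManager({("read", "/tmp/public.txt")})

def read_file(path):
    """Stand-in for a library method that fronts a native I/O call."""
    _manager.check("read", path)              # check method runs first
    return f"<contents of {path}>"            # placeholder for the native read

allowed_result = read_file("/tmp/public.txt")
try:
    read_file("/etc/passwd")
    denied = False
except PermissionError:
    denied = True
```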

SLIDE 39

Verification

  • Class files are checked when loaded
      – to ensure security and protection
  • Internal checks
      – checks for magic number
      – checks for truncation or extra bytes
          • each component specifies a length
      – make sure components are well-formed

SLIDE 40

Verification (2)

  • Bytecode checks
      – check valid opcodes
      – perform full path analysis
          • regardless of path to an instruction, contents of operand stack must have same number and types of items
          • checks arguments of each bytecode
          • check no local variables are accessed before assigned
          • makes sure fields are assigned values of proper type

SLIDE 41

Java Native Interface (JNI)

  • Allows Java code and native code to interoperate
      – access legacy code, system calls from Java
      – access Java API from native functions
  • see figure on next slide
      – each side compiles to its own binary format
      – different Java and native stacks maintained
      – arguments can be passed; values/exceptions returned

SLIDE 42

Java Native Interface (JNI)

[Figure: Java side: Java HLL program → compile and load → bytecode methods, which access objects and arrays via getfield/putfield. Native side: C program → compile and load → native machine code, which accesses native data structures via load/store. Bytecode methods invoke native methods; native code reaches Java objects through JNI get/put.]

SLIDE 43

Garbage Collector

  • Provides implicit heap object space reclamation policy
  • Collects objects that have all their references removed or destroyed
  • Invoked at regular intervals, or when low on memory
  • see figure on next slide
      – root set points to objects in heap
      – objects not reachable from root set are garbage

SLIDE 44

Garbage Collector (2)

[Figure: a root set pointing into a global heap containing objects A through H.]

SLIDE 45

Types of Collectors

  • Reference count collectors
      – keep a count of the number of references to each object
  • Tracing collectors
      – find live objects by tracing from the root set of references

SLIDE 46

Mark and Sweep Collector

  • Basic tracing collector
      – start with root set of references
      – trace and mark all reachable objects
      – sweep through heap collecting unmarked objects
  • Advantages
      – does not require moving objects/pointers
  • Disadvantages
      – garbage objects combined into a linked list
          • leads to fragmentation
          • segregated free-lists can be used
          • consolidation of free space can improve efficiency

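
The mark and sweep phases above can be sketched over a heap modeled as a dictionary from object name to the names it references. The object graph is invented for illustration (the slides do not give the edges), and the free-list management mentioned under disadvantages is omitted.

```python
def mark_and_sweep(heap, roots):
    """heap: dict mapping object id -> list of referenced ids.
    Removes unreachable objects from heap; returns (marked, freed) sets."""
    marked = set()
    pending = list(roots)
    while pending:                       # mark phase: trace from the root set
        obj = pending.pop()
        if obj not in marked:
            marked.add(obj)
            pending.extend(heap[obj])
    freed = {obj for obj in heap if obj not in marked}
    for obj in freed:                    # sweep phase: reclaim unmarked objects
        del heap[obj]
    return marked, freed

# Hypothetical graph: D, F, H form a cycle that no root reaches.
heap = {"A": ["B", "C"], "B": [], "C": ["E"], "D": ["F"],
        "E": [], "F": ["H"], "G": [], "H": ["D"]}
live, garbage = mark_and_sweep(heap, roots=["A", "G"])
```

The unreachable cycle D → F → H → D is reclaimed here, which a pure reference-count collector would miss because each object in the cycle still has one incoming reference.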
SLIDE 47

Compacting Collector

  • Make free space contiguous
      – multiple passes through heap
      – lot of object movement
          • many pointer updates

[Figure: heap before compaction: A B C D E F G H free; after: A B C E G free.]

SLIDE 48

Copying Collector

  • Divide heap into halves
      – collect when one half full
      – copy into unused half during sweep phase
  • Reduces passes through heap
  • Wastes half the heap

[Figure: heap halves before collection (A B C D E F G H, free; unused) and after (A B C E G, free; unused).]
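
A minimal sketch of copying collection, assuming a Cheney-style scan (a common way to implement it; the slides do not name a specific algorithm). Each half of the heap is modeled as a list, references are list indices, and the object graph is hypothetical.

```python
def copy_collect(from_space, roots):
    """from_space: list of (name, [indices of referenced objects]).
    Returns (to_space, new_roots) with live objects packed contiguously."""
    forwarding = {}                      # old index -> new index
    to_space = []

    def copy(old):
        if old not in forwarding:        # copy each live object exactly once
            forwarding[old] = len(to_space)
            name, refs = from_space[old]
            to_space.append((name, list(refs)))   # refs still hold old indices
        return forwarding[old]

    new_roots = [copy(r) for r in roots]
    scan = 0
    while scan < len(to_space):          # scan pointer chases the copy frontier
        name, refs = to_space[scan]
        to_space[scan] = (name, [copy(r) for r in refs])   # rewrite to new indices
        scan += 1
    return to_space, new_roots

# Hypothetical heap: indices 0..7 hold A..H; D, F, H are unreachable.
from_space = [("A", [1, 2]), ("B", []), ("C", [4]), ("D", [5]),
              ("E", []), ("F", [7]), ("G", []), ("H", [3])]
to_space, new_roots = copy_collect(from_space, roots=[0, 6])
names = [name for name, _ in to_space]
```

Only live objects are touched, in a single pass, and the survivors end up contiguous; the cost is that the other half of the heap sits idle between collections.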

SLIDE 49

Simplifying Pointer Updates

  • Add level of indirection
      – use handle pool
      – object moves update handle pool
  • Makes every object access slower

[Figure: global heap; object references (e.g. on stack) point into the handle pool, whose entries point to objects A and B in the object pool.]
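
The handle-pool indirection can be sketched as two tables. The `HandleHeap` class and its integer "addresses" are hypothetical; the point is only that moving an object changes one handle entry, while every access pays for two lookups.

```python
class HandleHeap:
    def __init__(self):
        self.objects = {}     # object pool: address -> payload
        self.handles = {}     # handle pool: handle -> current address
        self._next_handle = 0

    def allocate(self, payload, address):
        handle = self._next_handle
        self._next_handle += 1
        self.objects[address] = payload
        self.handles[handle] = address
        return handle         # references elsewhere hold the handle, not the address

    def get(self, handle):
        return self.objects[self.handles[handle]]    # two lookups per access

    def move(self, handle, new_address):
        old_address = self.handles[handle]
        self.objects[new_address] = self.objects.pop(old_address)
        self.handles[handle] = new_address            # the only pointer to update

heap = HandleHeap()
ref_a = heap.allocate("A", address=100)
heap.move(ref_a, new_address=8)     # e.g. during compaction
value = heap.get(ref_a)
```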

SLIDE 50

Generational Collectors

  • Reduce number of objects moved during each collection cycle
  • Exploit the bi-modal distribution of object lifetimes
  • Divide heap into two sub-heaps
      – nursery, for newly created objects
      – tenured, for older objects
  • Collect a smaller portion of the heap each time

SLIDE 51

Concurrent Collectors

  • Stop-the-world collectors
      – time consuming, long pauses
      – unsuitable for real-time applications

SLIDE 52

Concurrent Collectors (2)

[Figure: two snapshots of a root set and heap objects A, B, C, D, showing references changing while collection is in progress.]

  • GC concurrently with application execution
      – partially collected heap may be unstable (see figure)
      – synchronization needed between the application (mutator) and the collector

SLIDE 53

JVM Bytecode Emulation

  • Interpretation
      – simple, fast startup, slow steady-state
  • Just-In-Time (JIT) compilation
      – compile each method on first invocation
      – simple optimizations, slow startup, fast steady-state
  • Hot-spot compilation
      – compile frequently executed code
      – can apply more aggressive optimizations
      – moderate startup, fast steady-state
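
The hot-spot policy can be sketched with a per-method invocation counter. The class name, the threshold value, and the string results are assumptions for illustration; real VMs use more elaborate profiling and compile in the background.

```python
class HotSpotRuntime:
    def __init__(self, threshold=3):
        self.threshold = threshold     # hypothetical "hot" cutoff
        self.counts = {}               # method name -> interpreted invocations
        self.compiled = set()          # methods with native code available

    def invoke(self, method):
        if method in self.compiled:
            return "compiled"          # fast path: run native code
        self.counts[method] = self.counts.get(method, 0) + 1
        if self.counts[method] >= self.threshold:
            self.compiled.add(method)  # stand-in for the optimizing compiler
        return "interpreted"           # this call still ran in the interpreter

runtime = HotSpotRuntime(threshold=3)
modes = [runtime.invoke("f") for _ in range(5)]
cold = runtime.invoke("g")
```

Rarely executed methods such as `g` never pay the compilation cost, which is where the moderate startup time comes from.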