JIT Compilation Module Overview JIT Compilation Native vs. Managed - - PowerPoint PPT Presentation



slide-1
SLIDE 1

JIT Compilation

slide-2
SLIDE 2

Module Overview

JIT Compilation

  • Native vs. Managed Compilation
  • Managed Execution Phases
  • Assembly Loading & Initialization
  • JIT Compilation
  • JIT Optimizations
  • What’s new in NGEN 4.0?
  • When to use NGEN?

2

slide-3
SLIDE 3

Running Code

Behavior in Windows 2000

  • Legacy entry point mscoree!CorExeMain gets used

Behavior in Windows XP

  • The operating system loader checks for managed modules by examining a bit in the common object file format (COFF) header
  • The bit being set denotes a managed module
  • If the loader detects managed modules, it loads mscoree.dll, and clr!CorValidateImage and clr!CorImageUnloading notify the loader when the managed module images are loaded and unloaded

clr!CorValidateImage performs the following:

  • Ensures that the code is valid managed code
  • Changes the entry point in the image to an entry point in the runtime

On 64-bit Windows, _CorValidateImage modifies the image that is in memory by transforming it from PE32 to PE32+

3

slide-4
SLIDE 4

Native compile vs Managed compile

Simplified view of native code compilation

.CPP or .C file containing C or C++ code -> Compile -> .OBJ file (machine language) -> Link -> .EXE or .DLL file (machine language)

Native Code

Type describing information Type describing information

4

slide-5
SLIDE 5

Native compile vs Managed compile

Simplified view of managed code compilation

.CS file containing C# code -> Compile -> Assembly (.EXE or .DLL) containing MSIL and metadata -> Execute -> Machine language generated in memory at runtime by the JIT compiler

Managed Code

5

slide-6
SLIDE 6

Managed Execution Phases

Phase | Input | Step | Output
Compile Time | .NET source code | Compile | IL and Metadata
Run Time | IL and Metadata | JIT (CLR) | Native Code
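
To make the compile-time output concrete, the following minimal sketch reads the IL bytes that the C# compiler stored for a method, using reflection. The class `IlPeek` and its `Add` method are hypothetical names chosen for illustration; the exact bytes printed depend on the compiler and build settings.

```csharp
using System;
using System.Reflection;

static class IlPeek
{
    // Hypothetical sample method whose IL we want to inspect.
    public static int Add(int a, int b) { return a + b; }

    // Returns the raw IL bytes stored in the assembly for a public static method of this class.
    public static byte[] GetIl(string methodName)
    {
        MethodInfo method = typeof(IlPeek).GetMethod(methodName, BindingFlags.Public | BindingFlags.Static);
        MethodBody body = method.GetMethodBody();
        return body.GetILAsByteArray();
    }

    static void Main()
    {
        byte[] il = GetIl("Add");
        Console.WriteLine("IL size in bytes: " + il.Length);
        Console.WriteLine(BitConverter.ToString(il));
    }
}
```

The last byte of the method body is the IL `ret` opcode (0x2A); everything printed here is IL, not machine code, which is only generated at run time.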

6

slide-7
SLIDE 7

JIT Compilation

What does JIT do?

Checks whether the function is being called for the first time

  • If so, the JIT compiles the IL code to native code

Stores the native code in memory

Updates the MethodDescriptor field

  • The reference is updated to point to the memory location of the native code
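
The compile-on-first-call behavior can be illustrated with a toy model. This is only a sketch of the idea, not how the CLR is implemented: `ToyJit`, `Invoke`, and `CompileCount` are hypothetical names, and a dictionary stands in for the MethodDescriptor slot that the real runtime updates.

```csharp
using System;
using System.Collections.Generic;

// Toy model only: the real CLR updates a MethodDesc, not a dictionary.
sealed class ToyJit
{
    private readonly Dictionary<string, Func<int, int>> compiled =
        new Dictionary<string, Func<int, int>>();

    // Number of times the "compiler" actually ran.
    public int CompileCount { get; private set; }

    public int Invoke(string name, Func<Func<int, int>> compile, int arg)
    {
        Func<int, int> code;
        if (!compiled.TryGetValue(name, out code))
        {
            code = compile();       // first call: "compile" the method
            compiled[name] = code;  // remember where the "native code" lives
            CompileCount++;
        }
        return code(arg);           // later calls go straight to the stored code
    }
}

static class ToyJitDemo
{
    static void Main()
    {
        var jit = new ToyJit();
        Func<Func<int, int>> compileSquare = () => x => x * x;
        Console.WriteLine(jit.Invoke("Square", compileSquare, 3));
        Console.WriteLine(jit.Invoke("Square", compileSquare, 4));
        Console.WriteLine("Compilations: " + jit.CompileCount);
    }
}
```

Calling the same method twice triggers only one compilation; the second call reuses the stored delegate.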

7

slide-8
SLIDE 8

Managed Execution: JIT Compilation

MyModule::Main -> First call?

  • Yes: CILJit::compileMethod (x86) or PreJit::compileMethod (x64) [verifies and compiles the IL] -> Native code, GCInfo, EH data, etc. -> Store native code in memory -> Store the address in MethodDesc -> Execute the native code
  • No: Retrieve address of native code from MethodDesc -> Execute the native code

8

slide-9
SLIDE 9

MethodDescriptor

  • Contains implementation of a managed method
  • Generated as part of the class loading procedure
  • Initially points to IL code
  • Can be determined during debugging: !SOS.DumpMD <MethodDesc address>

0:004> !dumpmd 009969a8
Method Name: MyApp.MainForm.menu_Click(System.Object, System.EventArgs)
Class: 00cd5c0c
MethodTable: 00996ad4
mdToken: 0600028d
Module: 00992c3c
IsJitted: no
CodeAddr: ffffffff
Transparency: Critical
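
A method that shows "IsJitted: no" has simply not been called yet. As a hedged sketch, the following asks the runtime to compile a method ahead of its first call using `RuntimeHelpers.PrepareMethod`. The class `ForceJit` and its `Work` method are hypothetical, and the address printed is an entry point that the runtime chooses; it is not guaranteed to match the CodeAddr value shown by SOS.

```csharp
using System;
using System.Reflection;
using System.Runtime.CompilerServices;

static class ForceJit
{
    // Hypothetical method that we want compiled before its first call.
    public static int Work(int x) { return x * 2; }

    // Asks the runtime to prepare (compile) the method and returns an entry-point address.
    public static IntPtr PrepareAndGetEntryPoint(MethodInfo method)
    {
        RuntimeMethodHandle handle = method.MethodHandle;
        RuntimeHelpers.PrepareMethod(handle);
        return handle.GetFunctionPointer();
    }

    static void Main()
    {
        MethodInfo work = typeof(ForceJit).GetMethod("Work");
        IntPtr entry = PrepareAndGetEntryPoint(work);
        Console.WriteLine("Entry point: 0x" + entry.ToInt64().ToString("X"));
        Console.WriteLine(Work(21));
    }
}
```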

9

slide-10
SLIDE 10

Managed Execution: Assembly Loading and Initialization

Process starts and loads the .NET Framework

  • MSCorEE.dll is loaded
  • Process main thread starts executing by calling _CorExeMain
  • Initializes the CLR
  • Reads the metadata tables
  • Builds the in-memory representation (MethodTable & EEClass)
  • ClassLoader is called
  • JIT-compiles the Main method
  • Executes Main

10

slide-11
SLIDE 11

Anatomy of a managed non-value instance (very schematic)

[Diagram; * == pointer]

  • GC Heap: the instance holds a MethodTable* followed by the field layout
  • Loader Heap: MethodTable (EEClass*, “hot” metadata, VTables), EEClass (full “cold” metadata, MethodDesc*), and MethodDesc
  • A MethodDesc holds a pointer to one of: PreJittedStub, JITted code, “IL” stub

11

slide-12
SLIDE 12

!DumpClass & !DumpMT

12

slide-13
SLIDE 13

JIT Optimizations

Summary

Types of Optimization:

  • JIT Inlining
  • JIT Tail Calls

13

slide-14
SLIDE 14

JIT Optimizations

Tail Calls

A tail call occurs when the last thing a function does is call another function

  • Calls without optimization: Call One(), Call Two(), Call Three()
  • Calls with optimization: Call One(), Call Two(), Jump Three()
  • With the optimization, the callee uses the same stack space as the caller
  • This can improve data locality, memory usage, and cache usage

static public void Main() { Helper(); }
static public void Helper() { One(); Two(); Three(); }
static public void Three() { ... }
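
What "tail position" means in source code can be seen in the following small sketch. `TailPosition`, `Sum`, and `SumTail` are hypothetical names. Whether the JIT actually turns the second form into a jump depends on the runtime version, the platform, and the build settings, so the example only shows which calls are eligible, not what a given JIT emits.

```csharp
using System;

static class TailPosition
{
    // Not a tail call: the addition still has to run after the recursive call returns.
    public static long Sum(int n)
    {
        return n == 0 ? 0 : n + Sum(n - 1);
    }

    // Tail position: the recursive call is the last thing the method does,
    // so its frame could in principle be reused by the callee.
    public static long SumTail(int n, long acc)
    {
        return n == 0 ? acc : SumTail(n - 1, acc + n);
    }

    static void Main()
    {
        Console.WriteLine(Sum(1000));        // 500500
        Console.WriteLine(SumTail(1000, 0)); // 500500
    }
}
```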

14

slide-15
SLIDE 15

JIT Optimizations

Tail Calls

Tail call feature set differs between x86 and x64

  • This can lead to, e.g., a StackOverflowException in an x86 debug build while the same code works fine on x64 (the stack is just large enough in a release build with tail calls, but overflows without the optimization)

No tail calls possible if:

  • Caller doesn't return immediately after the call
  • Stack arguments between caller and callee are incompatible in a way that would require shifting things around in the caller's frame before the callee could execute
  • Caller and callee return different types
  • We inline the call instead (inlining is way better than tail calling, and opens the door to many more optimizations)
  • Security issues
  • The debugger / profiler turned off JIT optimizations
  • For the full list, see: .NET 2.0 Tail limits and .NET 4.0 Tail Limits

15

slide-16
SLIDE 16

JIT Optimizations – Inlining

Without inlining:

class Test {
    static int And(int i1, int i2) { return i1 & i2; }
    static int i;
    static public void Main() { i = And(i, 0); }
}

With inlining:

class Test {
    static int i;
    static public void Main() {
        i = 0; // xor edx,edx
    }
}
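
The JIT decides on its own whether to inline, but a method can carry a hint. The sketch below uses the `MethodImplOptions.AggressiveInlining` flag (available from .NET Framework 4.5) and `MethodImplOptions.NoInlining`. The class `InlineHints` and the method `AndNoInline` are hypothetical; the attribute is a request, and the JIT may still decline to inline.

```csharp
using System;
using System.Runtime.CompilerServices;

static class InlineHints
{
    // Hint: ask the JIT to inline this method if it can.
    [MethodImpl(MethodImplOptions.AggressiveInlining)]
    public static int And(int i1, int i2) { return i1 & i2; }

    // Prevents inlining, which is handy when a breakpoint or a stack frame is needed.
    [MethodImpl(MethodImplOptions.NoInlining)]
    public static int AndNoInline(int i1, int i2) { return i1 & i2; }

    static void Main()
    {
        int i = 12345;
        Console.WriteLine(And(i, 0));         // 0
        Console.WriteLine(AndNoInline(i, 0)); // 0
    }
}
```

Both methods return the same value; only the generated machine code can differ.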

16

slide-17
SLIDE 17

JIT Optimizations – Inlining

Without Inlining:

Main()
    <Setup stack>
    mov ecx,dword ptr ds:[183368h]      ; set up first argument (i)
    xor edx,edx                         ; set up second argument (0)
    call dword ptr ds:[183818h]         ; call And(…) (Inline.Program+Test.And(Int32, Int32), mdToken: 06000002)
    mov dword ptr [ebp-4],eax           ; save return value
    mov eax,dword ptr [ebp-4]           ; assign result to static
    mov dword ptr ds:[00283368h],eax    ; assign result to static
    <cleanup stack>
    ret                                 ; return

And(Int32, Int32)
    <Setup stack>
    mov eax,dword ptr [ebp-4]           ; move arg 1 to eax
    and eax,dword ptr [ebp-8]           ; AND argument 2 into eax (return register)
    <cleanup stack>
    ret                                 ; return to caller

17

slide-18
SLIDE 18

JIT Optimizations – Inlining

With Inlining

  • And(.,.) is inlined now
  • No and reg,reg, because it is not needed (argument is 0)

MethodDesc Table
Entry | MethodDesc | JIT | Name
53dda7e0 | 53bb4934 | PreJIT | System.Object.ToString()
53dde2e0 | 53bb493c | PreJIT | System.Object.Equals(System.Object)
53dde1f0 | 53bb495c | PreJIT | System.Object.GetHashCode()
53e61600 | 53bb4970 | PreJIT | System.Object.Finalize()
001dc019 | 001d3828 | NONE | Inline.Program+Test..ctor()
001dc011 | 001d3810 | NONE | Inline.Program+Test.And(Int32, Int32)
00270070 | 001d381c | JIT | Inline.Program+Test.Main()

Main()
    xor edx,edx                         ; generate final result
    mov dword ptr ds:[1D3368h],edx      ; move result to static
    ret                                 ; return

18

slide-19
SLIDE 19

Demo: JIT Compilation

!dumpmt -md
bp cmdStartJit_Click

slide-20
SLIDE 20

JIT Optimizations

Additional Config

Instruct the CLR not to optimize the code during JIT compilation, without recompiling the DLL:

  • Use an .ini file (and symbols)
  • MyDll.ini:

[.NET Framework Debugging Control]
GenerateTrackingInfo=1
AllowOptimize=0

  • GenerateTrackingInfo=1 is the default as of .NET 2.0
  • Usable for GAC assemblies as well

Instruct the CLR to ignore the (optimized) NGen image

  • Use an environment variable: set COMPLUS_ZapDisable=1

20

slide-21
SLIDE 21

JIT Performance Counters - % Time in Jit

% elapsed time in JIT compilation since JIT started

Updated at the end of every JIT compilation phase. A JIT compilation phase occurs when a method and its dependencies are compiled.

A value > 5% can indicate a problem

  • Is Ngen an option?
  • http://msdn.microsoft.com/en-us/magazine/cc163610.aspx
  • Do you use multiple AppDomains?
  • loading assemblies as domain neutral can help
  • Minimize the classes and assemblies within code path
  • Use code coverage to determine these components.
  • See .NET Framework Usage Performance Rules/DA0009
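
The counter can also be read from code. The following is a minimal sketch for the .NET Framework on Windows, using the `System.Diagnostics.PerformanceCounter` class with the ".NET CLR Jit" category. It assumes that the counter instance name equals the process name, which may not hold when several processes share a name; `JitTimeCheck` and `LooksSuspicious` are hypothetical names, and the 5% threshold is the rule of thumb from the slide.

```csharp
using System;
using System.Diagnostics;

static class JitTimeCheck
{
    // Rule of thumb from the slide: more than 5% can indicate a problem.
    public static bool LooksSuspicious(float percentTimeInJit)
    {
        return percentTimeInJit > 5.0f;
    }

    static void Main()
    {
        // Assumption: the counter instance is named after the process.
        string instance = Process.GetCurrentProcess().ProcessName;
        using (var counter = new PerformanceCounter(".NET CLR Jit", "% Time in Jit", instance, true))
        {
            float value = counter.NextValue();
            Console.WriteLine("% Time in Jit: " + value);
            Console.WriteLine("Suspicious: " + LooksSuspicious(value));
        }
    }
}
```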

21

slide-22
SLIDE 22

JIT Performance Counters - summary

Performance counter | Description
# of IL Bytes JITted | Displays the total number of Microsoft intermediate language (MSIL) bytes compiled by the just-in-time (JIT) compiler since the application started. This counter is equivalent to the Total # of IL Bytes Jitted counter.
# of IL Methods JITted | Displays the total number of methods JIT-compiled since the application started. This counter does not include pre-JIT-compiled methods.
% Time in Jit | Displays the percentage of elapsed time spent in JIT compilation since the last JIT compilation phase. This counter is updated at the end of every JIT compilation phase. A JIT compilation phase occurs when a method and its dependencies are compiled.
IL Bytes Jitted / sec | Displays the number of MSIL bytes that are JIT-compiled per second. This counter is not an average over time; it displays the difference between the values observed in the last two samples divided by the duration of the sample interval.
Standard Jit Failures | Displays the peak number of methods the JIT compiler has failed to compile since the application started. This failure can occur if the MSIL cannot be verified or if there is an internal error in the JIT compiler.
Total # of IL Bytes Jitted | Displays the total MSIL bytes JIT-compiled since the application started. This counter is equivalent to the # of IL Bytes Jitted counter.

22


slide-25
SLIDE 25

.NET 4.5: Enabling Multi-Core Background JIT

Call the System.Runtime.ProfileOptimization methods at application startup:

  • SetProfileRoot: sets the application profile path
  • StartProfile: starts JIT on multicore systems and starts the process of recording current method use, which later overwrites the specified profile file

25

slide-26
SLIDE 26

.NET 4.5: Multi-Core Background JIT

ProfileOptimization.StartProfile("MyApp.Scenario1");
…
ProfileOptimization.StartProfile("MyApp.Scenario2");
…
ProfileOptimization.StartProfile("MyApp.Scenario3");

http://support.microsoft.com/kb/2715214/en-us
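
A minimal startup sketch, assuming .NET Framework 4.5 or later: the profile root folder must be set before a profile is started. The helper `EnableBackgroundJit`, the class `StartupProfile`, and the temp-folder location are hypothetical choices for illustration; only `ProfileOptimization.SetProfileRoot` and `ProfileOptimization.StartProfile` are framework APIs.

```csharp
using System;
using System.IO;
using System.Runtime;

static class StartupProfile
{
    // Hypothetical helper: enables profiling in the given folder and starts the named profile.
    public static void EnableBackgroundJit(string profileRoot, string profileName)
    {
        Directory.CreateDirectory(profileRoot);          // make sure the folder exists
        ProfileOptimization.SetProfileRoot(profileRoot); // where profile files are stored
        ProfileOptimization.StartProfile(profileName);   // replay an earlier profile, record a new one
    }

    static void Main()
    {
        // Assumption: a per-user temp folder is an acceptable place for the profile.
        string root = Path.Combine(Path.GetTempPath(), "MyAppProfiles");
        EnableBackgroundJit(root, "MyApp.Scenario1");
        Console.WriteLine("Profile root: " + root);
    }
}
```

On the first run there is no profile to replay, so any benefit would only show up on later runs.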

26

slide-27
SLIDE 27

.NET 4.6 - RyuJIT

Better throughput (25% for Bing)

Disable:

<configuration>
  <runtime>
    <useLegacyJit enabled="1"/>
  </runtime>
</configuration>

27

slide-28
SLIDE 28

Demo: JIT Optimizations

AllowOptimize

slide-29
SLIDE 29

ETW Architecture

[Diagram: Controllers, Providers, Consumers, and Log Files connected through Event Tracing Sessions (Session 1, Session 2, … Session 64); events flow from the providers through the sessions]

29

slide-30
SLIDE 30

JIT ETW tracing in .NET Framework 4

Gives information about inlining or tail-calling a certain method

  • Register .NET events:

wevtutil im ….\v4.0.21006\clr-etw.man

  • Start ETW tracing (JIT events on: 1010):

xperf -on base
xperf -start Jit -on e13c0d23-ccbc-4e12-931b-d9cc2eee27e4:0x1010:5 -f JIT.etl

<Start of my Application>

  • Stop ETW tracing and view the trace with Xperf:

xperf -stop Jit
xperf -d base.etl
xperf -merge Jit.etl base.etl merge.etl
wpa merge.etl

30

slide-31
SLIDE 31

.NET ETW Events - summary

Runtime keyword name | Value | Purpose
GCKeyword | 0x00000001 | Enables the collection of garbage collection events.
LoaderKeyword | 0x00000008 | Enables the collection of loader events.
JITKeyword | 0x00000010 | Enables the collection of just-in-time (JIT) events.
NGenKeyword | 0x00000020 | Enables the collection of events for native image methods (methods processed by the Native Image Generator, Ngen.exe); used with StartEnumerationKeyword and EndEnumerationKeyword.
StartEnumerationKeyword | 0x00000040 | Enables the enumeration of all the methods in the runtime; used in conjunction with NGenKeyword.
EndEnumerationKeyword | 0x00000080 | Enables the enumeration of all the methods destroyed in the runtime; used in conjunction with JITKeyword and NGenKeyword.
SecurityKeyword | 0x00000400 | Enables the collection of security events.
AppDomainResourceManagementKeyword | 0x00000800 | Enables the collection of resource monitoring events at an application domain level.
JITTracingKeyword | 0x00001000 | Enables the collection of JIT tracing events.
InteropKeyword | 0x00002000 | Enables the collection of interop events.
ContentionKeyword | 0x00004000 | Enables the collection of contention events.
ExceptionKeyword | 0x00008000 | Enables the collection of exception events.
ThreadingKeyword | 0x00010000 | Enables the collection of threadpool events.
StackKeyword | 0x40000000 | Enables the collection of CLR stack trace events.
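
The keyword values are bit flags that are OR-ed together to form the mask passed to the trace session. The sketch below shows how the 0x1010 mask used for JIT tracing is composed from JITKeyword and JITTracingKeyword. The enum `ClrKeywords` and the helper `ToHex` are hypothetical names; the numeric values are taken from the table.

```csharp
using System;

// Values copied from the keyword table; the enum itself is illustrative.
[Flags]
enum ClrKeywords : long
{
    GC = 0x00000001,
    Loader = 0x00000008,
    Jit = 0x00000010,
    NGen = 0x00000020,
    StartEnumeration = 0x00000040,
    EndEnumeration = 0x00000080,
    Security = 0x00000400,
    AppDomainResourceManagement = 0x00000800,
    JitTracing = 0x00001000,
    Interop = 0x00002000,
    Contention = 0x00004000,
    Exception = 0x00008000,
    Threading = 0x00010000,
    Stack = 0x40000000
}

static class KeywordMask
{
    // Formats a keyword combination the way it is written on the xperf command line.
    public static string ToHex(ClrKeywords keywords)
    {
        return "0x" + ((long)keywords).ToString("X");
    }

    static void Main()
    {
        ClrKeywords mask = ClrKeywords.Jit | ClrKeywords.JitTracing;
        Console.WriteLine(ToHex(mask)); // 0x1010
    }
}
```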

31

slide-32
SLIDE 32

Demo: JIT ETW Events

Wevtutil im CLR-ETW.man xperf

slide-33
SLIDE 33

.NET Decompilation

Managed decompilers

  • Output source code and IL
  • Source output available in multiple languages

Obfuscation can make it more difficult to decompile source code

Many different options

33

slide-34
SLIDE 34

PreJit / NGEN

.NET 2.0 - 3.5

  • Ngen calls a service (running as LocalSystem), which compiles the image in the background
  • Supports compilation of all dependent DLLs and update functionality
  • Metadata is now included in the created DLL/EXE

Ngen MyNiceExe.EXE -> mscorsvw.exe (JIT) -> C:\WINDOWS\assembly\NativeImages_v2.0.50215_32\MyExe\MyExe.exe

34

slide-35
SLIDE 35

NGEN 4.0

  • Side-by-side support
  • Ngen.exe now compiles assemblies with full trust; CAS policy is no longer evaluated
  • Native images that are generated with Ngen.exe can no longer be loaded into applications that are running in partial trust
  • Located in folder Framework\v4.0.xxxxx
  • Supports .NET 4.0 and .NET 2.0 assemblies

Generates 2.0 image:

  • ngen.exe install <2.0 assembly>

Generates 4.0 image:

  • ngen.exe install <2.0 assembly> /ExeConfig:<Path to a 4.0 EXE>

OR

  • ngen.exe install <2.0 EXE with a config file that indicates 4.0 as the preferred runtime>

35

slide-36
SLIDE 36

NGEN 4.0

Target Patching

In .NET 2 - 3.5, if assembly Y depends on X, the CLR re-NGENs Y for any change in X, because

  • Y may inline methods from X
  • Y may use fields in X’s classes (layout of classes might change)
  • Y may derive from X’s classes (layout of classes might change)

BUT ~half of changes only modify bodies of large methods

  • Large methods are not inlined cross-assembly
  • No need to re-NGEN if only function bodies changed (unless a function prototype changed)
  • Works great for QFEs and GDRs (small security fixes)
  • Unlikely to work for a service pack

36

slide-37
SLIDE 37

NGEN 4.0

Prioritization of NGEN

Priority 1 images compile on all cores

  • ngen.exe install /queue:1 <MyImportantAssembly#1>
  • ngen.exe install /queue:1 <MyImportantAssembly#2>

Priority 3 images compile at idle time

  • ngen.exe install /queue:3 <MyAssembly#N+1>

37

slide-38
SLIDE 38

NGEN 4.5

  • Starting with Windows 8 and .NET Framework 4.5, native images are created automatically by the Auto NGen Maintenance Task
  • Images are created based on “Assembly Usage Logs” that the application writes to the AppData Windows directory
  • The Auto NGen Maintenance Task is based on Automatic Maintenance, which runs in the background when the machine is idle
  • The Auto NGen Maintenance Task also reclaims native images that are no longer in use

38

slide-39
SLIDE 39

NGEN 4.5 Notes

  • The assembly must target the .NET Framework 4.5
  • Auto NGen runs only on Windows 8 and above
  • For desktop apps, Auto NGen applies only to GAC assemblies
  • For Modern Style apps, Auto NGen applies to all assemblies
  • Auto NGen will not remove unused rooted native images (images NGened by developers)

39

slide-40
SLIDE 40

.NET 4.5: Managed Profile Guided Optimization

IL assembly -> MPGO -> IL assembly with embedded training profile -> NGEN -> Optimized precompiled native image

MPGO co-locates frequently used image data within a native image. This:

  • Reduces the number of pages loaded from disk (fewer page faults)
  • Reduces the number of copy-on-write pages
  • Improves startup time and memory usage (all apps)

40

slide-41
SLIDE 41

.NET 4.5: How to MPGO

1. Run the MPGO tool (as an administrator) with the necessary parameters. The optimized IL assemblies are created in the C:\Optimized folder.

MPGO -scenario MyLargeApp.exe -AssemblyList *.* -OutDir C:\Optimized\

2. Run the NGen tool (as an administrator) with the necessary parameters for each application DLL:

NGEN.exe c:\Optimized\myLargeApp.exe

slide-42
SLIDE 42

.NET 4.6 - Ngen

Better throughput (25% for Bing)

Disable for specific assemblies:

<configuration>
  <runtime>
    <disableNativeImageLoad>
      <assemblyIdentity name="assembly_one" />
      <assemblyIdentity name="assembly_two" />
    </disableNativeImageLoad>
  </runtime>
</configuration>

42

slide-43
SLIDE 43

When to use Ngen?

  • If a DLL is used in several loaded processes
  • If multiple instances of the application are started, e.g. on Terminal Server
  • Be sure to set the base address of your assemblies correctly
  • Rebasing a DLL during load impacts performance and prevents sharing the image

43

slide-44
SLIDE 44

ASLR

/DYNAMICBASE

Address Space Layout Randomization

  • Available from Windows Vista and Windows Server 2008 onward
  • Comes with .NET 3.5 SP1
  • C++: use DynamicBase

Rebasing

  • Done within the kernel
  • Pages still shareable
  • Backed by the image, not the page file
  • Base address is no longer an issue
  • Rebuild your app with .NET 3.5 SP1!

44

slide-45
SLIDE 45

Review

  • 1. What is the benefit of using NGEN?
  • 2. What is tail call optimization?
  • 3. What tool can be used for viewing ETW files?

45

slide-46
SLIDE 46

Reference

.NET Code Generation Blog

  • http://blogs.msdn.com/b/clrcodegeneration/

The Performance Benefits of NGen.

  • http://msdn.microsoft.com/en-us/magazine/cc163610.aspx

JIT ETW tracing in .NET Framework 4

  • http://blogs.msdn.com/b/clrcodegeneration/archive/2009/05/11/jit-etw-tracing-in-net-framework-4.aspx

46