Leveraging MPST in Linux with Application Guidance to Achieve Power - - PowerPoint PPT Presentation



slide-1
SLIDE 1

Leveraging MPST in Linux with Application Guidance to Achieve Power and Performance Goals

Michael R. Jantz1, Kshitij A. Doshi2, Prasad A. Kulkarni1, and Heechul Yun1

1 University of Kansas, Lawrence, Kansas 2 Intel Corporation, Chandler, Arizona

1

slide-2
SLIDE 2

Introduction

  • Memory has become a significant player in power and performance
  • Memory power management is challenging
  • Propose a collaborative approach between applications, operating system, and hardware:
    – Applications: insert instructions to communicate memory usage intent to the OS
    – OS: re-architect memory management to interpret application intent and manage memory over hardware units
    – Hardware: communicate hardware layout to the OS to guide memory management decisions

  • Implemented framework by re-architecting recent Linux kernel
  • Experimental evaluation with industrial-grade JVM

2

slide-3
SLIDE 3

Why

  • CPU and memory are the most significant players for power and performance
    – In servers, memory accounts for 40% of total power [1]
  • Applications can direct CPU usage
    – Threads may be affinitized to individual cores or migrated between cores
    – Threads can be prioritized for task deadlines (with nice)
    – Individual cores may be turned off when unused

  • Surprisingly, much of this flexibility does not exist for controlling memory

3

slide-4
SLIDE 4

Example Scenario

  • System running a database workload with 512GB DRAM
    – All memory in use, but only 2% of pages are accessed frequently
    – CPU utilization is low
  • How to reduce power consumption?

4

slide-5
SLIDE 5

Challenges in Managing Memory Power

  • Memory refs. have temporal and spatial variation
  • At least two levels of virtualization:
    – Virtual memory abstracts away application-level info
    – Physical memory viewed as single, contiguous array of storage
  • No way for agents to cooperate with the OS and with each other

  • Lack of a tuning methodology

5

slide-6
SLIDE 6

A Collaborative Approach

  • Our approach: enable applications to guide mem. mgmt.
  • Requires collaboration between the application, OS, and hardware:
    – Interface for communicating application intent to OS
    – Ability to keep track of which memory hardware units host which physical pages during memory mgmt.
  • To achieve this, we propose the following abstractions:
    – Colors
    – Trays

6

slide-7
SLIDE 7

Communicating Application Intent with Colors

  • Color = a hint for how pages will be used
    – Colors applied to sets of virtual pages that are alike
    – Attributes associated with each color
  • Attributes express different types of distinctions:
    – Hot and cold pages (frequency of access)
    – Pages belonging to data structures with different usage patterns
  • Allow applications to remain agnostic to lower level details of mem. mgmt.

7

[Figure: diagram with the labels Software Intent, Color, Tray, and Memory Allocation and Freeing]

slide-8
SLIDE 8

Power-Manageable Units Represented as Trays

  • Tray = software structure containing sets of pages that constitute a power-manageable unit
  • Requires mapping from physical addresses to power-manageable units
  • ACPI 5.0 memory power state table (MPST):
    – Phys. address ranges --> mem. hardware units
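The address-to-unit mapping described above can be illustrated with a small lookup table. This is only a sketch under stated assumptions: it does not parse a real ACPI MPST, and the `Range` type, the `unit_for` helper, and the layout of four back-to-back 4 GiB units are all made up for illustration.

```python
from bisect import bisect_right
from typing import NamedTuple, Optional

class Range(NamedTuple):
    start: int   # first physical address covered (inclusive)
    end: int     # one past the last physical address covered
    unit: int    # hypothetical ID of the power-manageable unit

GIB = 1 << 30

# Made-up layout: four 4 GiB units laid out back to back.
RANGES = [Range(i * 4 * GIB, (i + 1) * 4 * GIB, i) for i in range(4)]
STARTS = [r.start for r in RANGES]

def unit_for(phys_addr: int) -> Optional[int]:
    """Return the unit that hosts phys_addr, or None if no range covers it."""
    i = bisect_right(STARTS, phys_addr) - 1   # last range starting at or before phys_addr
    if i >= 0 and phys_addr < RANGES[i].end:
        return RANGES[i].unit
    return None
```

A tray could then be built by grouping every physical page whose address resolves to the same unit.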


8

slide-9
SLIDE 9

Coloring Example

  • Application with two distinct sets of memory
    – Large set of infrequently accessed (cold) memory
    – Small set of frequently accessed (hot) memory
  • Specify guidance as a set of standard intents
    – MEM-INTENSITY (hot or cold)
    – MEM-CAPACITY (% of dynamic RSS)
  • Intents enable OS to manage mem. more efficiently
    – Save power by co-locating hot / cold memory
    – Recycle large span of cold pages more aggressively

9

slide-10
SLIDE 10

Configuration File to Specify Intents

    # Specification for frequency of reference:
    INTENT MEM-INTENSITY

    # Specification for containing total spread:
    INTENT MEM-CAPACITY

    # Mapping to a set of colors:
    MEM-INTENSITY RED 0    // hot pages
    MEM-CAPACITY RED 5     // hint - 5% of RSS
    MEM-INTENSITY BLUE 1   // cold pages
    MEM-CAPACITY BLUE 3    // hint - 3% of RSS

  • Associate colors with intents in configuration files
  • The config file is parsed to create and structure the data passed to the OS
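As a rough illustration of the parsing step, the sketch below reads a config in the format shown on this slide and builds a per-color table of intent values. The grammar is inferred from the single example above, the rule that an intent must be declared before it is used is an assumption, and `parse_intents` is a hypothetical helper, not part of the framework.

```python
CONFIG = """\
# Specification for frequency of reference:
INTENT MEM-INTENSITY
# Specification for containing total spread:
INTENT MEM-CAPACITY
# Mapping to a set of colors:
MEM-INTENSITY RED 0    // hot pages
MEM-CAPACITY RED 5     // hint - 5% of RSS
MEM-INTENSITY BLUE 1   // cold pages
MEM-CAPACITY BLUE 3    // hint - 3% of RSS
"""

def parse_intents(text):
    """Return {color: {intent: value}} for a config in the format above."""
    declared = set()   # intent names introduced by INTENT lines
    colors = {}
    for raw in text.splitlines():
        line = raw.split("//", 1)[0].strip()    # drop trailing // comments
        if not line or line.startswith("#"):    # skip blanks and # comments
            continue
        fields = line.split()
        if fields[0] == "INTENT" and len(fields) == 2:
            declared.add(fields[1])
        elif len(fields) == 3 and fields[0] in declared:
            intent, color, value = fields
            colors.setdefault(color, {})[intent] = int(value)
        else:
            raise ValueError(f"unrecognized line: {raw!r}")
    return colors

attrs = parse_intents(CONFIG)
```

The resulting dictionary is one plausible shape for the data handed to the OS; the actual structure used by the framework is not shown in the slides.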

slide-11
SLIDE 11

Memory Coloring System Calls

System Call        Arguments            Description
mcolor             addr, size, color    Applies color to a virtual address range of length size starting at addr
get_addr_mcolor    addr, *color         Returns the current color of the virtual address addr
set_mcolor_attr    color, *attr         Associates the attribute pointed to by attr with color
get_mcolor_attr    color, *attr         Returns the attribute currently associated with color

  • Specify colors / intents using system calls
  • Use mcolor, set_mcolor_attr to color application pages
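The semantics in the table can be modelled in user space to show how the four calls fit together. This is a toy stand-in, not the kernel interface: the real calls exist only in the authors' modified kernel and return results through pointers, whereas the hypothetical `ColorTable` class below returns values directly. The 4 kB page size, the page-granular coloring, and the color and attribute values are assumptions.

```python
PAGE_SIZE = 4096  # assumed page size

class ColorTable:
    """User-space stand-in for per-process coloring state (illustrative only)."""

    def __init__(self):
        self.page_color = {}   # virtual page number -> color
        self.color_attr = {}   # color -> attribute dictionary

    def mcolor(self, addr, size, color):
        """Apply color to every page overlapping [addr, addr + size)."""
        if size <= 0:
            raise ValueError("size must be positive")
        first = addr // PAGE_SIZE
        last = (addr + size - 1) // PAGE_SIZE
        for vpn in range(first, last + 1):
            self.page_color[vpn] = color

    def get_addr_mcolor(self, addr):
        """Return the color of the page containing addr, or None if uncolored."""
        return self.page_color.get(addr // PAGE_SIZE)

    def set_mcolor_attr(self, color, attr):
        """Associate a copy of attr with color."""
        self.color_attr[color] = dict(attr)

    def get_mcolor_attr(self, color):
        """Return the attribute associated with color, or None if unset."""
        return self.color_attr.get(color)

# Usage: describe the RED color, then color three pages with it.
RED, BLUE = 0, 1
table = ColorTable()
table.set_mcolor_attr(RED, {"MEM-INTENSITY": 0, "MEM-CAPACITY": 5})
table.mcolor(0x10000, 3 * PAGE_SIZE, RED)
```

In this model an application sets attributes once per color and then colors address ranges as it allocates them, which mirrors the mcolor and set_mcolor_attr usage named above.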
slide-12
SLIDE 12

Memory Management in Linux

  • Default Linux kernel organizes physical memory hierarchically
    – Nodes --> zones --> lists of physical pages (free lists, LRU lists)
  • The kernel distinguishes pages on different nodes, but not pages on different ranks

[Figure: Memory management in the default Linux kernel]

slide-13
SLIDE 13

Tray Implementation


  • Trays exist as a division between zones and physical pages
  • Each tray corresponds to a rank, maintains its own lists of pages
  • Kernel memory mgmt. routines modified to operate over trays
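The tray level of the hierarchy can be sketched as a data structure in which each tray carries its own free and LRU lists and the zone allocator walks trays instead of a single zone-wide free list. The `Tray` and `Zone` classes, the frame numbers, and the "preferred rank first, then the remaining trays in order" policy are assumptions made for illustration; the kernel implementation is not shown in the slides.

```python
from collections import deque
from dataclasses import dataclass, field

@dataclass
class Tray:
    rank: int                                     # rank this tray corresponds to
    free: deque = field(default_factory=deque)    # free page frame numbers
    lru: deque = field(default_factory=deque)     # in-use frames, oldest first

@dataclass
class Zone:
    name: str
    trays: list

    def alloc_page(self, preferred_rank=None):
        """Take a free frame, trying the preferred rank's tray first."""
        # Stable sort: the preferred tray moves to the front, others keep their order.
        ordered = sorted(self.trays, key=lambda t: t.rank != preferred_rank)
        for tray in ordered:
            if tray.free:
                pfn = tray.free.popleft()
                tray.lru.append(pfn)
                return tray.rank, pfn
        return None  # every tray in the zone is exhausted

# Toy zone: four trays (ranks 0 to 3) with four free frames each.
zone = Zone("Normal", [Tray(rank=r, free=deque(range(r * 4, r * 4 + 4)))
                       for r in range(4)])
```

Keeping the lists per tray is what lets an allocator fill or drain one rank at a time, which a single zone-wide list cannot express.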

[Figure: Memory management with tray structures in our modified Linux kernel. Operating system side: NUMA nodes 0 and 1 contain zones (DMA, Normal), each zone is divided into trays (Tray 0 to Tray 7 in total), and each tray keeps its own free and LRU lists. Memory hardware side: each NUMA node has a memory controller with channels 0 and 1, hosting ranks 0-1 and ranks 2-3 respectively.]

slide-14
SLIDE 14

Evaluation

  • Emulating NUMA APIs
  • Enabling power consumption proportional to the active footprint

14

slide-15
SLIDE 15

Emulating NUMA APIs

  • Modern server systems include an API for managing memory over NUMA nodes
  • Our goal: demonstrate that the framework is flexible and efficient enough to emulate NUMA API functionality
  • Experimental setup
    – Oracle’s HotSpot JVM includes an optimization to improve DRAM access locality (implemented w/ NUMA APIs)
    – Modified HotSpot to control memory placement using mem. coloring
    – Compare performance with the default configuration and with the optimization implemented w/ NUMA APIs and w/ memory coloring

15

slide-16
SLIDE 16

Memory Coloring Emulates the NUMA API

[Figure: Performance of NUMA optim. relative to default (y-axis) for each benchmark (x-axis). Series: NUMA API, mem. color API]

  • Performance of SciMark 2.0 benchmarks with “NUMA-optimized” HotSpot implemented with (1) NUMA APIs and (2) memory coloring framework
  • Performance is similar for both implementations
slide-17
SLIDE 17

Memory Coloring Emulates the NUMA API

[Figure: % memory reads satisfied by local DRAM (y-axis) for each benchmark (x-axis). Series: default, NUMA API, mem. color API]

  • % of memory reads satisfied by NUMA-local DRAM for SciMark 2.0 benchmarks with each HotSpot configuration
  • Performance with each implementation is (again) roughly the same
slide-18
SLIDE 18

Enabling Power Consumption Proportional to the Active Footprint

  • Our goal: demonstrate potential of our custom kernel to reduce power in memory
  • Experimental setup:
    – Custom workload that incrementally increases memory usage in 2GB steps
    – Compare three configurations on single node of server machine with 16GB of RAM
      • Default kernel with physical address interleaving
      • Default kernel with no interleaving
      • Custom kernel with tray-based allocation

slide-19
SLIDE 19

Enabling Power Consumption Proportional to the Active Footprint

[Figure: Avg. DRAM power consumption (in W) vs. memory activated by scale_mem (in GB). Series: default kernel (interleaving enabled), default kernel (interleaving disabled), power efficient custom kernel]

  • Default kernel yields high power consumption even with small footprint
  • Custom kernel: tray-based allocation enables power consumption proportional to the active footprint
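One way to see why tray-based allocation makes power track the footprint is to count how many ranks hold active pages as the workload grows. The sketch below is a simplified model, not the measured experiment: the 4 ranks of 4 GB each are an assumed split of the 16GB node, the interleaved case is idealized as touching every rank for any nonzero footprint, and rank count is only a proxy for DRAM power.

```python
import math

RANKS = 4      # assumed number of ranks on the node
RANK_GB = 4    # assumed capacity per rank (4 x 4 GB = 16 GB)

def active_ranks_interleaved(footprint_gb):
    """Idealized interleaving: any nonzero footprint touches every rank."""
    return RANKS if footprint_gb > 0 else 0

def active_ranks_trays(footprint_gb):
    """Filling one tray (rank) at a time touches only as many ranks as needed."""
    return min(RANKS, math.ceil(footprint_gb / RANK_GB))

if __name__ == "__main__":
    # 2 GB steps, as in the experimental setup.
    for gb in range(2, 17, 2):
        print(gb, active_ranks_interleaved(gb), active_ranks_trays(gb))
```

Under these assumptions the tray-based count rises in steps with the footprint, while the interleaved count stays at the maximum, so ranks left untouched are the ones that could sit in a low-power state.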

slide-20
SLIDE 20

Future Improvements

  • Problems:
    – Little understanding of which colors or coloring hints will be most useful for existing workloads
    – All colors and hints must be manually inserted
  • Developing a set of tools to profile, analyze and control memory usage for applications
  • Capabilities we are working on:
    – Detailed memory usage feedback over colored regions
    – On-line techniques to adapt guidance to feedback
    – Compiler / runtime integration to automatically partition and color address space based on profiles of memory usage activity

20

slide-21
SLIDE 21

Conclusion

  • A critical first step in meeting the need for a fine-grained, power-aware flexible provisioning of memory
  • Initial implementation demonstrates value
    – But there is much more to be done

  • Questions?

21

slide-22
SLIDE 22

References

1. C. Lefurgy, K. Rajamani, F. Rawson, W. Felter, M. Kistler, and T. W. Keller. Energy management for commercial servers. Computer, 36(12):39–48, Dec. 2003.

22

slide-23
SLIDE 23

Backup

23

slide-24
SLIDE 24

Default Linux Kernel

[Figure: Application pages of different types (frequently referenced, infrequently referenced) and the ranks of the node’s memory]

Problem: the operating system does not see a distinction between:

  • different types of pages from the application
  • different units of memory that can be independently power managed

24

slide-25
SLIDE 25

Custom Kernel with Memory Containerization

[Figure: Application pages of different types (frequently referenced, infrequently referenced) and the node’s memory. Labels: Self refresh (idle) state, More power management, Less power management]

Note: not drawn to scale; 10^6 4kB pages can be contained in a 4GB DIMM

25

slide-26
SLIDE 26

Analysis to Automatically Generate Memory Coloring Hints

  • Advantages to memory coloring:
    – Broad spectrum of hints can be overlapped
    – Hints can adapt to changes in the system
  • Specific tasks
    – Build post-processing to search profiling data for regions to color
    – Construct analysis to relate objects that should be colored to source code
    – Manually insert coloring hints into application to apply ideal guidance and evaluate its impact

slide-27
SLIDE 27

Novel System Tools

  • Memory usage statistics over colored regions
    – Similar to /proc tools that enable users to query system-wide or per-application memory usage
    – Example: monitor page faults over a particular data structure
    – Will further improve memory usage guidance
  • Monitoring memory usage over trays
    – Benefits applications such as whole-system virtualization
    – Provide user-level access to trays through /proc

27

slide-28
SLIDE 28

More Workloads and Usage Scenarios

  • Evaluate approach with complex, multi-tier workloads at the realistic scale of server systems
    – Potential applications: open source database, web server, J2EE software packages
  • Explore maximizing performance by distributing high-value data widely across memory channels
  • Hints for expected access patterns
    – Application-guided read-ahead or fault-ahead for structures with expected sequential access
  • Different page recycling policies for trays

  • Different page recycling policies for trays

28