Dynamic Large Pages
Dave Hansen
IBM Linux Technology Center

Why Large Pages?
• Fewer objects to manage
• Fit more objects in CPU caches
• Per-page operations become cheaper
• Per-page structures become “smaller”
• Any cache miss is increasingly expensive
• They are “special” on Linux
The Linux Foundation Confidential

“Old” Workloads
• Performance is critical
• Grew out of HPC/DB space
• Large memory footprint
• Page-level handling (faults, etc.)
• Willing to work around usability
• mlock() tolerance
• “Custom” applications

State of the Art
• Interfaces: fs / SHM / libhugetlbfs
• Faulting replaces preallocation
• COW support for private at fork()
• Reservations
• Quota support
• NUMA policy awareness
• Lumpy reclaim
• Gigantic pages

Linux VM Support
• NUMA
• Delayed allocation: better locality
• Round-robin pool population
• Deterministic COW support
• Needed for MAP_PRIVATE support
• MAP_PRIVATE needed for transparent replacement
• Lumpy reclaim

Admin Interfaces
• Display
  – /proc/meminfo (incomplete)
  – /sys/kernel/mm/hugepages
• Configuration
  – hugepages=
  – /proc/sys/vm/nr_hugepages (r/w)
  – Not 100% dependable
  – kernelcore=
• hugeadm – wrapper for all of these
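
The huge page counters in /proc/meminfo are easy to read from a script. The sketch below is a minimal illustration, not a tool from the talk: `parse_hugepage_info` is a hypothetical helper name and the sample text is invented. Note that these fields describe only one huge page size, which may be part of why the slide calls this display incomplete.

```python
def parse_hugepage_info(meminfo_text):
    """Return the huge page fields of /proc/meminfo-style text as a dict of ints.

    Counters (HugePages_*) are page counts; Hugepagesize is reported in kB.
    """
    info = {}
    for line in meminfo_text.splitlines():
        key, sep, rest = line.partition(":")
        if not sep:
            continue
        key = key.strip()
        if key.startswith("HugePages_") or key == "Hugepagesize":
            # The value is the first token; a trailing "kB" unit is ignored.
            info[key] = int(rest.split()[0])
    return info


# Invented sample in the format the kernel uses; real values will differ.
SAMPLE = """\
MemTotal:       16303428 kB
HugePages_Total:      64
HugePages_Free:       48
HugePages_Rsvd:        8
HugePages_Surp:        0
Hugepagesize:       2048 kB
"""

info = parse_hugepage_info(SAMPLE)
in_use = info["HugePages_Total"] - info["HugePages_Free"]
print(in_use, "huge pages in use,", in_use * info["Hugepagesize"], "kB")
```

On a live Linux system the same function can be fed `open("/proc/meminfo").read()` instead of the sample string.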

Multiple HW Sizes
• ppc64: 4k, 64k, 16M, 16GB*
• x86: 4k, 2M/4M, 1GB*
• ia64: everything
• parisc: 4M
• s390: 4k, 1M
• sh: 64k, 256k, 1M, 4M, 64M, 512M
• sparc: 64k, 512k, 4M
* Gigantic page size
Base page size option

Gigantic Pages
• amd64: 1GB
• powerpc: 16GB
• Early allocation is required
• before power on – ppc
• boot-time – x86
• Separate pools from regular page allocation and other huge pages

Multi-size support
• Compile time selection?
• Gigantic pages mean no one-size-fits-all approach can possibly work
• sysfs interfaces
• enumerate/allocate
• Permit multiple mounts
• Separate allocation pools
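
Enumerating the per-size pools through sysfs could look roughly like the sketch below. It assumes the layout in which each size has a directory named `hugepages-<size>kB` containing an `nr_hugepages` file under /sys/kernel/mm/hugepages. The function names are hypothetical, and the demo runs against a throwaway directory tree rather than a real kernel so that it works anywhere.

```python
import os
import re
import tempfile

POOL_DIR = re.compile(r"^hugepages-(\d+)kB$")


def enumerate_pools(root="/sys/kernel/mm/hugepages"):
    """Map each huge page size (in kB) to its nr_hugepages count under root."""
    pools = {}
    if not os.path.isdir(root):
        return pools  # kernel without hugetlb support, or not Linux
    for name in os.listdir(root):
        match = POOL_DIR.match(name)
        if not match:
            continue
        with open(os.path.join(root, name, "nr_hugepages")) as f:
            pools[int(match.group(1))] = int(f.read().strip())
    return pools


def make_fake_sysfs(sizes_to_counts):
    """Build a temporary directory tree that mimics the sysfs layout (demo only)."""
    root = tempfile.mkdtemp()
    for size_kb, count in sizes_to_counts.items():
        pool = os.path.join(root, "hugepages-%dkB" % size_kb)
        os.mkdir(pool)
        with open(os.path.join(pool, "nr_hugepages"), "w") as f:
            f.write("%d\n" % count)
    return root


fake_root = make_fake_sysfs({2048: 64, 1048576: 2})
print(enumerate_pools(fake_root))
```

Calling `enumerate_pools()` with no argument reads the real pools. Allocation goes the other way, by writing a count to a pool's `nr_hugepages` file, which normally requires root and, as the Admin Interfaces slide warns, is not 100% dependable.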

Virtualization - KVM
• Large memory use
• mlock()
• Custom app, willing to modify
• Performance concerns...
• TLB miss 5x cost, with new h/w
• Perfect huge page application!
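
One way to see why TLB misses get more expensive under hardware-assisted virtualization is to count page table references. With nested paging, each guest page table pointer is a guest-physical address that must itself be translated through the host tables, so a walk of g guest levels over h host levels can take up to g*h + g + h memory references. The sketch below only evaluates that expression. The function name is hypothetical, and the level counts assume x86-64 with 4-level tables, where mapping with a 2M page removes one level. It counts worst-case references and ignores the page walk caches real hardware uses, so it is not meant to reproduce the 5x figure on the slide.

```python
def walk_references(guest_levels, host_levels=None):
    """Worst-case memory references to resolve one TLB miss.

    With host_levels=None the walk is native (no virtualization) and
    touches one entry per level. With nested paging, every guest table
    pointer and the final guest-physical address each need a host walk.
    """
    if host_levels is None:
        return guest_levels
    return guest_levels * host_levels + guest_levels + host_levels


native_4k = walk_references(4)          # 4
nested_4k = walk_references(4, 4)       # 24
nested_2m_host = walk_references(4, 3)  # 19
nested_2m_both = walk_references(3, 3)  # 15
print(native_4k, nested_4k, nested_2m_host, nested_2m_both)
```

Under these assumptions, large pages in both the guest and the host shorten the worst-case walk from 24 references to 15, on top of reducing how often a miss happens at all.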

Caveats
• Fragmentation
• Locked into memory – no reclaim
• Hardware must be dedicated
• Separate, discrete interfaces
• Permissions
• Amplification of bad NUMA placement decisions
• Architecture TLB weakness

Candidate Users
• Large, contiguous memory users
• Poor temporal or spatial locality
• Bottlenecks on fault speed
• Pagetable size overhead
  – Large shared mapping
• Pagetable cache footprint

Application Work
• Using SHM? Add SHM_HUGETLB
• libhugetlbfs
  – Drop-in replacement for malloc()/shmget()
  – Works for complex apps like firefox!
  – Link normally or use LD_PRELOAD
  – Executables in huge pages
  – Administration with hugeadm

Future Work
• User stacks
• Transparent promotion/demotion
• Continuing improvements in page reclamation
• Power management / memory hotplug
• libhugetlbfs
  – documentation/usability

Further Reading
• Cost of pagetable lookups in virtual machines:
  http://www.amd64.org/fileadmin/user_upload/pub/p26-bhargava.pdf
• http://sourceforge.net/projects/libhugetlbfs/
• http://www.ibm.com/developerworks/wikis/display/LinuxP/libhugetlbfs+FAQs

Speaker notes (slides dated 04/09/09)

Notes for “Why Large Pages?”
• Cheaper per-page operations: a minor fault costs more or less the same number of CPU cycles for a large page as for a small one, but the benefit of the large page fault is much higher.
• “Smaller” per-page structures: smaller in terms of percentage. A fixed N-byte object becomes relatively much smaller as the M-byte page it represents gets larger.
• “Expensive” cache misses: expensive in terms of performance. CPUs are bottlenecked on memory bandwidth, and caches continue to grow in importance.
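
The “smaller in terms of percentage” remark can be checked with a little arithmetic. The sketch below assumes a fixed 64-byte per-page descriptor and a 1 GiB region. Both numbers are illustrative assumptions rather than figures from the talk, and the page sizes are the x86 ones from the “Multiple HW Sizes” slide.

```python
KIB = 1024
MIB = 1024 * KIB
GIB = 1024 * MIB

DESCRIPTOR_BYTES = 64  # assumed fixed size of one per-page structure


def descriptor_overhead(region_bytes, page_bytes, descriptor_bytes=DESCRIPTOR_BYTES):
    """Return (number of pages, descriptor bytes as a fraction of the region)."""
    pages = region_bytes // page_bytes
    return pages, pages * descriptor_bytes / region_bytes


for label, page_bytes in (("4k", 4 * KIB), ("2M", 2 * MIB), ("1GB", GIB)):
    pages, fraction = descriptor_overhead(GIB, page_bytes)
    print("%-3s pages: %7d objects, %.6f%% of the region" % (label, pages, fraction * 100))
```

The same loop also illustrates “fewer objects to manage”: the object count falls by the ratio of the page sizes, while each object stays the same size.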

Notes for “Old” Workloads: these are the classic workloads that have used large pages, not necessarily the ones where large pages fit best.

Notes for “Linux VM Support”: COW usage used to give random application behavior. Now we can at least guarantee that parents will keep their huge pages, and children have an opportunity to get their own copies, too.

Notes for “Multiple HW Sizes”: this list is just an indicator of why we need hstates so badly.

Notes for “Gigantic Pages”: again, just an indicator of why we need hstates so badly.

Notes for “Candidate Users”
• Temporal locality
  – Tendency to re-reference memory
  – Sparse accesses imply low temporal locality
  – Use-once access (e.g. STREAM) has low locality
  – Tree elimination solves have higher locality
• Spatial locality
  – Tendency to reference nearby memory
  – Random access has low locality
  – Cache blocking gives higher spatial locality
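
Spatial locality can be made concrete by counting how many distinct pages an access pattern touches, since each distinct page needs its own translation. The sketch below is an illustration with made-up access patterns and a hypothetical helper name. It compares a dense sequential scan with a sparse strided one at 4k and 2M page sizes.

```python
KIB = 1024
MIB = 1024 * KIB


def pages_touched(addresses, page_bytes):
    """Count the distinct pages covering the given byte addresses."""
    return len({addr // page_bytes for addr in addresses})


# 4096 accesses each: one dense (8-byte stride), one sparse (64 KiB stride).
dense = [i * 8 for i in range(4096)]
sparse = [i * 64 * KIB for i in range(4096)]

for name, pattern in (("dense", dense), ("sparse", sparse)):
    small = pages_touched(pattern, 4 * KIB)
    large = pages_touched(pattern, 2 * MIB)
    print("%-6s 4k pages: %4d   2M pages: %4d" % (name, small, large))
```

In this toy case the dense scan touches only 8 small pages, so there is little to gain. The sparse scan touches 4096 small pages but only 128 large ones, which is the kind of poor-locality workload the “Candidate Users” slide has in mind.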
