Spring 2017 :: CSE 506
Dynamic Memory Allocation
Nima Honarmand
Lecture Goals
Understand how dynamic memory allocators work, in both kernel and applications
Understand trade-offs and current best practices
and then sub-divides them for the application
kmalloc()/kfree()?
between memory requests in the kernel
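[Figure: user/kernel memory allocation layers. User side: Process 1 (a Dynamic Memory Allocator serving the rest of the application through malloc() and free()), Process 2, ..., Process n. Kernel side: a Dynamic Memory Allocator serving the rest of the kernel, and the PFM (Page Frame Manager), reached through page_alloc() and page_free(). Labels on the user/kernel boundary: brk(), mmap(), page faults.]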
programs
something else
→ Need more complex allocators
happen?
too small to be useful
coalescing
sz + sizeof(header)
int size;
// other data
int magic;
Allocated object
int size;
void *next;
Free object Returned pointer: Return value
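The header layout above can be sketched in C. This is only a minimal illustration layered on the standard malloc()/free(), not the lecture's implementation: the names hdr_alloc, hdr_size, hdr_free and the HDR_MAGIC value are hypothetical, and size_t is used where the slide shows int.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define HDR_MAGIC 0x5ca1ab1eu /* arbitrary marker value, assumed for this sketch */

/* Header stored immediately before every allocated object. The union with
 * max_align_t pads the header so the payload after it stays aligned. */
typedef union header {
    struct {
        size_t size;    /* payload size requested by the caller */
        unsigned magic; /* sanity marker checked on lookup and free */
    } meta;
    max_align_t pad;
} header_t;

/* Allocate sz + sizeof(header) bytes; return a pointer just past the header. */
void *hdr_alloc(size_t sz) {
    header_t *h = malloc(sizeof(header_t) + sz);
    if (h == NULL) return NULL;
    h->meta.size = sz;
    h->meta.magic = HDR_MAGIC;
    return h + 1;
}

/* Step back from the returned pointer to the header and read the size. */
size_t hdr_size(const void *p) {
    const header_t *h = (const header_t *)p - 1;
    assert(h->meta.magic == HDR_MAGIC);
    return h->meta.size;
}

/* Validate the magic value, clear it, and release the whole block. */
void hdr_free(void *p) {
    if (p == NULL) return;
    header_t *h = (header_t *)p - 1;
    assert(h->meta.magic == HDR_MAGIC);
    h->meta.magic = 0;
    free(h);
}
```

The caller only ever sees the pointer past the header, which is why free() can recover the object size without being told it.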
same-size objects
to the next available object size, and allocate from the corresponding pool
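Rounding a request up to the next available object size can be sketched as below. The power-of-two classes, the 8-byte minimum, the 4096-byte maximum, and the function names are all assumptions made for illustration; real allocators choose their own class spacing.

```c
#include <stddef.h>

#define MIN_CLASS_SIZE 8u /* assumed smallest object size */
#define NUM_CLASSES 10    /* assumed classes: 8, 16, ..., 4096 bytes */

/* Return the index of the smallest size class that fits sz,
 * or -1 if the request is larger than the biggest class. */
int size_class_index(size_t sz) {
    size_t class_size = MIN_CLASS_SIZE;
    for (int i = 0; i < NUM_CLASSES; i++) {
        if (sz <= class_size) return i;
        class_size *= 2;
    }
    return -1;
}

/* Object size, in bytes, served by the pool with the given index. */
size_t size_class_bytes(int idx) {
    return (size_t)MIN_CLASS_SIZE << idx;
}
```

Under these assumptions a 100-byte request maps to the 128-byte pool, so up to 28 bytes per object are lost to internal fragmentation.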
1. Waiting for locks: contended locks cause serialized execution
time
2. lock/unlock is pure overhead, even when uncontended
heaps
distributing the threads across them
between deallocation and reallocation
allocated)
few cycles to load data into cache from memory
in our cache
deallocated on one processor and allocated on another
the problem
migration (moving threads between processors)
Let’s put these good ideas to work
“powers of b > 1”;
[Figure: a pool of 256-byte objects built from 4 KB pages, each page an array of objects, with some free space remaining. Free objects are linked through next pointers into a free list kept in LIFO order.]
Store list pointers in free objects!
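Storing the list pointers inside the free objects themselves can be sketched as follows, using the 256-byte object size and 4 KB page from the figure. The pool_t type and the pool_* function names are hypothetical, and malloc() stands in for whatever page source (mmap() or page_alloc()) a real allocator would use.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define PAGE_SIZE 4096
#define OBJ_SIZE  256

/* A free object holds nothing but the link to the next free object. */
typedef struct free_obj {
    struct free_obj *next;
} free_obj_t;

typedef struct pool {
    void *page;       /* backing page, treated as an array of objects */
    free_obj_t *head; /* top of the LIFO free list */
} pool_t;

/* Obtain one page and thread every slot onto the free list.
 * Slots are pushed last to first, so slot 0 ends up on top. */
int pool_init(pool_t *pool) {
    pool->page = malloc(PAGE_SIZE); /* stand-in for a real page source */
    pool->head = NULL;
    if (pool->page == NULL) return -1;
    char *base = pool->page;
    for (size_t off = PAGE_SIZE; off >= OBJ_SIZE; off -= OBJ_SIZE) {
        free_obj_t *obj = (free_obj_t *)(base + off - OBJ_SIZE);
        obj->next = pool->head;
        pool->head = obj;
    }
    return 0;
}

/* Pop the most recently freed object, or NULL if the page is full. */
void *pool_alloc(pool_t *pool) {
    free_obj_t *obj = pool->head;
    if (obj != NULL) pool->head = obj->next;
    return obj;
}

/* Push the object back; its first bytes are reused as the list link. */
void pool_free(pool_t *pool, void *p) {
    assert(p != NULL);
    free_obj_t *obj = p;
    obj->next = pool->head;
    pool->head = obj;
}

void pool_destroy(pool_t *pool) {
    free(pool->page);
    pool->page = NULL;
    pool->head = NULL;
}
```

Because the link lives in memory that is unused while the object is free, the list costs no extra space, and LIFO order hands back the object most likely to still be in cache.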
the per-CPU heap
list
from?
→ Hoard doesn’t need to keep per-object meta-data header
superblock, doesn’t that yield internal fragmentation?
superblock, just mmap() it
pretty straightforward
per-object meta-data)
metadata from there
Object foo (CPU 0 writes) Object bar (CPU 1 writes)
Cache line
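The situation in the figure, two objects written by different CPUs landing on one cache line, can be illustrated with a small layout check. The 64-byte line size, the struct names, and the same_cache_line helper are assumptions for this sketch; padding each object to its own line is shown only as one possible way to keep them apart, not as the approach the lecture prescribes.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>

#define CACHE_LINE 64 /* assumed line size; it varies by processor */

/* foo and bar sit next to each other, so they can share a cache line. */
struct packed_pair {
    long foo; /* written by CPU 0 */
    long bar; /* written by CPU 1 */
};

/* Each member is aligned to its own cache line. */
struct padded_pair {
    alignas(CACHE_LINE) long foo; /* written by CPU 0 */
    alignas(CACHE_LINE) long bar; /* written by CPU 1 */
};

/* Compile-time check that padding really separates the two members. */
static_assert(offsetof(struct padded_pair, bar) >= CACHE_LINE,
              "bar must start on a different cache line than foo");

/* Nonzero if both addresses fall within the same CACHE_LINE-sized block. */
int same_cache_line(const void *a, const void *b) {
    return (uintptr_t)a / CACHE_LINE == (uintptr_t)b / CACHE_LINE;
}
```

The padded layout trades memory for isolation: with these assumptions the pair grows from two longs to two full cache lines.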
than linear in the number of CPUs is probably caused by cache behavior
extreme
CPU where they were allocated
synchronization: some allocators do this
it is not the programmer’s job to avoid false sharing
Three types of dynamic allocators in Linux:
allocator
superblock in Hoard
bookkeeping
business
scalability, etc.