Learning About Linux Memory Management (MM) Through Pictures (part 3): allocation
This is a multi-part blog post series.
Part 1 was based on Gustavo Duarte’s blog posts on Anatomy of a Program in Memory and How the Kernel Manages Your Memory. Part 2 talks about different memory caches, based on Gustavo Duarte’s post on Page Cache, the Affair Between Memory and Files. Part 3 is a high-level description of memory allocation.
References
figures: UnderstandingLinuxEd3, 2005
www.livegrep.com (the best!)
Memory Allocation
A user application might request memory from the kernel. Or the kernel might need to allocate memory for itself, to set up a data structure (like a vma for a process). The kernel allocates memory in units of pages. For a vma (vm_area_struct), for example, it searches for enough consecutive page frames to cover the vma size. Where does the kernel look?
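To make "units of pages" concrete, here is a minimal sketch of the arithmetic. It assumes a 4 KiB page (PAGE_SIZE below is my assumption; the real value depends on the architecture), and the helper names pages_needed and buddy_order are made up for illustration. The "order" is how alloc_pages sizes a request: an order-n request asks for 2**n contiguous page frames.

```python
PAGE_SIZE = 4096  # assumption: 4 KiB pages, typical on x86


def pages_needed(nbytes: int) -> int:
    """Number of whole page frames needed to hold nbytes (rounds up)."""
    return (nbytes + PAGE_SIZE - 1) // PAGE_SIZE


def buddy_order(nbytes: int) -> int:
    """Smallest order such that 2**order page frames cover nbytes."""
    pages = pages_needed(nbytes)
    order = 0
    while (1 << order) < pages:
        order += 1
    return order
```

Note the rounding: a request that needs 3 pages is served by an order-2 block of 4 page frames, so one page frame goes unused by that request.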
The Allocators:
The Zoned Page Frame Allocator
There is a hierarchy of allocators. At the top of the hierarchy is the Zoned Page Frame Allocator (ZPFA), shown in the Figure: Zoned Page Frame Allocator. It iterates through the zones, looking for enough contiguous page frames to satisfy a kernel request (alloc_pages). If the kernel needs only a single page frame, then within each zone it first looks in the per-CPU page frame cache of the CPU it is running on. If it finds a page frame there, great. Otherwise, the kernel requests the page frames from the buddy system in that zone. If that succeeds, great. Otherwise, it continues to the next zone. If no zone can satisfy the request, the kernel, if it is able to, triggers the subsystem that reclaims page frames. After page frames are reclaimed, the kernel searches again for free page frames to allocate.
Figure: Zoned Page Frame Allocator
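The search order described above can be sketched as a toy model. This is not kernel code: Zone, get_page_from_zones, and toy_alloc_pages are hypothetical names, the buddy system is reduced to a simple counter of free page frames (so contiguity is ignored), and only the requesting CPU's cache (the cpu argument) is checked, which is my understanding of how the per-CPU cache is used.

```python
from dataclasses import dataclass, field


@dataclass
class Zone:
    name: str
    buddy_free: int                          # free page frames in this zone's buddy system
    pcp: dict = field(default_factory=dict)  # cpu id -> cached single page frames


def get_page_from_zones(zones, npages, cpu=0):
    """Walk the zones in order; return (zone name, source) or None."""
    for zone in zones:
        # Single-page requests try the per-CPU cache first.
        if npages == 1 and zone.pcp.get(cpu, 0) > 0:
            zone.pcp[cpu] -= 1
            return (zone.name, "pcp")
        # Otherwise fall back to the zone's buddy system.
        if zone.buddy_free >= npages:
            zone.buddy_free -= npages
            return (zone.name, "buddy")
    return None


def toy_alloc_pages(zones, npages, cpu=0, reclaim=None):
    """Try every zone; if all fail and reclaim is allowed, reclaim and retry once."""
    got = get_page_from_zones(zones, npages, cpu)
    if got is None and reclaim is not None:
        reclaim(zones)  # hypothetical callback standing in for page frame reclaim
        got = get_page_from_zones(zones, npages, cpu)
    return got
```

Passing reclaim=None models a request that is not allowed to trigger reclaim; such a request simply fails when no zone has enough free page frames.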
The Slab Allocator
The Slab Allocator requests page frames from the ZPFA. It stores commonly used objects, such as vm_area_struct objects, so that the kernel can use and re-use them, rather than having the system reclaim and re-allocate the page frames and re-initialize the objects each time. The Slab Allocator is called ‘Slab’ because its page frames are organized into ‘slabs’. Each slab stores objects of the same type. An object store is called a ‘cache’. Figure: Slab Allocator puts all the pieces together.
Figure: Slab Allocator
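The re-use idea can be illustrated with a toy object cache. This is only a sketch under my own assumptions: ToyCache and its fields are hypothetical names, a "slab" is reduced to a batch of pre-built objects, and the counter slabs_requested stands in for requests to the page frame allocator.

```python
class ToyCache:
    """A toy object cache: hands out objects of one type and reuses freed ones."""

    def __init__(self, factory, objs_per_slab=4):
        self.factory = factory            # builds (initializes) one object
        self.objs_per_slab = objs_per_slab
        self.free_objects = []            # initialized objects ready for re-use
        self.slabs_requested = 0          # how often we asked the page allocator

    def _grow(self):
        # Stand-in for requesting page frames for one new slab.
        self.slabs_requested += 1
        self.free_objects.extend(self.factory() for _ in range(self.objs_per_slab))

    def alloc(self):
        if not self.free_objects:
            self._grow()
        return self.free_objects.pop()

    def free(self, obj):
        # The object stays in the cache; nothing goes back to the page allocator.
        self.free_objects.append(obj)
```

Freeing an object and allocating again returns the same object without touching the page allocator, which is the saving the Slab Allocator is after.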
The kernel represents each cache and each slab with a descriptor data structure. Figure: Caches+Slabs shows how slabs are organized into caches. A slab is in one of three states:
Empty slab: holds page frames but no in-use objects (all objects are free)
Full slab: no free objects (all objects are in-use)
Partially-full slab: has some free (i.e., unused) objects that are ready to be re-used
Note: The Slab Allocator’s ‘free’ object is not free in the same way that memory is free in the ZPFA. A ‘free’ page in the ZPFA is unallocated. A ‘free’ page in the Slab Allocator is allocated to it by the ZPFA but contains no in-use objects. A free object in the Slab Allocator is not currently in use and is free (ready) to be re-used. Only after the Slab Allocator releases a page frame to the ZPFA (‘released to the buddy system’), and the page frame is on the buddy system’s free list, is the page frame truly free (unallocated).
Figure: Caches+Slabs
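The three slab states can be expressed as a small classification function. This sketch follows the usual slab convention, in which an empty slab still holds its page frames but has no objects in use. The names slab_state and pick_slab are hypothetical, and the preference order in pick_slab (partially-full slabs first, then empty ones) is my understanding of the classic design rather than something taken from the figures.

```python
def slab_state(in_use: int, capacity: int) -> str:
    """Classify a slab by how many of its objects are in use."""
    if not 0 <= in_use <= capacity:
        raise ValueError("in_use must be between 0 and capacity")
    if in_use == 0:
        return "empty"
    if in_use == capacity:
        return "full"
    return "partial"


def pick_slab(slabs):
    """Pick the index of a slab to allocate from.

    slabs is a list of (in_use, capacity) pairs. Returns None when every
    slab is full, in which case a real cache would have to grow by
    requesting page frames for a new slab.
    """
    for wanted in ("partial", "empty"):
        for index, (in_use, capacity) in enumerate(slabs):
            if slab_state(in_use, capacity) == wanted:
                return index
    return None
```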
Kernelshark Visualization of Memory Allocator in Action
The kernelshark output shows the kernel trying to allocate memory for a mm_struct data type.
Figure: mm_alloc with slab and zone allocation
# allocate an address space object (mm_struct) from a cache
mm_alloc -> kmem_cache_alloc
# initialize the mm_struct object
mm_init -> pgd_alloc -> __get_free_pages
/* The kernel must allocate memory for the page table, which it does by requesting free page frames from the ZPFA. The kernel iterates through each zone. For each zone, it looks for enough contiguous pages from the free list to satisfy the request. */
get_page_from_freelist -> __rmqueue
__rmqueue is the request to the buddy system of the ZPFA for a single page frame. It is repeated as many times as necessary to allocate all requested page frames.
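For readers who have not met a buddy system before, here is a toy model of the general technique. It is a sketch of the textbook algorithm, not of the kernel's __rmqueue: ToyBuddy and MAX_ORDER = 4 are my own illustrative choices, and blocks are identified by their starting page frame number.

```python
MAX_ORDER = 4  # toy value: the largest block is 2**4 = 16 page frames


class ToyBuddy:
    def __init__(self):
        # free_lists[order] holds start frame numbers of free blocks of 2**order frames.
        self.free_lists = [set() for _ in range(MAX_ORDER + 1)]
        self.free_lists[MAX_ORDER].add(0)  # start with one free 16-frame block

    def alloc(self, order):
        """Return the start frame of a block of 2**order frames, or None."""
        for cur in range(order, MAX_ORDER + 1):
            if self.free_lists[cur]:
                start = min(self.free_lists[cur])
                self.free_lists[cur].remove(start)
                # Split larger blocks, putting each unused upper half back.
                while cur > order:
                    cur -= 1
                    self.free_lists[cur].add(start + (1 << cur))
                return start
        return None

    def free(self, start, order):
        """Free a block and merge it with its buddy for as long as possible."""
        while order < MAX_ORDER:
            buddy = start ^ (1 << order)  # the buddy differs in exactly one bit
            if buddy not in self.free_lists[order]:
                break
            self.free_lists[order].remove(buddy)
            start = min(start, buddy)
            order += 1
        self.free_lists[order].add(start)
```

Allocating a single page frame from a fresh ToyBuddy splits the 16-frame block four times; freeing it again merges everything back into one block.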
The figure below shows that the kernel can request objects from the Slab Allocator or page frames from the ZPFA. A user process has access to the ZPFA but not to the Slab Allocator.
When necessary, the Slab Allocator requests pages from the ZPFA.
When pages cannot be allocated, pages must be reclaimed.
The kernel keeps a reserved memory pool so that it can satisfy requests that cannot block and therefore cannot wait for the page reclaim process.
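The role of the reserved pool can be sketched as a simple admission check. This is a deliberate simplification under my own assumptions: can_allocate and its parameters are hypothetical, and the reserve is modeled as a single threshold, whereas the real kernel's checks are more elaborate.

```python
def can_allocate(free_pages: int, npages: int, reserve: int, can_block: bool) -> bool:
    """Decide whether a request may be served right now.

    A request that can block must leave the reserve untouched (if it
    cannot, it would wait for page reclaim instead). A request that
    cannot block is allowed to dip into the reserve.
    """
    remaining = free_pages - npages
    if remaining < 0:
        return False              # not enough free page frames at all
    if can_block:
        return remaining >= reserve
    return True
```

With 25 free page frames and a reserve of 20, a 10-page request that can block is refused for now, while the same request from a context that cannot block is served out of the reserve.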














