Linux process address space layout

Mian Qin · August 27, 2020

In this post, I will describe the Linux process address space layout, which is also a popular interview topic that can be explored in depth. Here we focus on modern x86_64 rather than the legacy 32-bit x86 mode.

Process memory layout

Each process in Linux has user-space virtual memory and kernel-space virtual memory. The kernel-space virtual memory is shared (identical) across all processes. This is achieved by sharing the upper half of the entries in each process's top-level page directory (the PGD).

kernel space layout

The Linux documentation gives a detailed description of how kernel-space virtual memory is laid out, shown below. An important region is the direct mapping region, which maps the entire physical address space to virtual addresses contiguously (in 32-bit mode only ~896 MB can be mapped this way). This can be done efficiently with large pages, say 1 GB huge pages. As a result, physical memory is easily accessed through the MMU: adding a constant offset to a physical address yields its virtual address. This is important when the kernel needs to modify structures addressed physically, such as page tables.

========================================================================================================================
    Start addr    |   Offset   |     End addr     |  Size   | VM area description
========================================================================================================================
                  |            |                  |         |
 0000000000000000 |    0       | 00007fffffffffff |  128 TB | user-space virtual memory, different per mm
__________________|____________|__________________|_________|___________________________________________________________
                  |            |                  |         |
 0000800000000000 | +128    TB | ffff7fffffffffff | ~16M TB | ... huge, almost 64 bits wide hole of non-canonical
                  |            |                  |         |     virtual memory addresses up to the -128 TB
                  |            |                  |         |     starting offset of kernel mappings.
__________________|____________|__________________|_________|___________________________________________________________
                                                            |
                                                            | Kernel-space virtual memory, shared between all processes:
____________________________________________________________|___________________________________________________________
                  |            |                  |         |
 ffff800000000000 | -128    TB | ffff87ffffffffff |    8 TB | ... guard hole, also reserved for hypervisor
 ffff880000000000 | -120    TB | ffff887fffffffff |  0.5 TB | LDT remap for PTI
 ffff888000000000 | -119.5  TB | ffffc87fffffffff |   64 TB | direct mapping of all physical memory (page_offset_base)
 ffffc88000000000 |  -55.5  TB | ffffc8ffffffffff |  0.5 TB | ... unused hole
 ffffc90000000000 |  -55    TB | ffffe8ffffffffff |   32 TB | vmalloc/ioremap space (vmalloc_base)
 ffffe90000000000 |  -23    TB | ffffe9ffffffffff |    1 TB | ... unused hole
 ffffea0000000000 |  -22    TB | ffffeaffffffffff |    1 TB | virtual memory map (vmemmap_base)
 ffffeb0000000000 |  -21    TB | ffffebffffffffff |    1 TB | ... unused hole
 ffffec0000000000 |  -20    TB | fffffbffffffffff |   16 TB | KASAN shadow memory
__________________|____________|__________________|_________|
                  |            |                  |         |
 fffffc0000000000 |   -4    TB | fffffdffffffffff |    2 TB | ... unused hole
                  |            |                  |         | vaddr_end for KASLR
 fffffe0000000000 |   -2    TB | fffffe7fffffffff |  0.5 TB | cpu_entry_area mapping
 fffffe8000000000 |   -1.5  TB | fffffeffffffffff |  0.5 TB | ... unused hole
 ffffff0000000000 |   -1    TB | ffffff7fffffffff |  0.5 TB | %esp fixup stacks
 ffffff8000000000 | -512    GB | ffffffeeffffffff |  444 GB | ... unused hole
 ffffffef00000000 |  -68    GB | fffffffeffffffff |   64 GB | EFI region mapping space
 ffffffff00000000 |   -4    GB | ffffffff7fffffff |    2 GB | ... unused hole
 ffffffff80000000 |   -2    GB | ffffffff9fffffff |  512 MB | kernel text mapping, mapped to physical address 0
 ffffffff80000000 |-2048    MB |                  |         |
 ffffffffa0000000 |-1536    MB | fffffffffeffffff | 1520 MB | module mapping space
 ffffffffff000000 |  -16    MB |                  |         |
    FIXADDR_START | ~-11    MB | ffffffffff5fffff | ~0.5 MB | kernel-internal fixmap range, variable size and offset
 ffffffffff600000 |  -10    MB | ffffffffff600fff |    4 kB | legacy vsyscall ABI
 ffffffffffe00000 |   -2    MB | ffffffffffffffff |    2 MB | ... unused hole
__________________|____________|__________________|_________|___________________________________________________________

user space layout

The user-space memory layout is as follows. The text, data, and bss segments are loaded from the ELF binary during the exec system call to initialize the process. The heap grows upwards and the stack grows downwards.

User Stack
    |
    v
Memory Mapped Region for Shared Libraries or Anything Else
    ^
    |
Heap
Uninitialised Data (.bss)
Initialised Data (.data)
Program Text (.text)
0000000000000000

The stack holds call frames and local variables. It is managed at the instruction level by manipulating the frame pointer register (rbp), the stack pointer register (rsp), and the push/pop/call/ret instructions. If an access goes beyond the addresses mapped in the page table, a page fault occurs and the kernel allocates a frame and maps it into the page table for the stack area. The heap is usually managed through the brk system call and its family. In many cases, when memory is allocated through malloc, it is actually obtained via mmap, which places it in the memory-mapped region between the stack and the heap.

How the kernel manages memory regions

The kernel manages memory regions through vm_area_struct objects attached to the process descriptor (via its mm_struct). They are organized in a linked list and a red-black tree for efficient lookup. The vm_area_structs are consulted during a protection fault and modified when memory regions are added or deleted (for example, when creating a memory-mapped area).

