seivarya

A while ago I made a simple bump allocator, built on top of brk() & sbrk(). During this, I found some constraints as to why this sort of linear allocation can be problematic.

Bump Allocator Mechanism

This sort of allocator starts off with a massive empty block and a single pointer at the very beginning. Now, whenever user asks for 10 bytes it simply provides user the pointer and moves the pointer 10 bytes forward. Next up they ask for 20, just return the current address and bumps the pointer 20 bytes forward… and so on.

Now standard allocators such as malloc(size) treat the memory block as structures and these structures have hidden metadata to them which keeps track of the state of memory the block size, state and so on. The main fault in a bump allocator is that, it has no concept of this metadata usage.

This lack of “tracking” of state of each memory block makes the bump allocator insanely fast. (it’s just addition to current address position so that makes the time complexity O(1) regardless of whether it’s the first allocation or the ten thousandth).

Pretty neat right? But this also ensures that there is no concept of free() For individual blocks. You can’t create a hole in the middle of linear allocated space the only thing you can do is just reset the main pointer all the way back to zero. This means one bump_allocator_free() call and the entire arena gets wiped out. Not being able to free() individual chunks leads to a specific type of memory degradation. We call it Temporal Fragmentation.

Some people might be fine with it but this “global” state is really dangerous. The primary reason being is that sbrk() is a global resource. Suppose you are using sbrk() to “bump” the allocator, and then later you call a stdlib method such as printf() or std::vector, the program will crash and segment fault.

Standard library functions like printf(), std::vector, and other runtime utilities allocate memory internally through malloc(). The issue is that malloc() manages the same underlying process heap region that sbrk() manipulates. Since sbrk() operates on a process wide global heap boundary, directly using it inside a custom allocator means you are modifying memory state that the malloc() also relies on. The danger is not that both allocators necessarily receive the exact same pointer, but that both begin making independent assumptions about ownership and layout of the same heap region. If your allocator later rewinds or resets that region, it may destroy memory still considered valid by malloc(), leading to heap corruption or segmentation fault.

You get mem_a from sbrk() and the malloc() call after that moves it to mem_x position. Now if you tried to perform that “heap” wipe, the program tries to free the mem_x which is under malloc()’s allocation and not yours! Leading to a segmentation fault or memory corruption (which is much worse).

So we’ve established that in a perfect universe a bump allocator can work but the flow will be:

[allocate mem_x] --> [allocate mem_y] --> [use all of them together] --> [destroy all of them together]

[allocate mem_x] --> [allocate mem_y] --> [free mem_x] --> [allocate mem_z] --> [free mem_y] --> [allocate mem_w] --> [free mem_z]

The Arena Solution

To fix this sbrk() issue, we can use arena allocators. Instead of moving the heap break of the whole process directly, we allocate a large memory region at once using malloc() or mmap(), and then treat that block as a local “sandbox” arena for linear allocations.

the limitations of bump allocator

Bump Allocator Mechanism

The Arena Solution