Juggling Memory: Arenas in Golang


This content originally appeared on DEV Community and was authored by Vladislav Semenkov

Greetings! In a highly loaded environment, large allocations often noticeably slow down a particular part of a service. Arenas appeared to give finer control over memory. How do you turn them on? It's simple: you need the GOEXPERIMENT=arenas flag. But first, let's look at how memory works with "small objects".

The route of the "small object"

The so-called "tiny allocator" is responsible for "small objects" in Golang. How does it work?

In Go, every small object follows the same path: tiny allocator -> mcache -> span pool (mcentral, with the tiny span class) -> heap (mheap).

A small (tiny) object, in turn, is one that is smaller than 16 bytes and contains no pointers. The source code explains why this constant was chosen:

// Size of the memory block used for combining (maxTinySize) is tunable.
// Current setting is 16 bytes, which relates to 2x worst case memory
// wastage (when all but one subobjects are unreachable).
// 8 bytes would result in no wastage at all, but provides less
// opportunities for combining.
// 32 bytes provides more opportunities for combining,
// but can lead to 4x worst case wastage.
// The best case winning is 8x regardless of block size.

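In other words, the constant is a trade-off between how many objects can be combined in one block and how much memory is wasted in the worst case.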
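To make the combining idea from that comment concrete, here is a toy model of packing pointer-free objects into 16-byte blocks. It is not the runtime's code: tinyPacker and alloc are names made up for this sketch, and the alignment rule is a simplified assumption.

```go
package main

import "fmt"

const maxTinySize = 16 // block size named in the runtime comment

// tinyPacker is a toy model: the offset inside the current block
// plus a counter of 16-byte blocks taken so far.
type tinyPacker struct {
	offset int
	blocks int
}

// alignUp rounds n up to a multiple of a (a must be a power of two).
func alignUp(n, a int) int { return (n + a - 1) &^ (a - 1) }

// alloc places an object of the given size (smaller than maxTinySize)
// and reports the block index and the offset it landed at.
func (p *tinyPacker) alloc(size int) (block, off int) {
	align := 1
	switch {
	case size%8 == 0:
		align = 8
	case size%4 == 0:
		align = 4
	case size%2 == 0:
		align = 2
	}
	off = alignUp(p.offset, align)
	if p.blocks == 0 || off+size > maxTinySize {
		// The object does not fit: start a new block.
		p.blocks++
		off = 0
	}
	p.offset = off + size
	return p.blocks - 1, off
}

func main() {
	var p tinyPacker
	for _, size := range []int{1, 2, 4, 8, 8} {
		b, off := p.alloc(size)
		fmt.Printf("size=%d -> block %d, offset %d\n", size, b, off)
	}
	fmt.Println("blocks used:", p.blocks) // prints "blocks used: 2"
}
```

The first four objects share one block; only the fifth forces a second one. The flip side is the wastage the comment mentions: a block stays alive as long as any object inside it is reachable.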

All of this happens inside a single P, so allocating a small object costs very little. The GC stops the world only twice per cycle, at the start of marking and at mark termination, and a concurrent mark phase runs between them. Simply put, an automatic mark and sweep is used. In an arena, by contrast, many objects are placed into one span and handed back to the runtime in a single operation. The GC does not manage their lifetime, so we have to track the lifecycle ourselves.

Until Free is called, arena objects do not take part in tri-color marking one by one. Allocating one only shifts a pointer inside the already allocated span, without touching any global structures.
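The stop-the-world cost of the regular GC mentioned above can be observed from a program. The sketch below uses runtime.ReadMemStats; gcCycles is a hypothetical helper written for this example, and the pause value depends entirely on the machine and the workload.

```go
package main

import (
	"fmt"
	"runtime"
)

// gcCycles forces n collections and returns how many completed
// GC cycles the runtime recorded in the meantime.
func gcCycles(n int) uint32 {
	var before, after runtime.MemStats
	runtime.ReadMemStats(&before)
	for i := 0; i < n; i++ {
		runtime.GC() // blocks until the collection has finished
	}
	runtime.ReadMemStats(&after)
	return after.NumGC - before.NumGC
}

func main() {
	cycles := gcCycles(3)

	var m runtime.MemStats
	runtime.ReadMemStats(&m)
	// PauseTotalNs is the cumulative stop-the-world pause time.
	fmt.Println("cycles:", cycles, "total STW pause (ns):", m.PauseTotalNs)
}
```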

Span

Earlier in the article I mentioned a "span". What is it in the classical Go sense?

A span is a contiguous sequence of pages, the minimal block that the allocator and the GC operate on inside the heap.

When the local mcache has no free slots of the required size, it takes a free span from mcentral and carves it into objects with a bump pointer.

When mcentral no longer has enough memory, it requests a new span from the global page manager, mheap, which in turn reserves N pages in the page heap.
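The refill chain described above can be sketched with a toy model. The types toyCache, toyCentral and toyHeap are stand-ins invented for this example and leave out locking, size classes and sweeping; only the 8 KiB page size is taken from the real runtime.

```go
package main

import "fmt"

const pageSize = 8192 // the Go runtime uses 8 KiB pages

// span is a toy stand-in for mspan: a run of pages carved into equal slots.
type span struct {
	slots int // how many objects fit
	next  int // bump index of the next free slot
}

// toyHeap stands in for mheap: it hands out pages and counts them.
type toyHeap struct{ pagesReserved int }

func (h *toyHeap) allocSpan(pages, objSize int) *span {
	h.pagesReserved += pages
	return &span{slots: pages * pageSize / objSize}
}

// toyCentral stands in for mcentral: spans of one object size.
type toyCentral struct {
	objSize int
	free    []*span // spans with free slots (would be refilled by sweeping)
	heap    *toyHeap
}

// get returns a span with free slots, growing from the heap when none is left.
func (c *toyCentral) get() *span {
	if n := len(c.free); n > 0 {
		s := c.free[n-1]
		c.free = c.free[:n-1]
		return s
	}
	return c.heap.allocSpan(1, c.objSize)
}

// toyCache stands in for the per-P mcache.
type toyCache struct {
	cur     *span
	central *toyCentral
}

// alloc returns a slot index in the current span, refilling when it is full.
func (c *toyCache) alloc() int {
	if c.cur == nil || c.cur.next == c.cur.slots {
		c.cur = c.central.get()
	}
	i := c.cur.next
	c.cur.next++
	return i
}

func main() {
	h := &toyHeap{}
	mc := &toyCache{central: &toyCentral{objSize: 1024, heap: h}}
	for i := 0; i < 20; i++ {
		mc.alloc()
	}
	// 8 objects of 1 KiB fit in one page, so 20 allocations need 3 spans.
	fmt.Println("pages reserved:", h.pagesReserved) // prints "pages reserved: 3"
}
```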

We can discuss this topic in more detail in future articles, but we won't dwell on it for now.

The first steps

First, let's turn on our arenas:

 export GOEXPERIMENT=arenas

A preface: for a large object (over 32 KiB), Go allocates a dedicated mspan (a set of pages) straight from the global mheap. Let's look at an example using a benchmark plus arenas:

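package a

import (
    "arena"
    "testing"
)

// M is the number of allocations per benchmark iteration.
const M = 10_000

// Big is a 1 MiB object, large enough to take the large-allocation path.
type Big [1 << 20]byte

// nsPerAlloc reports the average time of one allocation.
// b.Elapsed covers all b.N iterations, so we divide by b.N*M.
func nsPerAlloc(b *testing.B) {
    b.ReportMetric(float64(b.Elapsed().Nanoseconds())/float64(b.N*M), "ns/alloc")
}

// BenchmarkHeapBig allocates M large objects on the regular heap.
func BenchmarkHeapBig(b *testing.B) {
    for i := 0; i < b.N; i++ {
        buffers := make([]*Big, M)

        for j := 0; j < M; j++ {
            buf := new(Big)
            buf[0] = byte(j)
            buffers[j] = buf
        }
    }

    nsPerAlloc(b)
}

// BenchmarkArenaBig allocates the same objects in an arena
// and releases them all with a single Free.
func BenchmarkArenaBig(b *testing.B) {
    for i := 0; i < b.N; i++ {
        a := arena.NewArena()
        buffers := make([]*Big, M)

        for j := 0; j < M; j++ {
            buf := arena.New[Big](a)
            buf[0] = byte(j)
            buffers[j] = buf
        }

        a.Free()
    }

    nsPerAlloc(b)
}

In the run below, ns/alloc was computed as the total elapsed time divided by M only, so with 28 iterations the heap value is 28 times too high. Dividing ns/op by M gives about 184,380 ns per allocation on the heap against about 131,497 ns in the arena.

goos: darwin
goarch: arm64
pkg: a
cpu: Apple M3 Pro
BenchmarkHeapBig-11                 28        1843796832 ns/op           5162629 ns/alloc     10485842133 B/op           10001 allocs/op
BenchmarkArenaBig-11                 1        1314971541 ns/op            131497 ns/alloc     11800062248 B/op            1445 allocs/op
PASS
ok      a       53.578s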

Why is that?

1) on the heap, each large object follows the large-allocation path and gets its own span: 10k allocations mean 10k separate span requests
2) the arena, in turn, reserves large spans up front and carves the objects out of them; the small allocations that remain are for slice headers

How to work with arenas?

Here is sample code that covers the basic API for working with them:

package main

import (
    "arena"
    "fmt"
)

type Point struct {
    x, y int
}

func main() {
    a := arena.NewArena() // create the arena
    defer a.Free()        // release everything allocated in it at once

    p := arena.New[Point](a) // allocate a struct in the arena
    p.x, p.y = 10, 20

    points := arena.MakeSlice[Point](a, 0, 100) // a slice backed by the same arena

    for i := 0; i < 10; i++ {
        pt := arena.New[Point](a)
        pt.x, pt.y = i, i*i

        points = append(points, *pt)
    }

    heapPoints := arena.Clone(points) // now the whole slice lives on the heap

    fmt.Println("first =", heapPoints, "len =", len(heapPoints))
}

go run tiny.go
first = [{0 0} {1 1} {2 4} {3 9} {4 16} {5 25} {6 36} {7 49} {8 64} {9 81}] len = 10

The arenas inside

The Arena type itself just stores a pointer:

type Arena struct {
  a unsafe.Pointer
}

NewArena, in turn, calls runtime_arena_newArena() unsafe.Pointer, which allocates one mspan from the shared heap.

How does the GC see the arena?

  • the peculiarity of the arena, as I noted above, is that the GC does not track the objects inside it one by one. The runtime models each arena chunk as a single large allocation, which is still scanned while the arena is live, because values in it may point into the regular heap.

After Free, the arena's memory is set to fault on access, and the chunk stays in a zombie-like quarantine until the GC confirms that nothing points into it anymore.

  • after mark termination, the sweeper moves the span to idle and its memory can be reused; reuse only happens after the next GC cycle.

There is one more feature: the arena has a finalizer, so if the developer never calls Free, the runtime will eventually catch it and release the arena.

arena.New[T](a)

  • the call goes to runtime_arena_arena_New, which receives the type as a reflect.Type

It accounts for alignment, hands out zeroed memory, and simply shifts the bump pointer.
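The "align, then shift the pointer" step can be sketched in plain Go. This is a toy model over a byte slice, not the runtime implementation: bumpArena, alloc and free are hypothetical names, offsets are aligned relative to the start of the chunk, and the reset in free only imitates releasing everything at once.

```go
package main

import "fmt"

// bumpArena is a toy arena: one chunk of bytes and an offset
// that only moves forward.
type bumpArena struct {
	chunk []byte
	off   int
}

func newBumpArena(size int) *bumpArena {
	return &bumpArena{chunk: make([]byte, size)}
}

// alloc returns size zeroed bytes whose offset is a multiple of align
// (a power of two), or nil when the chunk is exhausted.
func (a *bumpArena) alloc(size, align int) []byte {
	start := (a.off + align - 1) &^ (align - 1) // round up to the alignment
	if start+size > len(a.chunk) {
		return nil
	}
	a.off = start + size // the whole allocation is this pointer shift
	return a.chunk[start:a.off:a.off]
}

// free releases every object at once by zeroing the used part
// and resetting the offset.
func (a *bumpArena) free() {
	for i := 0; i < a.off; i++ {
		a.chunk[i] = 0
	}
	a.off = 0
}

func main() {
	a := newBumpArena(64)
	b1 := a.alloc(1, 1) // occupies offset 0
	b2 := a.alloc(8, 8) // offset 1 is rounded up to 8
	fmt.Println(len(b1), len(b2), a.off) // prints "1 8 16"

	a.free()
	fmt.Println(a.off) // prints "0"
}
```

There is no per-object bookkeeping here, which is why allocation is cheap and why individual objects cannot be released on their own.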

arena.MakeSlice[T](a, len, cap)

  • reserves the backing array with the same bump pointer, but places the slice header in the regular heap

  • to optimize GC work inside the chunk, objects with pointers grow from the bottom up and pointer-free objects grow from the top down. This lets the runtime finish scanning earlier and skip clearing the bitmap for memory areas that hold no pointers.
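The two growth directions can be illustrated with a small sketch. The chunk type and its methods are illustrative names for this example only; alignment is left out to keep the idea visible.

```go
package main

import "fmt"

// chunk tracks the free range [front, back) of a toy arena chunk.
type chunk struct {
	front, back int
}

// takeFromFront carves size bytes for a value that contains pointers.
func (c *chunk) takeFromFront(size int) (off int, ok bool) {
	if c.front+size > c.back {
		return 0, false
	}
	off = c.front
	c.front += size
	return off, true
}

// takeFromBack carves size bytes for a pointer-free value.
func (c *chunk) takeFromBack(size int) (off int, ok bool) {
	if c.back-size < c.front {
		return 0, false
	}
	c.back -= size
	return c.back, true
}

// scanLimit is the end of the only region that can hold pointers.
func (c *chunk) scanLimit() int { return c.front }

func main() {
	c := &chunk{front: 0, back: 1024}

	p1, _ := c.takeFromFront(64) // a struct with pointers
	b1, _ := c.takeFromBack(256) // a byte buffer
	p2, _ := c.takeFromFront(32) // another struct with pointers

	fmt.Println(p1, b1, p2)    // prints "0 768 64"
	fmt.Println(c.scanLimit()) // prints "96"
}
```

Everything past scanLimit is known to be pointer-free, so a collector walking this chunk could stop after 96 bytes instead of 1024.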

Clone

Makes a shallow copy of the value in the regular heap via runtime_arena_heapify, cutting all its ties to the arena.

As a result:

  • the data lives arbitrarily long.

  • the GC starts scanning it as an ordinary object.

  • keep in mind that if you forget Clone, or use an object after Free, the program crashes with SIGSEGV, which makes such misuse of a stale address easy to track down.

When to use the arenas?

  • when you need to process large data within one request, for example a large short-lived buffer, and release it all at once.

  • large objects of different sizes, so that you don't have to set up a sync.Pool for each size; besides, as we know, the GC can empty a pool at the wrong moment ;)

  • if you do use arenas, it's best to guard the code with a build tag, because they may be removed in the future, as the comments in the source code themselves say.

When not to use the arenas?

  • many buffers of one fixed size: sync.Pool will do it faster, and a benchmark shows it very well

  • objects that live longer than the request

  • and most importantly, the arena is not thread-safe, so it's better to confine each arena to a single goroutine.
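For the fixed-buffer case from the first point, a minimal sync.Pool sketch could look like this. The bufSize constant and the checksum function are hypothetical and exist only to show the Get/Put pattern.

```go
package main

import (
	"fmt"
	"sync"
)

const bufSize = 64 << 10 // hypothetical fixed buffer size

// bufPool hands out reusable fixed-size buffers. Storing pointers
// avoids an extra allocation when a buffer is put back.
var bufPool = sync.Pool{
	New: func() any {
		b := make([]byte, bufSize)
		return &b
	},
}

// checksum copies data into a pooled buffer, sums the bytes,
// and returns the buffer to the pool when done.
func checksum(data []byte) int {
	bp := bufPool.Get().(*[]byte)
	defer bufPool.Put(bp)

	n := copy(*bp, data)
	sum := 0
	for _, v := range (*bp)[:n] {
		sum += int(v)
	}
	return sum
}

func main() {
	fmt.Println(checksum([]byte{1, 2, 3})) // prints "6"
}
```

Unlike an arena, a sync.Pool is safe for concurrent use, so the same pool can serve many goroutines.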

That's it:(

I hope you enjoyed this article. If you have any suggestions or improvements, write to me! I will be grateful!


Vladislav Semenkov | Sciencx (2025-06-28T23:48:58+00:00) Juggling Memory: Arenas in Golang. Retrieved from https://www.scien.cx/2025/06/28/juggling-memory-arenas-in-golang/
