All examples in this article were run on a MacBook Pro M1, which has a 64-bit CPU.
This is the third article on high-performance programming in Go, analyzing why memory alignment is needed, the rules of Go memory alignment, and practical examples of memory alignment usage. Finally, it shares two tools to help us identify memory alignment issues during development.
This article was first published in the Medium MPP plan.
What is Memory Alignment?
To a programmer, memory might just look like a huge array of bytes. We can write an int16, which occupies two bytes, or an int32, which occupies four bytes. For example, consider a struct T1 whose fields have different sizes.
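As a concrete illustration, a struct consistent with the sizes discussed in this section might look like the sketch below. The field names and types are assumptions, chosen only so that the individual field sizes add up to 11 bytes.

```go
package main

import (
	"fmt"
	"unsafe"
)

// T1 is a hypothetical struct; its fields are assumptions chosen so that
// the field sizes add up to 11 bytes.
type T1 struct {
	a int8  // 1 byte
	b int64 // 8 bytes
	c int16 // 2 bytes
}

// naiveSize adds up the sizes of the individual fields, ignoring any padding
// the compiler may insert between them.
func naiveSize() uintptr {
	var t T1
	return unsafe.Sizeof(t.a) + unsafe.Sizeof(t.b) + unsafe.Sizeof(t.c)
}

func main() {
	fmt.Println("sum of field sizes:", naiveSize())
}
```

The sum of the field sizes is what a tightly packed layout would need; it says nothing yet about how the compiler actually places the fields.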
Those unfamiliar with Go might think the structure is laid out like this, taking up a total of 11 bytes of space.
Figure 1: Memory layout as understood by some people
One after another, very compact and perfect. But in reality, it’s not like this. If we print the addresses of T1’s fields, we’ll find the layout looks something like this, occupying a total of 24 bytes.
Figure 2: Actual memory layout of T1
List 1: T1 size
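A listing along these lines could look like the sketch below. It assumes T1 has the hypothetical fields int8, int64, and int16 (in that order), and it reports the struct size and field offsets rather than raw addresses, since the offsets are what reveal the padding.

```go
package main

import (
	"fmt"
	"unsafe"
)

// T1 is a hypothetical struct; the field types are assumptions.
type T1 struct {
	a int8
	b int64
	c int16
}

// layoutT1 returns the total size of T1 and the offset of each field.
func layoutT1() (size, offA, offB, offC uintptr) {
	var t T1
	return unsafe.Sizeof(t), unsafe.Offsetof(t.a), unsafe.Offsetof(t.b), unsafe.Offsetof(t.c)
}

func main() {
	size, offA, offB, offC := layoutT1()
	fmt.Println("size of T1:", size)
	fmt.Println("offsets of a, b, c:", offA, offB, offC)
}
```

On a 64-bit platform this reports a size of 24 with the fields at offsets 0, 8, and 16. The gap between a and b, and the unused bytes after c, are the padding.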
The CPU fetches data from memory based on word size. For example, a 64-bit CPU has a word size of 8 bytes, meaning the CPU accesses memory in 8-byte units, referred to as memory access granularity.
If a variable is not aligned and straddles two of these units, the CPU needs two memory accesses to read it. This can cause several serious problems:
- Performance degradation due to the extra memory access and the work of combining the two results.
- What was originally an atomic operation for reading a variable is no longer atomic.
- Other unexpected situations.
Therefore, compilers generally implement memory alignment, sacrificing memory space to ensure:
- Platform Compatibility: Not all hardware platforms can access arbitrary data at arbitrary addresses. For example, specific hardware platforms only allow fetching specific types of data at specific addresses, otherwise leading to exceptions.
- Performance: Accessing unaligned memory can force the CPU to perform two memory accesses and spend extra clock cycles handling alignment and computation. Aligned memory can be accessed in a single operation, improving efficiency. This is a typical space-for-time tradeoff.
Memory Alignment in Go
The Go spec stipulates the following alignment rules:
- For a variable x of any type: unsafe.Alignof(x) is at least 1.
- For a variable x of struct type: unsafe.Alignof(x) is the largest of all the values unsafe.Alignof(x.f) for each field f of x, but at least 1.
- For a variable x of array type: unsafe.Alignof(x) is the same as the alignment of a variable of the array’s element type.
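The three rules above can be checked with unsafe.Alignof. The following is a minimal sketch; the struct S and its fields are hypothetical and exist only to exercise the rules.

```go
package main

import (
	"fmt"
	"unsafe"
)

// S is a hypothetical struct used only to illustrate the rules.
type S struct {
	flag  bool
	count int32
	total int64
}

// alignments returns the alignment of a bool, of S, of a [3]int32 array,
// and of a single int32.
func alignments() (b, s, arr, elem uintptr) {
	var (
		x bool
		v S
		a [3]int32
		e int32
	)
	return unsafe.Alignof(x), unsafe.Alignof(v), unsafe.Alignof(a), unsafe.Alignof(e)
}

// maxFieldAlign returns the largest alignment among the fields of S.
func maxFieldAlign() uintptr {
	var v S
	m := unsafe.Alignof(v.flag)
	for _, a := range []uintptr{unsafe.Alignof(v.count), unsafe.Alignof(v.total)} {
		if a > m {
			m = a
		}
	}
	return m
}

func main() {
	b, s, arr, elem := alignments()
	fmt.Println("any type, at least 1:", b >= 1)
	fmt.Println("struct equals largest field alignment:", s == maxFieldAlign())
	fmt.Println("array equals element alignment:", arr == elem)
}
```

The comparisons hold on every platform, even though the concrete alignment values differ between 32-bit and 64-bit targets.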
In most cases, the Go compiler automatically aligns memory for us, and we don’t need to worry about it. However, in one particular case, manual alignment is required.
For 64-bit atomic operations on 32-bit platforms such as x86 (386), manual alignment is mandatory: these operations require the 64-bit value to be 8-byte aligned, or the program will panic. For example, consider code that atomically updates an int64 field of a struct T3.
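A minimal sketch of such a program is shown here. The field names and types of T3 are assumptions, chosen so that an int32 sits between two int64 fields; that is enough to push the last field to an offset that is not a multiple of 8 on 32-bit platforms.

```go
package main

import (
	"fmt"
	"sync/atomic"
	"unsafe"
)

// T3 is a hypothetical struct. On 32-bit platforms int64 is only 4-byte
// aligned, so d lands at offset 12; on 64-bit platforms it lands at 16.
type T3 struct {
	b int64
	c int32
	d int64
}

// offsetOfD reports where d sits inside T3 on the current platform.
func offsetOfD() uintptr {
	var t T3
	return unsafe.Offsetof(t.d)
}

// addToD atomically increments d and returns the new value.
func addToD() int64 {
	var t T3
	return atomic.AddInt64(&t.d, 1)
}

func main() {
	fmt.Println("offset of d:", offsetOfD())
	fmt.Println("d after atomic add:", addToD())
}
```

To try the 32-bit case, the program can be cross-compiled with GOARCH=386 and run on a machine that supports it.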
Running this on the amd64 architecture won’t cause an error, but on the i386 architecture, it will panic.
Figure 3: T3 panic
The reason is that int64 fields, and therefore T3 itself, are only 4-byte aligned on a 32-bit platform but 8-byte aligned on a 64-bit platform. On a 64-bit platform, T3’s memory layout is:
Figure 4: T3 memory layout on amd64
But on i386, its layout is:
Figure 5: T3 memory layout on i386
This issue is documented in the atomic package.
- On 386, the 64-bit functions use instructions unavailable before the Pentium MMX.
- On non-Linux ARM, the 64-bit functions use instructions unavailable before the ARMv6k core.
- On ARM, 386, and 32-bit MIPS, it is the caller’s responsibility to arrange for 64-bit alignment of 64-bit words accessed atomically via the primitive atomic functions (types Int64 and Uint64 are automatically aligned). The first word in an allocated struct, array, or slice; in a global variable; or in a local variable (because the subject of all atomic operations will escape to the heap) can be relied upon to be 64-bit aligned.
To resolve this, we must manually pad T3 to make it “look” 8-byte aligned:
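One way to do this is sketched below, assuming the same hypothetical T3 fields as before (int64, int32, int64). A blank int32 field fills the gap so that d starts at offset 16 on both 32-bit and 64-bit platforms. The T3Typed variant is an additional assumption for comparison: it relies on atomic.Int64 (available since Go 1.19), which the atomic package documentation quoted above describes as automatically aligned.

```go
package main

import (
	"fmt"
	"sync/atomic"
	"unsafe"
)

// T3 is a hypothetical struct with explicit padding.
type T3 struct {
	b int64
	c int32
	_ int32 // padding: fills bytes 12..15 so that d starts at offset 16
	d int64
}

// T3Typed is a hypothetical alternative that needs no manual padding,
// because atomic.Int64 is aligned automatically.
type T3Typed struct {
	b int64
	c int32
	d atomic.Int64
}

// paddedOffsetOfD reports where d sits inside the padded T3.
func paddedOffsetOfD() uintptr {
	var t T3
	return unsafe.Offsetof(t.d)
}

func main() {
	var t T3
	atomic.AddInt64(&t.d, 1)
	fmt.Println("offset of d:", paddedOffsetOfD(), "value:", atomic.LoadInt64(&t.d))

	var typed T3Typed
	typed.d.Add(1)
	fmt.Println("typed value:", typed.d.Load())
}
```

The blank field costs nothing on 64-bit platforms, where the compiler would have inserted the same four bytes of padding anyway.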
Similar manual padding can be seen in the Go source code and in open-source libraries.
Fortunately, we have many tools to help identify and optimize these issues.
Practical Engineering
fieldalignment
fieldalignment is an official Go tool that identifies structs whose fields could be reordered to use less memory and can rewrite them automatically. For example, it will reorder the fields of T1 to reduce its padding.
It can also be used in golangci-lint: fieldalignment is an analyzer of govet and can be enabled in the govet settings of .golangci.yaml.
However, fieldalignment has a frustrating drawback: it removes all blank lines and comments when rearranging struct members. You should therefore commit your work with git first, run the tool, then review its changes via git diff and restore anything it removed. For this reason, I rarely use this tool in production, preferring structlayout.
structlayout
structlayout displays the layout and size of structs and can output data in SVG or JSON format. If a struct is complex, this tool can help optimize it.
Installation
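structlayout is part of the honnef.co/go/tools module and can be installed with go install honnef.co/go/tools/cmd/structlayout@latest.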
Analyze T1 with structlayout
Figure 6: T1 Structure Layout
We can clearly see two padding areas: one of 7 bytes and one of 6 bytes.
Optimized T2
Figure 7: T2 Structure Layout
There are still two padding areas, but together they take up only 5 bytes.
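The T2 numbers can also be checked without any external tool. The sketch below assumes T2 holds the same hypothetical fields as T1 (int8, int64, int16), reordered so that the two small fields come first, and computes the padding as the struct size minus the sum of the field sizes.

```go
package main

import (
	"fmt"
	"unsafe"
)

// T2 is a hypothetical reordering of T1: the small fields share the first
// 8-byte word, and the int64 follows.
type T2 struct {
	a int8
	c int16
	b int64
}

// sizeAndPaddingT2 returns the size of T2 and the number of padding bytes,
// computed as the struct size minus the sum of the field sizes.
func sizeAndPaddingT2() (size, padding uintptr) {
	var t T2
	size = unsafe.Sizeof(t)
	padding = size - (unsafe.Sizeof(t.a) + unsafe.Sizeof(t.b) + unsafe.Sizeof(t.c))
	return size, padding
}

func main() {
	var t T2
	size, padding := sizeAndPaddingT2()
	fmt.Println("size of T2:", size, "padding bytes:", padding)
	fmt.Println("offsets of a, c, b:", unsafe.Offsetof(t.a), unsafe.Offsetof(t.c), unsafe.Offsetof(t.b))
}
```

On a 64-bit platform this reports 16 bytes in total, 5 of them padding: one byte between a and c, and four bytes between c and b.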
Summary
In programming, memory alignment is a crucial technique designed to enhance program performance and compatibility. This article uses Go as an example to explain the basic concepts and necessity of memory alignment in detail, demonstrating the actual layout of different structs in memory through code examples.
Memory alignment rules in Go are primarily reflected in the order of struct fields. The compiler ensures performance and platform portability through automatic alignment, but in some cases, developers need to manually adjust struct fields to avoid performance issues and potential errors.
The empty struct is a helpful tool for memory alignment optimization. For specific operations, refer to my other article: Golang High-Performance Programming EP1: Empty Struct.
To help developers detect and optimize memory alignment issues, this article introduces two practical tools:
- fieldalignment: An official Go tool that can automatically optimize struct memory alignment.
- structlayout: Displays the memory layout of structs, helping developers understand and optimize memory usage more intuitively.
By using these tools effectively, developers can reduce memory waste and improve development efficiency while ensuring program performance and stability.