Recently, I discovered that the Go standard library includes a built-in implementation of varint, found in encoding/binary/varint.go. This implementation is similar to the varint used in protobuf. Using the Golang standard library’s varint source code, we will systematically learn and review the concept of varint.
If you’re familiar with protobuf, you probably already know that all integer types (except fixed types like fixed32 and fixed64) are encoded using varint.
varint mainly solves two issues:
- Space Efficiency: Take uint64 as an example, representing values as large as 18,446,744,073,709,551,615. In most real-world scenarios, however, our integer values are much smaller. If your system needs to process values as low as 1, you'd still use 8 bytes to represent this value in transmission, wasting space since most bytes store no useful data. varint encoding uses a variable-length byte sequence to represent integers, reducing the space required for smaller values.
- Compatibility: varint allows us to handle integers of different sizes without altering the encoding/decoding logic. This means fields can be upgraded from smaller types (like uint32) to larger ones (like uint64) without breaking backward compatibility.
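To make the space argument concrete, here is a small sketch that prints the same value in a fixed 8-byte layout and as a varint. The helper names fixedHex and varintHex are made up for this illustration, and little-endian byte order for the fixed-width side is my assumption; only binary.LittleEndian.PutUint64 and binary.PutUvarint come from the standard library.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// fixedHex (hypothetical helper) renders x as 8 fixed-width bytes in hex.
// Little-endian order is an assumption made for this illustration.
func fixedHex(x uint64) string {
	buf := make([]byte, 8)
	binary.LittleEndian.PutUint64(buf, x)
	return fmt.Sprintf("% x", buf)
}

// varintHex (hypothetical helper) renders x as an unsigned varint in hex.
func varintHex(x uint64) string {
	buf := make([]byte, binary.MaxVarintLen64)
	n := binary.PutUvarint(buf, x)
	return fmt.Sprintf("% x", buf[:n])
}

func main() {
	fmt.Println(fixedHex(1))  // 01 00 00 00 00 00 00 00
	fmt.Println(varintHex(1)) // 01
}
```

Seven of the eight fixed-width bytes are zero padding, which is exactly what the varint form drops.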
This article dives into Go's varint implementation, exploring its design principles and how it addresses the challenges of encoding negative numbers.
This article was first published under the Medium MPP plan. Follow me on Medium if you’re a Medium user.
The Design Principles of varint
varint is designed based on simple principles:
- 7-bit Grouping: The binary representation of an integer is divided into 7-bit groups. From the least significant bit to the most significant bit, every 7-bit group becomes a unit.
- Continuation Bit: A flag bit is added before each 7-bit group, forming an 8-bit byte. If more bytes follow, the flag bit is set to 1; otherwise, it’s set to 0.
For example, the integer uint64(300) has a binary representation of 100101100. Dividing this into two groups, 10 and 0101100, and adding flag bits results in two bytes: 10101100 (the low group, flag bit 1 because another byte follows) and 00000010 (the high group, flag bit 0). Written in that order, they form the varint encoding of 300. Compared to a fixed-width uint64, which uses 8 bytes, varint reduces the storage by 75%.
list1: uint64 to varint
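The 300 example can be checked against the standard library. The sketch below prints each encoded byte as 8 bits, in the order binary.PutUvarint writes them; encodeBits is a helper name invented for this illustration.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// encodeBits (hypothetical helper) returns the varint bytes of x as
// space-separated 8-bit strings, in the order they are written to the buffer.
func encodeBits(x uint64) string {
	buf := make([]byte, binary.MaxVarintLen64)
	n := binary.PutUvarint(buf, x)
	out := ""
	for i, b := range buf[:n] {
		if i > 0 {
			out += " "
		}
		out += fmt.Sprintf("%08b", b)
	}
	return out
}

func main() {
	fmt.Println(encodeBits(300)) // 10101100 00000010
}
```

Note that the low 7-bit group comes out first, with its flag bit set, and the high group follows with the flag bit clear.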

varint for Unsigned Integers
The Go standard library provides two sets of varint functions: one for unsigned integers (PutUvarint, Uvarint) and another for signed integers (PutVarint, Varint).
Let’s first look at the unsigned integer varint implementation:
list2: go src PutUvarint
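As a minimal sketch of the unsigned encoder, here is the loop restated under the hypothetical name putUvarint so that it can be compared with the real binary.PutUvarint. It follows the standard library logic as I understand it; treat it as an illustration rather than a verbatim copy of the source.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
)

// putUvarint is a hypothetical restatement of the encoder loop.
// Like the standard library version, it panics if buf is too small.
func putUvarint(buf []byte, x uint64) int {
	i := 0
	for x >= 0x80 { // more than 7 significant bits remain
		buf[i] = byte(x) | 0x80 // low 7 bits plus the continuation flag
		x >>= 7                 // move on to the next 7-bit group
		i++
	}
	buf[i] = byte(x) // last group, flag bit left at 0
	return i + 1
}

// sameAsStdlib reports whether the sketch and binary.PutUvarint agree on x.
func sameAsStdlib(x uint64) bool {
	a := make([]byte, binary.MaxVarintLen64)
	b := make([]byte, binary.MaxVarintLen64)
	n := putUvarint(a, x)
	m := binary.PutUvarint(b, x)
	return n == m && bytes.Equal(a[:n], b[:m])
}

func main() {
	for _, x := range []uint64{0, 1, 127, 128, 300, 1<<64 - 1} {
		fmt.Println(x, sameAsStdlib(x))
	}
}
```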
One constant in the code carries most of the logic: 0x80, which corresponds to the binary code 1000 0000. It drives the steps that follow:
- x >= 0x80: This checks if x requires more than 7 bits for representation. If it does, x needs to be split.
- byte(x) | 0x80: This applies a bitwise OR with 0x80 (1000 0000), setting the highest bit to 1 while keeping the lowest 7 bits of x.
- x >>= 7: Shift x right by 7 bits to process the next group.
- buf[i] = byte(x): When the loop ends, x fits in 7 bits, so it is written as the final byte with its highest bit left at 0.
Uvarint is the reverse of PutUvarint.
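To show what "the reverse" means in practice, here is a decoding sketch under the hypothetical name uvarintSketch. It mirrors the return convention of binary.Uvarint as I understand it (value and bytes read, 0 bytes for a truncated input, a negative count on overflow); it is an illustration, not the library source.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// uvarintSketch is a hypothetical decoder: it strips each flag bit and
// reassembles the 7-bit groups, lowest group first.
func uvarintSketch(buf []byte) (uint64, int) {
	var x uint64
	var shift uint
	for i, b := range buf {
		if i == binary.MaxVarintLen64 {
			return 0, -(i + 1) // more than 10 bytes: overflow
		}
		if b < 0x80 { // flag bit clear: this is the last byte
			if i == binary.MaxVarintLen64-1 && b > 1 {
				return 0, -(i + 1) // the tenth byte may carry only one bit
			}
			return x | uint64(b)<<shift, i + 1
		}
		x |= uint64(b&0x7f) << shift // drop the flag, place the 7 payload bits
		shift += 7
	}
	return 0, 0 // ran out of bytes before a terminating byte
}

func main() {
	fmt.Println(uvarintSketch([]byte{0xac, 0x02})) // 300 2
	fmt.Println(binary.Uvarint([]byte{0xac, 0x02})) // 300 2
}
```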
One caveat: because each byte carries only 7 bits of payload, large integers cost more than their fixed-width form. For example, uint64's maximum value requires 10 bytes instead of the usual 8 (⌈64/7⌉ = 10).
Encoding Negative Numbers: Zigzag Encoding
Though varint is efficient, it doesn’t account for negative numbers. In computing, numbers are stored as two’s complement, which means a small negative number might have a sizeable binary representation.
For example, -5 in 32-bit form is represented as 11111111111111111111111111111011, requiring 5 bytes in varint encoding.
Go uses zigzag encoding to solve this problem:
- For non-negative numbers n, map them to 2n.
- For negative numbers -n, map them to 2n-1.
This way, positive and negative numbers alternate without conflict, hence the name zigzag encoding.
For example, after zigzag encoding int32(-5), the value becomes 9 (00000000000000000000000000001001), which varint can represent with just 1 byte.
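The mapping and the -5 example can be tried out with a few lines of Go. In this sketch, zigzag uses the arithmetic form of the rule (2n and 2n-1) rather than bit tricks, and rawLen and zigzagLen are helper names made up to compare the two encodings of the same 32-bit value.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// zigzag (hypothetical helper) applies the arithmetic form of the mapping:
// n >= 0 becomes 2n, and -n becomes 2n-1.
func zigzag(x int64) uint64 {
	if x >= 0 {
		return 2 * uint64(x)
	}
	return 2*uint64(-x) - 1
}

// rawLen is the varint size of the 32-bit two's complement pattern of x,
// encoded as-is without zigzag.
func rawLen(x int32) int {
	buf := make([]byte, binary.MaxVarintLen64)
	return binary.PutUvarint(buf, uint64(uint32(x)))
}

// zigzagLen is the varint size produced by binary.PutVarint for x.
func zigzagLen(x int32) int {
	buf := make([]byte, binary.MaxVarintLen64)
	return binary.PutVarint(buf, int64(x))
}

func main() {
	for _, x := range []int64{0, -1, 1, -2, 2, -5} {
		fmt.Printf("%d -> %d\n", x, zigzag(x)) // 0, 1, 2, 3, 4, 9
	}
	fmt.Println(rawLen(-5), zigzagLen(-5)) // 5 1
}
```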
Here’s the Golang implementation:
list3: go src PutVarint
From the code, we can see that for the implementation of varint for signed integers, the Go standard library breaks it down into two steps:
- First, the integer is converted using ZigZag encoding.
- Then, the converted value is encoded using varint.
For negative numbers, there is an extra step: ux = ^ux. This part might be confusing—why does this transformation result in 2n - 1?
We can roughly deduce the process, assuming we have an integer -n:
- First, the original value is shifted left, then inverted. This can be viewed as: first invert the value, then shift left, and finally add 1 (the shift fills the lowest bit with 0, which the inversion turns into 1). This results in 2*(~(-n)) + 1.
- The two's complement of a negative number is the bitwise inversion of its absolute value plus 1. So, how do we derive the absolute value from the two's complement? There is a formula: |A| = ~A + 1. For A = -n this gives ~(-n) = n - 1.
- Substituting this formula into the first step: 2*(n - 1) + 1 = 2n - 1. This perfectly matches the ZigZag encoding for negative numbers (mathematics is indeed excellent).
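The derivation above can also be checked by brute force. This sketch tests each intermediate identity for a range of n; holdsFor is a name invented for this illustration, and the upper bound of the loop is arbitrary.

```go
package main

import "fmt"

// holdsFor (hypothetical helper) checks the three identities used in the
// derivation for the negative input -n, with n >= 1.
func holdsFor(n int64) bool {
	x := -n
	inverted := ^uint64(x)                   // ~(-n)
	shiftedThenInverted := ^(uint64(x) << 1) // what the encoder computes
	return inverted == uint64(n-1) && // ~(-n) = n - 1
		shiftedThenInverted == 2*inverted+1 && // ~(a << 1) = 2*(~a) + 1
		shiftedThenInverted == uint64(2*n-1) // the final result, 2n - 1
}

func main() {
	ok := true
	for n := int64(1); n <= 1_000_000; n++ {
		if !holdsFor(n) {
			ok = false
			break
		}
	}
	fmt.Println(ok) // true
}
```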

In the Go standard library, calling PutUvarint only applies varint encoding, while calling PutVarint first applies ZigZag encoding and then varint encoding.
In protobuf, if the type is int32, int64, uint32, or uint64, only varint encoding is used. However, for sint32 and sint64, ZigZag encoding is applied first, followed by varint encoding.
When varint Is Not Suitable
Despite its benefits, varint isn’t ideal for all scenarios:
- Large integers: varint can be less efficient than fixed-length encoding for huge numbers.
- Random data access: Since varint uses variable lengths, indexing specific integers directly is challenging.
- Frequent mathematical operations: varint-encoded data requires decoding before operations, potentially affecting performance.
- Security-sensitive applications: varint encoding may leak information about the original integer's size, which could be unacceptable in secure environments.
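The random-access point is easy to see in code. In the sketch below, pack and nth are hypothetical helpers: pack concatenates unsigned varints into one byte stream, and nth has to decode every earlier value to find the k-th one, because there is no fixed stride to jump by.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// pack (hypothetical helper) concatenates the varint encodings of values.
func pack(values ...uint64) []byte {
	var out []byte
	buf := make([]byte, binary.MaxVarintLen64)
	for _, v := range values {
		n := binary.PutUvarint(buf, v)
		out = append(out, buf[:n]...)
	}
	return out
}

// nth (hypothetical helper) returns the k-th (0-based) value in stream.
// It walks the stream from the start; the cost grows with k.
func nth(stream []byte, k int) (uint64, bool) {
	for i := 0; ; i++ {
		v, n := binary.Uvarint(stream)
		if n <= 0 {
			return 0, false // truncated, malformed, or fewer than k+1 values
		}
		if i == k {
			return v, true
		}
		stream = stream[n:]
	}
}

func main() {
	s := pack(1, 300, 70000)
	fmt.Println(len(s))    // 6: the values take 1, 2 and 3 bytes
	fmt.Println(nth(s, 2)) // 70000 true
	fmt.Println(nth(s, 3)) // 0 false
}
```

With a fixed 8-byte layout, the same lookup would be a single slice at offset 8*k. The differing lengths in the stream are also what reveals the rough magnitude of each value to an observer.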