Compare the performance, advantages, and disadvantages of fastjson, gjson, and jsonparser
This article delves into the analysis of how the standard library in Go parses JSON and then explores popular JSON parsing libraries, their characteristics, and how they can better assist us in development in different scenarios.
This article was first published on Medium as part of the MPP plan. If you are a Medium user, please follow me on Medium. Thank you very much.
I didn’t plan to look into JSON library performance. However, I recently ran pprof on my project, and the flame graph showed that more than half of the time spent in business logic processing went to JSON parsing. That is how this article came about.
This article mainly analyzes the following libraries (star counts as of 2024-06-13):
lib | Star |
---|---|
JSON Unmarshal | |
valyala/fastjson | 2.2 k |
tidwall/gjson | 13.8 k |
buger/jsonparser | 5.4 k |
JSON Unmarshal
The official JSON parsing function, json.Unmarshal, takes two parameters: the JSON data to be decoded and a pointer v to the value that will receive the result. Before actually performing JSON parsing, reflect.ValueOf is called to obtain the reflection object of parameter v. Then, the parsing method is chosen based on the first non-whitespace character of the incoming data.
If the data starts with [, it is an array and parsing enters the scanBeginArray branch; if it starts with {, the target is a struct or map and parsing enters the scanBeginObject branch, and so on.
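As a small illustration of the behavior described above, the following sketch calls json.Unmarshal with a pointer to a struct and with a pointer to an empty interface. The User type and the sample inputs are made up for this example; the point is only that the destination is passed as a pointer and that the leading character of the input determines the shape of the result.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// User is a hypothetical destination type used only for this illustration.
type User struct {
	Name string `json:"name"`
	Age  int    `json:"age"`
}

// decodeUser decodes a JSON object into a struct. Unmarshal needs a pointer
// so that it can fill in the fields through reflection.
func decodeUser(data []byte) (User, error) {
	var u User
	err := json.Unmarshal(data, &u)
	return u, err
}

// decodeAny decodes arbitrary JSON. The first non-whitespace byte decides
// whether the result is a map ('{'), a slice ('[') or a scalar.
func decodeAny(data []byte) (any, error) {
	var v any
	err := json.Unmarshal(data, &v)
	return v, err
}

func main() {
	u, err := decodeUser([]byte(`  {"name":"gopher","age":12}`))
	fmt.Println(u, err)

	for _, in := range []string{`{"a":1}`, `[1,2,3]`, `"text"`} {
		v, _ := decodeAny([]byte(in))
		fmt.Printf("%-10s -> %T\n", in, v)
	}
}
```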
Sub Summary
Looking at Unmarshal’s source code, it can be seen that a large amount of reflection is used to set field values. If the JSON is nested, reflection is applied recursively to reach the inner values, so performance suffers noticeably.
However, if performance is not a major concern, using it directly is actually a very good choice. It has complete functionality, and the official team has been continuously iterating on and optimizing it, so its performance may also make a qualitative leap in future versions. Among the libraries compared here, it is the only one that can directly convert JSON into Go structs.
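To make the reflection cost more concrete, here is a deliberately simplified sketch of how a decoder can locate a struct field by its json tag and assign to it through the reflect package. This is not the standard library's code, which is far more elaborate; the Config type and the setField helper are hypothetical names introduced for illustration.

```go
package main

import (
	"fmt"
	"reflect"
)

// Config is a hypothetical destination type.
type Config struct {
	Host string `json:"host"`
	Port int    `json:"port"`
}

// setField looks for the struct field whose json tag equals key and assigns
// val to it via reflection. It reports whether the assignment happened.
func setField(dst any, key string, val any) bool {
	rv := reflect.ValueOf(dst)
	if rv.Kind() != reflect.Ptr || rv.IsNil() {
		return false // a non-nil pointer is required to modify the target
	}
	rv = rv.Elem()
	if rv.Kind() != reflect.Struct {
		return false
	}
	rt := rv.Type()
	for i := 0; i < rt.NumField(); i++ {
		if rt.Field(i).Tag.Get("json") != key {
			continue
		}
		field, newVal := rv.Field(i), reflect.ValueOf(val)
		if !field.CanSet() || !newVal.IsValid() || !newVal.Type().AssignableTo(field.Type()) {
			return false
		}
		field.Set(newVal)
		return true
	}
	return false
}

func main() {
	var c Config
	fmt.Println(setField(&c, "host", "localhost"), setField(&c, "port", 8080))
	fmt.Printf("%+v\n", c)
}
```

Every lookup walks the type's fields and goes through several reflect calls, which hints at why a reflection-driven decoder does more work per value than a parser that only scans bytes.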
fastjson
As the name suggests, this library’s defining characteristic is speed. Its introduction page says:
Fast. As usual, up to 15x faster than the standard encoding/json.
Its usage is also very simple. First, give the JSON string to a Parser for parsing, then retrieve values through the object returned by the Parse method. If the value is nested, you can pass the chain of parent and child keys directly to the Get method.
Analysis
The design of fastjson differs from the standard library’s Unmarshal in that it divides JSON parsing into two parts: Parse and Get.
Parse is responsible for parsing the JSON string into a structure and returning it. Data is then retrieved from the returned structure. A Parser takes no locks and cannot be shared between goroutines, so if you want to call Parse concurrently, you need to use ParserPool.
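The pooling idea can be sketched with the standard library's sync.Pool. The scratchParser type below is a hypothetical stand-in for any parser that reuses internal buffers between calls and is therefore unsafe to share; it is not fastjson's Parser, and fastjson's own ParserPool (with its Get and Put methods) should be used in real code.

```go
package main

import (
	"fmt"
	"strings"
	"sync"
)

// scratchParser is a hypothetical parser that reuses an internal buffer,
// which makes a single instance unsafe for concurrent use.
type scratchParser struct {
	fields []string
}

// Parse splits a comma-separated line into p.fields. The returned slice is
// only valid until the next call to Parse on the same parser.
func (p *scratchParser) Parse(s string) []string {
	p.fields = p.fields[:0]
	for _, f := range strings.Split(s, ",") {
		p.fields = append(p.fields, strings.TrimSpace(f))
	}
	return p.fields
}

// parserPool hands out one parser per caller instead of locking a shared one.
var parserPool = sync.Pool{New: func() any { return new(scratchParser) }}

// countFields borrows a parser, uses it, and returns it to the pool.
func countFields(s string) int {
	p := parserPool.Get().(*scratchParser)
	defer parserPool.Put(p)
	return len(p.Parse(s))
}

func main() {
	var wg sync.WaitGroup
	results := make([]int, 4)
	for i := range results {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			results[i] = countFields("a, b, c")
		}(i)
	}
	wg.Wait()
	fmt.Println(results)
}
```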
fastjson processes JSON by traversing it from top to bottom, storing the parsed data in a Value structure. This structure is very simple:
- o Object: Indicates that the parsed value is an object.
- a []*Value: Indicates that the parsed value is an array.
- s string: If the parsed value is neither an object nor an array, other types of values are stored in this field as a string.
- t Type: Represents the type of this value, which can be TypeObject, TypeArray, TypeString, TypeNumber, etc.
This structure stores objects recursively: each nested object or array is itself a Value, so parsing a JSON string yields a tree of Value nodes that mirrors the nesting of the input.
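The recursive shape of such a tree, and how a Get call with several keys can walk it, can be sketched as follows. This is a simplified model built around the four fields listed above, not fastjson's source: the kv type, the hand-built sampleTree, and the Get implementation are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"strconv"
)

// Type tags the kind of JSON value a node holds.
type Type int

const (
	TypeObject Type = iota
	TypeArray
	TypeString
	TypeNumber
)

// kv is one key/value pair of an object (hypothetical helper type).
type kv struct {
	k string
	v *Value
}

// Value is a simplified node with the four fields described in the text.
type Value struct {
	o []kv     // members when t == TypeObject
	a []*Value // elements when t == TypeArray
	s string   // text of any other value
	t Type
}

// Get follows keys through nested objects; inside arrays a key is read as a
// decimal index. It returns nil when the path does not exist.
func (v *Value) Get(keys ...string) *Value {
	for _, key := range keys {
		if v == nil {
			return nil
		}
		switch v.t {
		case TypeObject:
			var next *Value
			for _, item := range v.o {
				if item.k == key {
					next = item.v
					break
				}
			}
			v = next
		case TypeArray:
			i, err := strconv.Atoi(key)
			if err != nil || i < 0 || i >= len(v.a) {
				return nil
			}
			v = v.a[i]
		default:
			return nil
		}
	}
	return v
}

// sampleTree builds the tree for {"user":{"name":"gopher","tags":["a","b"]}}.
func sampleTree() *Value {
	tags := &Value{t: TypeArray, a: []*Value{{t: TypeString, s: "a"}, {t: TypeString, s: "b"}}}
	user := &Value{t: TypeObject, o: []kv{{"name", &Value{t: TypeString, s: "gopher"}}, {"tags", tags}}}
	return &Value{t: TypeObject, o: []kv{{"user", user}}}
}

func main() {
	root := sampleTree()
	fmt.Println(root.Get("user", "name").s)
	fmt.Println(root.Get("user", "tags", "1").s)
	fmt.Println(root.Get("user", "missing") == nil)
}
```

Because the tree is built once, any number of Get calls can be served from it without touching the original JSON text again.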
Code
In terms of implementation, the absence of reflection code makes the entire parsing process very clean. The core of the parser is parseValue, which determines the type to be parsed from the first non-whitespace character of the string. Taking an object as the example, parsing is then handed to parseObject.
The parseObject function is also very simple. It reads each key in a loop and then calls parseValue recursively to parse the corresponding value, working through the JSON from top to bottom until it finally encounters }.
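The parseValue and parseObject interplay can be sketched as a tiny recursive-descent parser. This is a minimal illustration under stated assumptions, not fastjson's implementation: the node type is invented, string escapes are not handled, scalars are kept as raw text, and arrays are left out to keep the sketch short.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// node is a simplified parse-tree node, a stand-in for fastjson's Value.
type node struct {
	obj map[string]*node // set for objects
	str string           // text of a string, number, true, false or null
}

var errSyntax = errors.New("syntax error")

func skipWS(s string) string { return strings.TrimLeft(s, " \t\r\n") }

// parseValue dispatches on the first non-whitespace byte and returns the
// parsed node together with the unparsed tail of the input.
func parseValue(s string) (*node, string, error) {
	s = skipWS(s)
	if s == "" {
		return nil, s, errSyntax
	}
	switch s[0] {
	case '{':
		return parseObject(s[1:])
	case '[':
		return nil, s, errors.New("arrays are omitted in this sketch")
	case '"':
		str, tail, err := parseString(s[1:])
		return &node{str: str}, tail, err
	}
	// Numbers, true, false and null: keep the raw text up to a delimiter.
	end := strings.IndexAny(s, ",]} \t\r\n")
	if end < 0 {
		end = len(s)
	}
	if end == 0 {
		return nil, s, errSyntax
	}
	return &node{str: s[:end]}, s[end:], nil
}

// parseString reads up to the closing quote; escapes are not handled.
func parseString(s string) (string, string, error) {
	end := strings.IndexByte(s, '"')
	if end < 0 {
		return "", s, errSyntax
	}
	return s[:end], s[end+1:], nil
}

// parseObject reads key/value pairs in a loop until '}', calling parseValue
// recursively for every value. s starts just after the opening '{'.
func parseObject(s string) (*node, string, error) {
	n := &node{obj: map[string]*node{}}
	if s = skipWS(s); strings.HasPrefix(s, "}") {
		return n, s[1:], nil
	}
	for {
		s = skipWS(s)
		if !strings.HasPrefix(s, `"`) {
			return nil, s, errSyntax
		}
		key, tail, err := parseString(s[1:])
		if tail = skipWS(tail); err != nil || !strings.HasPrefix(tail, ":") {
			return nil, tail, errSyntax
		}
		val, tail, err := parseValue(tail[1:])
		if err != nil {
			return nil, tail, err
		}
		n.obj[key] = val
		s = skipWS(tail)
		if strings.HasPrefix(s, "}") {
			return n, s[1:], nil
		}
		if !strings.HasPrefix(s, ",") {
			return nil, s, errSyntax
		}
		s = s[1:]
	}
}

func main() {
	root, _, err := parseValue(`{"name":"gopher","age":12,"addr":{"city":"Oslo"}}`)
	if err != nil {
		panic(err)
	}
	fmt.Println(root.obj["name"].str, root.obj["age"].str, root.obj["addr"].obj["city"].str)
}
```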
Sub Summary
Through the above analysis, it can be seen that fastjson is much simpler in implementation and has higher performance than the standard library. Once Parse has built the JSON tree, the tree can be queried multiple times, avoiding the need for repeated parsing and improving performance.
However, its functionality is very rudimentary and lacks common operations such as JSON to struct or JSON to map conversion. If you only want to simply retrieve values from JSON, then using this library is very convenient. But if you want to convert JSON values into a structure, you will need to manually set each value yourself.
GJSON
In my tests, GJSON was not as fast as fastjson, but its functionality is very complete and its performance is still respectable. Next, let me briefly introduce the functionality of GJSON.
The usage of GJSON is similar to fastjson and just as simple: pass in the JSON string and the path of the value that needs to be obtained as parameters.
In addition to this, simple fuzzy matching is supported. Keys may contain the wildcard characters * and ?: * matches any number of characters, while ? matches a single character. For example:
- child*.2: child* matches children, and .2 reads the third element;
- c?ildren.0: c?ildren matches children, and .0 reads the first element.
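The semantics of the two wildcards can be sketched with a small matcher. This is not GJSON's implementation, which works directly on the JSON bytes; here the document is first decoded with encoding/json for simplicity, and the sample JSON and the match, get, and lookup helpers are made up for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sort"
	"strconv"
	"strings"
)

// match reports whether name matches pattern, where '*' matches any run of
// bytes (including none) and '?' matches exactly one byte.
func match(pattern, name string) bool {
	for len(pattern) > 0 {
		switch pattern[0] {
		case '*':
			// Try every possible split point for the rest of the pattern.
			for i := 0; i <= len(name); i++ {
				if match(pattern[1:], name[i:]) {
					return true
				}
			}
			return false
		case '?':
			if len(name) == 0 {
				return false
			}
		default:
			if len(name) == 0 || name[0] != pattern[0] {
				return false
			}
		}
		pattern, name = pattern[1:], name[1:]
	}
	return len(name) == 0
}

// get resolves a dot-separated path against decoded JSON. Object keys may
// contain wildcards; array steps are decimal indexes.
func get(data any, path string) any {
	cur := data
	for _, part := range strings.Split(path, ".") {
		switch v := cur.(type) {
		case map[string]any:
			keys := make([]string, 0, len(v))
			for k := range v {
				keys = append(keys, k)
			}
			sort.Strings(keys) // deterministic choice if several keys match
			cur = nil
			for _, k := range keys {
				if match(part, k) {
					cur = v[k]
					break
				}
			}
		case []any:
			i, err := strconv.Atoi(part)
			if err != nil || i < 0 || i >= len(v) {
				return nil
			}
			cur = v[i]
		default:
			return nil
		}
	}
	return cur
}

// lookup decodes doc and resolves path; it returns nil on any failure.
func lookup(doc, path string) any {
	var data any
	if err := json.Unmarshal([]byte(doc), &data); err != nil {
		return nil
	}
	return get(data, path)
}

const sampleJSON = `{"name":"Tom","children":["Sara","Alex","Jack"]}`

func main() {
	fmt.Println(lookup(sampleJSON, "child*.2"))
	fmt.Println(lookup(sampleJSON, "c?ildren.0"))
}
```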
In addition to fuzzy matching, it also supports modifier operations.
children|@reverse first reads the array children, then applies the @reverse modifier to reverse it and returns the result.
@flatten flattens the inner arrays of the array nested into the outer array and returns the result.
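What these two modifiers compute can be sketched in plain Go. The reverse and flatten functions below only illustrate the transformations on decoded arrays, using made-up inputs; they are not GJSON's code, and the one-level flattening shown is an assumption about the default behavior.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// reverse returns the elements of in in the opposite order.
func reverse(in []any) []any {
	out := make([]any, len(in))
	for i, v := range in {
		out[len(in)-1-i] = v
	}
	return out
}

// flatten lifts the elements of directly nested arrays one level up and
// leaves deeper nesting untouched.
func flatten(in []any) []any {
	out := make([]any, 0, len(in))
	for _, v := range in {
		if inner, ok := v.([]any); ok {
			out = append(out, inner...)
		} else {
			out = append(out, v)
		}
	}
	return out
}

// apply decodes a JSON array, runs fn on it, and encodes the result again.
func apply(arr string, fn func([]any) []any) (string, error) {
	var in []any
	if err := json.Unmarshal([]byte(arr), &in); err != nil {
		return "", err
	}
	out, err := json.Marshal(fn(in))
	return string(out), err
}

func main() {
	fmt.Println(apply(`["Sara","Alex","Jack"]`, reverse))
	fmt.Println(apply(`[1,[2],[3,4],[5,[6,7]]]`, flatten))
}
```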
There are other interesting features as well; see the official documentation for details.
Analysis
The Get method of GJSON takes two parameters: a JSON string, and a Path that describes where the value to be retrieved is located.
Because GJSON has to support many kinds of path expression, parsing is divided into two parts: the Path is parsed first, and the JSON string is traversed afterwards.
If a matching value is encountered during traversal, it is returned directly and there is no need to traverse any further. If the Path can match multiple values, the whole JSON string is traversed. If the Path matches nothing in the JSON string, the complete JSON string has to be traversed as well.
Unlike fastjson, GJSON does not save the parsed content in a structure that can be reused. So when you call GetMany to return multiple values, the JSON string is actually traversed many times, and efficiency is relatively low.
It’s important to be aware that GJSON does not validate the JSON it parses, including when a modifier such as @flatten is applied. This means that even if the input string is not valid JSON, it will still be parsed. Therefore, it’s essential for users to check that the input is indeed valid JSON to avoid any potential issues.
Code
In the Get method, you can see a long stretch of code used to parse the various kinds of path. Then, a for loop traverses the JSON until it finds ‘{’ or ‘[’ before performing the corresponding logic.
In reviewing the parseObject code, the intention was not to teach JSON parsing or string traversal but to illustrate a bad-case scenario. The nested for loops and consecutive if statements can be overwhelming and may remind you of a colleague’s code you’ve encountered at work.
Sub Summary
Advantages:
- Performance: GJSON performs relatively well compared to the standard library.
- Flexibility: It offers various retrieval methods and customizable return values, making it very convenient.
Disadvantages:
- No JSON Validation: It does not check for the correctness of the JSON input.
- Code Smell: The code structure is cumbersome and hard to read, which can make maintenance challenging.
Note
When parsing JSON to retrieve values, the GetMany function traverses the JSON string once for each specified key. Converting the JSON to a map first can reduce the number of traversals.
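The trade-off in this note can be sketched with the standard library: decode the document once into a map, then serve every key from that map. The lookupAll helper and the sample input are hypothetical; the same idea applies to whatever map conversion the chosen library offers.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// lookupAll decodes data a single time and then answers every requested key
// from the resulting map, instead of scanning the JSON once per key.
func lookupAll(data []byte, keys ...string) (map[string]any, error) {
	var obj map[string]any
	if err := json.Unmarshal(data, &obj); err != nil {
		return nil, err
	}
	out := make(map[string]any, len(keys))
	for _, k := range keys {
		if v, ok := obj[k]; ok {
			out[k] = v
		}
	}
	return out, nil
}

func main() {
	res, err := lookupAll([]byte(`{"a":1,"b":"x","c":true}`), "a", "b", "z")
	fmt.Println(res, err)
}
```

The single pass comes at the cost of allocating the map and its values, so it tends to pay off only when many keys are read from the same document.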
Conclusion
While GJSON has notable performance and flexibility, its lack of JSON validation and complex, hard-to-read code structure present significant drawbacks. If you need to parse JSON and retrieve values frequently, consider the trade-offs between performance and code maintainability.
jsonparser
Analysis
jsonparser also processes an input JSON byte slice and allows for quickly locating and returning values by passing multiple keys.
Similar to GJSON, jsonparser does not cache the parsed JSON string in a data structure as fastjson does. However, when multiple values need to be parsed, the EachKey function can be used to parse multiple values in a single pass through the JSON string.
If a matching value is found, jsonparser returns immediately without further traversal. For multiple matches, it traverses the entire JSON string. If a path does not match any value in the JSON string, it still traverses the entire string.
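The single-pass and early-exit behavior can be sketched with the standard library's streaming json.Decoder. This is not jsonparser's code or API: findKeys is a hypothetical helper that scans the top-level object once, collects the raw values of the wanted keys, and stops as soon as all of them have been found.

```go
package main

import (
	"bytes"
	"encoding/json"
	"errors"
	"fmt"
)

// findKeys scans the top-level object once and returns the raw values of the
// wanted keys. It stops reading as soon as every wanted key has been seen.
func findKeys(data []byte, wanted ...string) (map[string]json.RawMessage, error) {
	dec := json.NewDecoder(bytes.NewReader(data))
	tok, err := dec.Token()
	if err != nil {
		return nil, err
	}
	if d, ok := tok.(json.Delim); !ok || d != '{' {
		return nil, errors.New("top-level value is not an object")
	}
	want := make(map[string]bool, len(wanted))
	for _, k := range wanted {
		want[k] = true
	}
	found := make(map[string]json.RawMessage, len(want))
	for len(found) < len(want) && dec.More() {
		keyTok, err := dec.Token()
		if err != nil {
			return nil, err
		}
		key, _ := keyTok.(string)
		var raw json.RawMessage
		if err := dec.Decode(&raw); err != nil {
			return nil, err
		}
		if want[key] {
			found[key] = raw
		}
	}
	return found, nil
}

func main() {
	doc := []byte(`{"id":7,"name":"go","tags":["a","b"],"extra":{"deep":true}}`)
	found, err := findKeys(doc, "id", "name")
	fmt.Println(string(found["id"]), string(found["name"]), err)
}
```

When a wanted key is missing, the loop only ends at the closing brace, which mirrors the point above that an unmatched path forces a full traversal.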
jsonparser reduces the use of recursion by employing loops during JSON traversal, thus decreasing the call stack depth and enhancing performance.
In terms of functionality, the ArrayEach, ObjectEach, and EachKey functions allow for passing a custom function to meet specific needs, greatly enhancing the utility of jsonparser.
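The callback style described here can be sketched with the standard library. The arrayEach function below is a hypothetical helper written for this illustration; it does not reproduce jsonparser's ArrayEach signature, it only shows the idea of handing each element's raw bytes to a caller-supplied function while iterating with a loop.

```go
package main

import (
	"bytes"
	"encoding/json"
	"errors"
	"fmt"
)

// arrayEach calls fn once per element of a top-level JSON array, passing the
// element index and its raw bytes. Iteration stops if fn returns an error.
func arrayEach(data []byte, fn func(index int, elem json.RawMessage) error) error {
	dec := json.NewDecoder(bytes.NewReader(data))
	tok, err := dec.Token()
	if err != nil {
		return err
	}
	if d, ok := tok.(json.Delim); !ok || d != '[' {
		return errors.New("top-level value is not an array")
	}
	for i := 0; dec.More(); i++ {
		var elem json.RawMessage
		if err := dec.Decode(&elem); err != nil {
			return err
		}
		if err := fn(i, elem); err != nil {
			return err
		}
	}
	return nil
}

// collect gathers every element as a string, as a simple callback example.
func collect(data []byte) ([]string, error) {
	var out []string
	err := arrayEach(data, func(_ int, elem json.RawMessage) error {
		out = append(out, string(elem))
		return nil
	})
	return out, err
}

func main() {
	items, err := collect([]byte(`[{"id":1},{"id":2},3]`))
	fmt.Println(items, err)
}
```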
The code for jsonparser is straightforward and clear, making it easy to analyze. Those interested can examine it themselves.
Sub Summary
The high performance of jsonparser compared to the standard library can be attributed to:
- Using for loops to minimize recursion.
- Avoiding the use of reflection, unlike the standard library.
- Exiting immediately upon finding the corresponding key value without further recursion.
- Operating on the passed-in JSON string without allocating new space, thus reducing memory allocations.
Additionally, the API design is highly practical. Functions like ArrayEach, ObjectEach, and EachKey allow for passing custom functions, solving many issues in actual business development.
However, jsonparser has a significant drawback: it does not validate JSON. If the input is not valid JSON, jsonparser will not detect it.
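One way to live with a non-validating parser is to validate untrusted input first. The sketch below uses json.Valid from the standard library as the guard; naiveName is a deliberately crude, hypothetical extractor that stands in for any fast parser that does not check its input.

```go
package main

import (
	"bytes"
	"encoding/json"
	"errors"
	"fmt"
)

// naiveName is a hypothetical non-validating extractor: it searches for the
// "name" key by byte matching and never checks the rest of the document.
func naiveName(data []byte) string {
	const marker = `"name":"`
	i := bytes.Index(data, []byte(marker))
	if i < 0 {
		return ""
	}
	rest := data[i+len(marker):]
	j := bytes.IndexByte(rest, '"')
	if j < 0 {
		return ""
	}
	return string(rest[:j])
}

// safeExtract rejects malformed input before handing it to the fast path.
func safeExtract(data []byte, extract func([]byte) string) (string, error) {
	if !json.Valid(data) {
		return "", errors.New("invalid JSON")
	}
	return extract(data), nil
}

func main() {
	broken := []byte(`{"name":"go",,,`)
	fmt.Println(naiveName(broken)) // the naive extractor happily returns a value
	fmt.Println(safeExtract(broken, naiveName))
	fmt.Println(safeExtract([]byte(`{"name":"go"}`), naiveName))
}
```

The guard costs an extra pass over the input, so it is mainly worth it for data from sources that cannot be trusted to send well-formed JSON.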
Performance Comparison
Parsing Small JSON Strings
Parsing a simple JSON string of approximately 190 bytes
Library | Operation | Time per Iteration | Memory Usage | Memory Allocations | Performance |
---|---|---|---|---|---|
Standard Library | Parse to map | 724 ns/op | 976 B/op | 51 allocs/op | Slow |
Standard Library | Parse to struct | 297 ns/op | 256 B/op | 5 allocs/op | Average |
fastjson | get | 68.2 ns/op | 0 B/op | 0 allocs/op | Fastest |
fastjson | parse | 35.1 ns/op | 0 B/op | 0 allocs/op | Fastest |
GJSON | Convert to map | 255 ns/op | 1009 B/op | 11 allocs/op | Average |
GJSON | get | 232 ns/op | 448 B/op | 1 allocs/op | Average |
jsonparser | get | 106 ns/op | 232 B/op | 3 allocs/op | Fast |
Parsing Medium JSON Strings
Parsing a JSON string of moderate complexity, approximately 2.3KB
Library | Operation | Time per Iteration | Memory Usage | Memory Allocations | Performance |
---|---|---|---|---|---|
Standard Library | Parse to map | 4263 ns/op | 10212 B/op | 208 allocs/op | Slow |
Standard Library | Parse to struct | 4789 ns/op | 9206 B/op | 259 allocs/op | Slow |
fastjson | get | 285 ns/op | 0 B/op | 0 allocs/op | Fastest |
fastjson | parse | 302 ns/op | 0 B/op | 0 allocs/op | Fastest |
GJSON | Convert to map | 2571 ns/op | 8539 B/op | 83 allocs/op | Average |
GJSON | get | 1489 ns/op | 448 B/op | 1 allocs/op | Average |
jsonparser | get | 878 ns/op | 2728 B/op | 5 allocs/op | Fast |
Parsing Large JSON Strings
Parsing a JSON string of high complexity, approximately 2.2MB
Library | Operation | Time per Iteration | Memory Usage | Memory Allocations | Performance |
---|---|---|---|---|---|
Standard Library | Parse to map | 2292959 ns/op | 5214009 B/op | 95402 allocs/op | Slow |
Standard Library | Parse to struct | 1165490 ns/op | 2023 B/op | 76 allocs/op | Average |
fastjson | get | 368056 ns/op | 0 B/op | 0 allocs/op | Fast |
fastjson | parse | 371397 ns/op | 0 B/op | 0 allocs/op | Fast |
GJSON | Convert to map | 1901727 ns/op | 4788894 B/op | 54372 allocs/op | Average |
GJSON | get | 1322167 ns/op | 448 B/op | 1 allocs/op | Average |
jsonparser | get | 233090 ns/op | 1788865 B/op | 376 allocs/op | Fastest |
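The ns/op, B/op, and allocs/op columns are the figures reported by Go's benchmark tooling. As a rough sketch of how such numbers can be collected, the program below measures the two standard library rows with testing.Benchmark. The sample document, the person type, and the shortened 100ms measurement window are assumptions made for this example, and it is not the benchmark code used for the tables; in a real project the same functions would normally live in a _test.go file and be run with go test -bench . -benchmem.

```go
package main

import (
	"encoding/json"
	"flag"
	"fmt"
	"testing"
)

// sample and person are hypothetical inputs made up for this sketch.
var sample = []byte(`{"name":"gopher","age":12,"langs":["go","c"]}`)

type person struct {
	Name  string   `json:"name"`
	Age   int      `json:"age"`
	Langs []string `json:"langs"`
}

func benchParseToMap(b *testing.B) {
	b.ReportAllocs()
	for i := 0; i < b.N; i++ {
		var m map[string]any
		if err := json.Unmarshal(sample, &m); err != nil {
			b.Fatal(err)
		}
	}
}

func benchParseToStruct(b *testing.B) {
	b.ReportAllocs()
	for i := 0; i < b.N; i++ {
		var p person
		if err := json.Unmarshal(sample, &p); err != nil {
			b.Fatal(err)
		}
	}
}

// row pairs an operation name with its measured result.
type row struct {
	name string
	res  testing.BenchmarkResult
}

// runBenchmarks measures both operations and returns one row per operation.
func runBenchmarks() []row {
	testing.Init() // registers the testing flags; has no effect under "go test"
	// Shorten the default 1s measurement window so the sketch finishes quickly.
	if err := flag.Set("test.benchtime", "100ms"); err != nil {
		panic(err)
	}
	return []row{
		{"Parse to map", testing.Benchmark(benchParseToMap)},
		{"Parse to struct", testing.Benchmark(benchParseToStruct)},
	}
}

func main() {
	for _, r := range runBenchmarks() {
		fmt.Printf("%-16s %8d ns/op %6d B/op %4d allocs/op\n",
			r.name, r.res.NsPerOp(), r.res.AllocedBytesPerOp(), r.res.AllocsPerOp())
	}
}
```

Absolute numbers depend heavily on the machine, the Go version, and the input, so results from a sketch like this are only comparable with each other.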
Summary
During this comparison, I analyzed several high-performance JSON parsing libraries. It was evident that these libraries share several common characteristics:
- They avoid using reflection.
- They parse JSON by traversing the bytes of the JSON string sequentially.
- They minimize memory allocation by directly parsing the input JSON string.
- They sacrifice some compatibility for performance.
Despite these trade-offs, each library offers unique features. The fastjson API is the simplest to use; GJSON offers fuzzy searching capabilities and high customizability; jsonparser supports inserting callback functions during high-performance parsing, providing a degree of convenience.
For my use case, which involves simply parsing certain fields from HTTP response JSON strings with predetermined fields and occasional custom operations, jsonparser is the most suitable tool.
Therefore, if performance concerns you, consider selecting a JSON parser based on your business requirements.
Reference
https://github.com/buger/jsonparser
https://github.com/tidwall/gjson
https://github.com/valyala/fastjson
https://github.com/json-iterator/go
https://github.com/mailru/easyjson
https://github.com/Jeffail/gabs
https://github.com/bitly/go-simplejson