Background
In the previous article, we learned that panic can occur in three ways:
- Initiated by developers: by calling the panic() function.
- Generated by the compiler: for example, in the case of division by zero.
- Sent to the process by the kernel: for example, in the case of an illegal memory access.
This article was first published on Medium under the MPP plan.
All three cases can be reduced to a call to the panic() function, which shows that a panic in Go is just a special kind of function call handled at the language level.
Now that we know how a panic is triggered, the next step is to understand how it is handled. When I first started learning Go, I often had some questions in my mind:
- What exactly is panic? Is it a struct or a function?
- Why does panic cause the Go process to exit?
- Why does recover need to be placed inside a defer statement to take effect?
- Even if recover is placed inside a defer statement, why doesn’t the process recover?
- Why is it possible to panic again after a panic? What are the implications?
Today, we will delve into the code to clarify these questions.
Based on Go 1.21.4
The _panic Data Structure
Let’s start by looking at an example of an actively triggered panic. Through the assembly code, we can see that the panic call is made to the runtime.gopanic function, which contains a crucial data structure called _panic.
Let’s take a look at the _panic data structure:
|
|
Key Fields to Focus On:
- link: A pointer to the _panic structure, indicating that _panic can form a unidirectional linked list, similar to the _defer list.
- recovered: The recovery of the so-called _panic depends on whether this field is set to true. recover() actually modifies this field.
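Now let’s take a look at two important fields in the g structure: _defer, which points to the head of the goroutine’s _defer linked list, and _panic, which points to the head of its _panic linked list.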
From here, we can see that both the _defer and _panic linked lists are attached to the goroutine. When can there be multiple elements on the _panic linked list? The answer is when a panic() call is made within a defer function. Only in a defer function can an _panic linked list be formed, because once a panic is in progress, the only user code the panic() function still executes is the _defer functions!
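To make the linked-list idea concrete, here is a minimal sketch in plain Go. The types toyPanic and toyG and the helper methods are hypothetical stand-ins invented for illustration; they are not the runtime's definitions and keep only the fields discussed above.

```go
package main

import "fmt"

// toyPanic is a simplified, hypothetical stand-in for the runtime's _panic.
type toyPanic struct {
	arg       any       // value passed to panic()
	link      *toyPanic // earlier panic that was still in progress
	recovered bool      // set once recover() has handled this panic
}

// toyG is a hypothetical stand-in for the goroutine structure g.
type toyG struct {
	panics *toyPanic // head of the panic list, most recent panic first
}

// pushPanic links a new panic in front of the current head,
// which is what happens when a deferred function panics again.
func (gp *toyG) pushPanic(arg any) *toyPanic {
	p := &toyPanic{arg: arg, link: gp.panics}
	gp.panics = p
	return p
}

// panicArgs walks the list from the newest panic to the oldest.
func (gp *toyG) panicArgs() []any {
	var args []any
	for p := gp.panics; p != nil; p = p.link {
		args = append(args, p.arg)
	}
	return args
}

func main() {
	gp := &toyG{}
	gp.pushPanic("panic 1") // the original panic
	gp.pushPanic("panic 2") // a panic raised inside a deferred function
	fmt.Println(gp.panicArgs()...) // panic 2 panic 1
}
```

The newest panic always sits at the head of the list, so anything that inspects the head sees the most recent panic first.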
The recover() Function
For the sake of explanation, let’s start with a simple analysis of what the recover() function does:
|
|
The recover() function corresponds to the gorecover function implementation in the runtime/panic.go file.
The gorecover Function
|
|
This function is quite simple:
- Retrieve the current goroutine structure.
- Get the latest _panic from the _panic linked list of the current goroutine. If it is not nil, proceed with the processing.
- Set the recovered field of the _panic structure to true and return the arg field.
That’s all there is to the recover() function. It simply sets the value of the recovered field in the _panic structure and does not involve any magical code jumps. The assignment to recovered only takes effect later, within the logic of the panic() function.
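The three steps can be sketched with toy types. This is an assumption-laden model, not the runtime source: toyPanic, toyG, and toyGorecover are hypothetical names, and the real gorecover performs additional checks that are left out here.

```go
package main

import "fmt"

// toyPanic and toyG are hypothetical stand-ins for _panic and g.
type toyPanic struct {
	arg       any
	link      *toyPanic
	recovered bool
}

type toyG struct {
	panics *toyPanic // most recent panic, or nil if none is in progress
}

// toyGorecover models the steps described above: look at the newest panic of
// the goroutine, mark it as recovered, and hand back its argument.
// It only flips a flag; it performs no jump of any kind.
func toyGorecover(gp *toyG) any {
	p := gp.panics
	if p != nil && !p.recovered {
		p.recovered = true
		return p.arg
	}
	return nil
}

func main() {
	gp := &toyG{}
	fmt.Println(toyGorecover(gp)) // <nil>: no panic in progress

	gp.panics = &toyPanic{arg: "test"}
	fmt.Println(toyGorecover(gp))    // test
	fmt.Println(gp.panics.recovered) // true
}
```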
The panic() Function
Based on the previous assembly code, we know that the panic call is made to the runtime.gopanic function.
The gopanic Function
The most important part of the panic mechanism is the gopanic function. All the details about panic are in this function. The complexity of understanding panic lies in two points:
- Recursive execution of gopanic when panic is nested.
- The program counter (pc) and stack pointer (sp) are not manipulated in the usual way of function call and return. During recovery, the instruction registers are modified directly, bypassing the remaining logic of gopanic and even the recursive logic of multiple gopanic calls.
We can understand the logic of gopanic by dividing it into two parts: inside the loop and outside the loop.
Inside the Loop
The actions inside the loop can be divided into the following steps:
- Iterate through the defer linked list of the goroutine and retrieve a _defer deferred function.
- Set the d.started flag and bind the current _panic to d._panic (used for recursive detection).
- Execute the _defer deferred function.
- Remove the executed _defer function from the linked list.
- Check if the recovered field of the _panic structure is set to true and take appropriate action.
  - If it is true, reset the pc and sp registers (usually starting from the deferreturn instruction) and enqueue the goroutine in the scheduler for execution.
- Repeat the above steps.
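The loop described above can be modeled in ordinary Go. This is only a sketch under simplifying assumptions: toyPanic, toyDefer, toyG, and toyGopanic are hypothetical names, the deferred function receives the panic explicitly so it can mark it as recovered, and nested panics and the pc/sp jump are not modeled (the function simply returns a boolean instead).

```go
package main

import "fmt"

// toyPanic is a hypothetical stand-in for _panic.
type toyPanic struct {
	arg       any
	recovered bool
}

// toyDefer is a hypothetical stand-in for _defer.
type toyDefer struct {
	fn      func(p *toyPanic) // the deferred function
	started bool              // set just before fn runs
	link    *toyDefer         // next deferred function to run
}

// toyG is a hypothetical stand-in for g.
type toyG struct {
	defers *toyDefer
}

// deferFunc registers fn at the head of the list, so it runs first (LIFO).
func (gp *toyG) deferFunc(fn func(p *toyPanic)) {
	gp.defers = &toyDefer{fn: fn, link: gp.defers}
}

// toyGopanic returns true if a deferred function recovered the panic, and
// false if the defer list ran out, which is where the process would exit.
func (gp *toyG) toyGopanic(arg any) bool {
	p := &toyPanic{arg: arg}
	for {
		d := gp.defers // 1. take the next deferred function
		if d == nil {
			return false
		}
		d.started = true   // 2. mark it as started
		d.fn(p)            // 3. execute it
		gp.defers = d.link // 4. remove it from the list
		if p.recovered {   // 5. check the recovered flag
			return true
		}
	}
}

func main() {
	gp := &toyG{}
	gp.deferFunc(func(p *toyPanic) { fmt.Println("defer 1 (not reached by the loop)") })
	gp.deferFunc(func(p *toyPanic) {
		fmt.Println("defer 2 recovers:", p.arg)
		p.recovered = true
	})
	fmt.Println("recovered:", gp.toyGopanic("test"))
}
```

In this model the first-registered function stays on the list after recovery; in the text above, running the remaining deferred functions is the job of the deferreturn path.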
Questions to Consider
You may notice that the recovered field is only modified during the third step. It cannot be modified anywhere else.
Question 1: Why does recover need to be placed inside a defer statement to take effect?
Because that’s the only opportunity!
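Let’s consider a few straightforward examples.
In the first example, the function calls panic() and then calls recover() on the following line, so recover() comes after panic(). Why does it still panic? Because the recover() function is never executed: panic() runs the deferred functions of the goroutine (there are none here) and then terminates the process, so control never reaches the recover() line.
Someone might argue, “What if I put recover() before panic("test")?”
No, it won’t work, because when recover() is executed, the _panic is not attached to the goroutine yet. recover() finds nothing to mark as recovered and returns nil, so it is useless in this case.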
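The difference is easy to observe from user code. The following sketch uses two hypothetical helper functions: one calls recover() while no panic is in progress, the other calls it from a deferred function while a panic is in progress.

```go
package main

import "fmt"

// recoverBeforePanic calls recover() while no panic is in progress,
// so there is nothing to recover and the result is nil.
func recoverBeforePanic() any {
	return recover()
}

// recoverInDefer panics and calls recover() from a deferred function, which
// runs while the panic is in progress and therefore sees its value.
func recoverInDefer() (got any) {
	defer func() {
		got = recover()
	}()
	panic("test")
}

func main() {
	fmt.Println(recoverBeforePanic()) // <nil>
	fmt.Println(recoverInDefer())     // test
}
```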
Question 2: Even if recover is placed inside a defer statement, why doesn’t the process recover?
Let’s recall the operations in the for loop of gopanic: it walks the _defer linked list of the current goroutine and executes each deferred function in turn.
Key Point: In the gopanic function, only the _defer functions of the current goroutine are executed. Therefore, if a recover() is performed in a defer function attached to another goroutine, it will have no effect.
Let’s consider an example in which one goroutine (g1) panics while the deferred recover() is registered in a different goroutine (g2).
Since the panic and recover are in two different goroutines, the _panic is attached to g1, and the recover in g2’s _defer chain cannot access the _panic structure of g1. Therefore, it cannot set the recovered field to true, and the program still crashes.
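A practical consequence is that the deferred recover() has to be registered inside the goroutine that may panic. Here is a minimal sketch; safeGo is a hypothetical helper name, not a standard library function.

```go
package main

import "fmt"

// safeGo runs f in a new goroutine and recovers in that same goroutine.
// The returned channel receives the recovered value (nil if f did not panic).
func safeGo(f func()) <-chan any {
	done := make(chan any, 1)
	go func() {
		// This deferred function is on the new goroutine's own defer list,
		// so it runs when f panics.
		defer func() {
			done <- recover()
		}()
		f()
	}()
	return done
}

func main() {
	// A deferred recover() registered here in main would not help: the panic
	// happens in the other goroutine, and only that goroutine's deferred
	// functions are executed.
	got := <-safeGo(func() { panic("test") })
	fmt.Println("recovered in the worker goroutine:", got)
}
```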
Question 3: Why is it possible to panic again after a panic? What are the implications?
This is actually quite easy to understand. Some people might overthink it. Can we call panic() recursively? Yes, we can.
The scenario is usually as follows:
- The gopanic function calls a _defer function.
- The _defer function calls panic() or gopanic().
This is just a simple function recursion, and there’s nothing special about it. In this scenario, an _panic linked list will be formed starting from gp._panic. The instructions executed by gopanic are special in two ways:
- If the _panic structure is set to recovered, the pc and sp registers are reset, bypassing gopanic (including nested function stacks), and jumping directly to the instruction to be executed after the defer function (deferreturn).
- If there is no handler for the _panic data, exit the process and terminate the execution of subsequent instructions.
Let’s look at an example of nested panic:
|
|
The function execution is as follows:
|
|
Here’s another example for comparison:
|
|
Will this function print the stack trace and exit?
The answer is no. The output will be:
|
|
Did you guess it correctly? Let me explain the complete route:
|
|
Here’s another example:
|
|
Will this function print the stack trace and exit?
The answer is yes.
The output will be:
|
|
The execution path is as follows:
|
|
Did you guess correctly?
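As a hedged illustration of how the position of recover() relative to a nested panic changes the outcome, consider the following sketch. The function names are hypothetical, and the second case is wrapped in a helper that catches the escaping panic so that the program can run to completion; without that wrapper the process would terminate.

```go
package main

import "fmt"

// nestedThenRecover: a deferred function panics again, and a deferred
// function that runs afterwards recovers. recover() sees the newest panic.
func nestedThenRecover() (got any) {
	defer func() { got = recover() }()  // registered first, runs second
	defer func() { panic("panic 2") }() // registered second, runs first
	panic("panic 1")
}

// recoverThenPanic: the first deferred function to run recovers "panic 1",
// then the remaining deferred function raises "panic 2", and no deferred
// function is left in this function to recover it.
func recoverThenPanic() {
	defer func() { panic("panic 2") }() // registered first, runs second
	defer func() { recover() }()        // registered second, runs first
	panic("panic 1")
}

// escaped reports the panic value that leaves recoverThenPanic.
func escaped() (got any) {
	defer func() { got = recover() }()
	recoverThenPanic()
	return nil
}

func main() {
	fmt.Println(nestedThenRecover()) // panic 2
	fmt.Println(escaped())           // panic 2
}
```

In the first function the nested panic is handled inside the function itself. In the second, recovery comes too early to help with the panic raised by the later deferred function.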
The recovery Function
Finally, let’s take a look at the crucial recovery function. In the gopanic function, when executing the defer functions in the loop, if the recovered field of the _panic structure is set to true, mcall(recovery) is called to perform the so-called “recovery.”
Let’s take a look at the implementation of the recovery function. It’s a very simple function that resets the pc and sp registers and reschedules the goroutine for execution.
|
|
What does resetting the pc and sp registers mean? The pc register holds the address of the next instruction to execute, so overwriting it makes execution jump to another location instead of continuing sequentially after gopanic. The jump target is _defer.pc, so where does this address point?
For this, let’s recall the chapter on defer. When a deferred function is registered, it corresponds to a _defer structure. When this structure is created, the _defer.pc field is assigned the address of the instruction that follows the call to the function that creates it. This was explained in detail in the chapter on “In-Depth Analysis of Defer”.
For example, if the _defer is allocated on the stack, that function is deferprocStack. So mcall(recovery) jumps to the instruction right after that call, and the subsequent logic follows the deferreturn logic, executing the remaining _defer function chain.
That concludes the explanation of panic. It is just a function call; the only thing that sets it apart is the unusual instruction jumps it performs.