Goroutine Leak Detection in Go

 

1. Overview

Goroutine leak detection is an experimental Go feature that uses the garbage collector (GC) to identify leaked goroutines. A goroutine is considered leaked if it is blocked indefinitely on a synchronization primitive (channel, mutex, etc.) that no other part of the program can reach, so nothing can ever wake it.

2. Key Concepts

2.1 Goroutine Leaks

A goroutine leak occurs when:

  • A goroutine is blocked on a synchronization primitive
  • The synchronization primitive becomes unreachable
  • No other goroutine can wake up the blocked goroutine

Common causes:

  • Forgotten channel sends/receives
  • Unclosed channels with pending operations
  • Mutexes held indefinitely
  • WaitGroups not properly handled

2.2 Sudog Structure

The sudog struct is a critical component in Go’s runtime that links goroutines to synchronization primitives:

type sudog struct {
    g              *g
    elem           atomic.Pointer[unsafe.Pointer]
    c              atomic.Pointer[*hchan]
    waitlink       *sudog
    isSelect       bool
    // ... other fields
}

Fields of sudog:

  • g (*g): the goroutine waiting on the synchronization primitive
  • elem (atomic.Pointer[unsafe.Pointer]): the data element being sent or received
  • c (atomic.Pointer[*hchan]): the channel the goroutine is waiting on
  • waitlink (*sudog): the next sudog in the wait queue
  • isSelect (bool): whether this sudog belongs to a select statement

3. Mechanism

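Leak detection runs as a specialized GC cycle. The subsections that follow contrast a normal goroutine with a leaked one, walk through the steps of that cycle, and list the runtime functions involved.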

3.1 Normal Vs Leaked Goroutine

Goroutine lifecycle:

  • Normal goroutine: Created → Running → Completed
  • Leaked goroutine: Created → Running → Blocked on channel → Channel becomes unreachable → Leaked

3.2 Detection Process

  1. GC marking: mark all reachable objects, including goroutines and their stacks, and synchronization primitives.
  2. Sync objects untraceable: make sync objects untraceable, which prevents false marking of leaked objects.
  3. Find leaked goroutines: identify goroutines blocked on unreachable sync primitives.
  4. Mark as leaked: set the goroutine status to _Gleaked and update the internal leak count.
  5. Restore sync objects: make sync objects traceable again, in preparation for the next GC cycle.

3.3 Key Functions

  1. runtime/goroutineLeakGC(): Initiates a GC cycle with leak detection
  2. runtime/setSyncObjectsUntraceable(): Makes synchronization objects untraceable
  3. runtime/findGoroutineLeaks(): Identifies leaked goroutines
  4. runtime/findMaybeRunnableGoroutines(): Filters out runnable goroutines
  5. runtime/gcRestoreSyncObjects(): Restores sync objects to traceable state

4. Usage

4.1 Enabling the Feature

Goroutine leak detection is experimental and disabled by default. To enable it, set GOEXPERIMENT=goroutineleakprofile in the environment when building the program.
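
A program can check at run time whether the profile is present before relying on it. The sketch below uses pprof.Lookup, which returns nil for an unregistered profile name. It assumes the profile is registered under the name "goroutineleak" only when the experiment is enabled; profileAvailable is a hypothetical helper.

```go
package main

import (
	"fmt"
	"os"
	"runtime/pprof"
)

// profileAvailable reports whether a named profile is registered.
func profileAvailable(name string) bool {
	return pprof.Lookup(name) != nil
}

func main() {
	// Assumption: the profile name matches the pprof endpoint name.
	const name = "goroutineleak"
	if !profileAvailable(name) {
		fmt.Println(name, "profile not registered; try building with GOEXPERIMENT=goroutineleakprofile")
		return
	}
	// debug=1 asks for a human-readable text dump.
	if err := pprof.Lookup(name).WriteTo(os.Stdout, 1); err != nil {
		fmt.Fprintln(os.Stderr, "write profile:", err)
	}
}
```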

4.2 Pprof Integration

The feature integrates with Go’s pprof tool through the /debug/pprof/goroutineleak endpoint:

Request flow:

  1. Client (wget, curl, or a browser) sends an HTTP request to /debug/pprof/goroutineleak.
  2. Server: the HTTP server passes the request to the net/http/pprof handler.
  3. Runtime: the handler calls, in order, runtime_goroutineLeakGC(), findGoroutineLeaks(), runtime_goroutineleakcount(), and writeGoroutineLeak().
  4. The result is returned to the client in the HTTP response.

4.2.1 Command Line

go tool pprof http://localhost:6060/debug/pprof/goroutineleak

4.2.2 Programmatic Usage

package main

import (
    "log"
    "net/http"
    _ "net/http/pprof" // registers the /debug/pprof/ handlers on the default mux
)

func main() {
    // Start the pprof server in the background.
    go func() {
        log.Println(http.ListenAndServe("localhost:6060", nil))
    }()

    // ... your application code
}

5. Implementation Details

5.1 Runtime Changes

The feature requires several changes to the Go runtime:

  1. Atomic Pointers in Sudog: The elem and c fields of sudog are now atomic pointers to support safe concurrent access.

  2. Specialized GC Cycle: A new type of GC cycle is introduced for leak detection that:

    • Marks all reachable objects
    • Makes synchronization objects untraceable
    • Identifies goroutines blocked on unreachable primitives
    • Marks these goroutines as leaked
  3. Goroutine Status: A new goroutine status _Gleaked is added to mark leaked goroutines.

5.2 Pprof Integration

The pprof package is extended to support the new “goroutineleak” profile type, which:

  • Runs a GC cycle with leak detection
  • Collects stack traces of leaked goroutines
  • Formats the output in the standard pprof format

6. Limitations

  • The feature is experimental and disabled by default
  • It only detects goroutines blocked on synchronization primitives
  • It may have performance overhead during the specialized GC cycle
  • It’s not guaranteed to detect all types of goroutine leaks

7. Future Directions

  • Improve accuracy and performance
  • Add support for detecting more types of leaks
  • Integrate with other Go tools
  • Make the feature enabled by default

8. Conclusion

Goroutine leak detection targets a common source of unbounded resource growth in Go applications: goroutines that block forever. By using the garbage collector to find goroutines blocked on unreachable synchronization primitives, it provides an automated way to locate leaked goroutines so they can be fixed.

The feature is still experimental but shows great promise for improving the reliability and performance of Go applications.


Note: This document describes an experimental feature in Go. The implementation details and API may change in future versions of Go.

Licensed under CC BY-NC-SA 4.0