Featured image of post Go High-Performance Programming EP7: Use singleflight To Merge The Same Request

Go High-Performance Programming EP7: Use singleflight To Merge The Same Request

Understanding Cache Penetration and Mitigation Strategies

 

Using cached database data to accelerate queries is common in developing business applications. However, this architecture presents challenges like cache penetration, avalanche, and cache breakdown. Cache breakdown occurs in high-concurrency systems when numerous requests simultaneously query an expired key, leading to a surge in database requests and significantly increasing the database load. If we can combine multiple identical requests into one within a short period, the database load can be reduced from N requests to just 1.

This article was first published in the Medium MPP plan. If you are a Medium user, please follow me on Medium. Thank you very much.

SingleFlight is a concurrency primitive that addresses this problem. This article introduces the usage and basic principles of SingleFlight and details its implementation.

Figure 1: Merging Identical Requests with SingleFlight

Pasted image 20240723160249

Using SingleFlight

The Go language team has placed singleflight in the golang.org/x directory, indicating that it may undergo future changes. This package provides a mechanism to suppress redundant function calls, allowing for the prevention of simultaneous identical function calls. If a function is called once, subsequent duplicate calls wait, and once the first call completes, all calls receive the result. Thus, although the function is executed only once, all callers get the final result.

Listing 1: Merging Database Requests with SingleFlight

https://gist.github.com/hxzhouh/bf7d6276d6a703628abbd812964c3081

As shown, we simulated five requests with a cache miss, but only one request accessed the database. All five requests received the same result. This optimization is significant for backend services as it prevents the database from being overwhelmed by cache penetration.

In the Go source code, you can see the use of SingleFlight, for example, in /src/net/lookup.go#L165 and /src/cmd/go/internal/vcs/vcs.go#L1385. The caching library groupcache also uses a similar implementation to prevent cache penetration.

SingleFlight Analysis

The singleflight package defines a structure type named Group, representing a category of work and forming a namespace where duplicate suppression can execute work units. The primary data structure of SingleFlight is Group, which provides three methods:

Pasted image 20240723162629

  • Do: This method executes a function and returns its result. You need to provide a key; for the same key, only one function executes at a time, while concurrent requests wait. The result of the first executed request is the return result. The fn function is parameterless and returns a result or error, while the Do method returns the function’s result or error, and shared indicates whether v is returned to multiple requests.
  • DoChan: Similar to the Do method, but returns a channel. Once the fn function completes and produces a result, it can be received from this channel.
  • Forget: Instructs the Group to forget a key, allowing future requests for this key to execute fn instead of waiting for the result of a previously incomplete fn function.

Implementation Details

Request Blocking

Internally, singleflight uses WaitGroup to block all subsequent requests with the same key except the first one. Other requests are returned only after the first request is executed and returns from fn. If fn takes a long time to execute, all subsequent requests will be blocked. In this scenario, you can use DoChan with time.After + select for timeout control.

list2: Use doChan to control the timeout.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
// list2: user doChan  
func doChan(name string) User {  
  
    result := g.DoChan("user", func() (interface{}, error) {  
       data := GetUserFromDB(name)  
       return data, nil  
    })  
    select {  
    case v := <-result:  
       log.Println("get user from db")  
       return v.Val.(User)  
    case <-time.After(1 * time.Second):  
       log.Println("timeout")  
       return User{}  
    }  
}

Forget

The singleflight implementation dictates that if the first request fails, all waiting requests return the same error. To address this, you can periodically forget a key based on database usage, allowing more requests to reach subsequent logic.

1
2
3
4
go func() {
       time.Sleep(100 * time.Millisecond)
       g.Forget(name)
   }()

For example, if 100 requests come in within one second, normally, only the first request executes GetUserFromDB, while the subsequent 99 requests are blocked. By adding this Forget mechanism, one request executes GetUserFromDB every 100ms, providing multiple attempts and placing a greater load on the database. This trade-off needs to be carefully considered based on the specific scenario.

Last updated on Jul 23, 2024 16:47 CST
Built with Hugo
Theme Stack designed by Jimmy