What

Learn how to shut down Go HTTP servers so as to release resources properly.

Why

We started writing Go in 2014 for web services to support our mobile app. By “writing”, I mean slapping together code based on online blog posts and overly simplistic or out-of-date tutorials. What joy it was to crank out relatively straightforward, working code that replaced our frustrating Node/Express servers on production within a few days!

(This is a testament to Go’s thoughtful design, although the propensity for JavaScript promise chains to quietly swallow errors contributed some fuel to my fire for Go.)

Over time, as more and more of our dependencies required using context, I realized it was time to grow up and move on from defaulting to context.WithTimeout() everywhere. While I did implement graceful shutdown using the http.Server method Shutdown(), along with a channel that waited until SIGINT was received to initiate shutdown, that design would not support a proper use of a context that could manage the lifetime of the entire server including its requests and other resources.

I spent several hours searching and experimenting with different approaches, and have shared some of what I learned below.

Step by step

To be as easy to read as possible, I have taken shortcuts in my code samples, such as leaving errors unhandled.

A simple use of http.Server

This starts a server and tries to print to stdout to describe what happens along the way.

func main() {
        srv := http.Server{Addr: ":8080"}

        // Verify that writing to stdout works after the server ends.
        defer func() {fmt.Println("Stopped.")}()

        // Ensure that writing to stdout works before starting the server.
        fmt.Println("Starting…")

        // ListenAndServe blocks execution until the Go runtime responds
        // to a SIGINT or SIGQUIT (or the like) to exit.
        srv.ListenAndServe()
}

After running the program, we Control-C to exit.

$ go run .
Starting…
^Csignal: interrupt

So clearly we can write to stdout successfully, but when the Go runtime detects the SIGINT it immediately exits the program. Nothing else prints. Even the deferred function doesn’t appear to run.

Listening for the signal to shut down

Rather than letting the Go runtime handle SIGINT and immediately terminate all server activity ungracefully, let’s start by piping the signal to a channel. We can then use the channel to control how the server shuts down.

func main() {
    quit := make(chan os.Signal, 1)

    // Send to the channel when SIGINT is detected.
    // This replaces the runtime's default behavior of exiting on SIGINT.
    signal.Notify(quit, os.Interrupt)

    // Blocks execution until channel receives the signal.
    <-quit

    // Execution is unblocked and program exits.
}

Great! This gives us a way to control when and how to release resources and shut down the server.

Notifying the server to shut down gracefully

Here we are using ListenAndServe() to start the server, but we need a way to shut down the server. The http.Server type offers Close(), which ungracefully closes all connections immediately, and Shutdown(), which allows the server to wait for active connections to complete before closing them.

Since ListenAndServe() blocks further execution, we’ll need an independent flow of execution that can initiate the server shutdown. Let’s use a goroutine.

func main() {
    srv := http.Server{Addr: ":8080"}

    // Use a goroutine to prepare for shutdown because ListenAndServe()
    // will block, preventing any attempt to shutdown after it in the
    // program flow.
    go func() {
        srv.ListenAndServe()
        fmt.Println("Server stopped listening.")
    }()

    // Block and wait for SIGINT before initiating graceful shutdown.
    quit := make(chan os.Signal, 1)
    signal.Notify(quit, os.Interrupt)
    <-quit

    // Initiates shutdown of HTTP server.
    srv.Shutdown(context.TODO())
    fmt.Println("Server shutdown complete.")
}

Running this and pressing Control-C to send the interrupt results in output like:

^CServer stopped listening.
Server shutdown complete.

Initially, I called Shutdown() in the anonymous function and put ListenAndServe() in the main goroutine. That caused the program to exit as soon as ListenAndServe() returned, without waiting for shutdown to finish. The docs warn about exactly this:

When Shutdown is called, Serve, ListenAndServe, and ListenAndServeTLS immediately return ErrServerClosed. Make sure the program doesn’t exit and waits instead for Shutdown to return.

This leads us to the next issue: cleaning up resources after all the connections have been closed and the server is shut down.

Where should resources be released?

Because Shutdown() immediately unblocks ListenAndServe(), any code after ListenAndServe() will run even if shutdown has not finished. An active request that depends on some resource may fail if that resource is released before the request uses it, and we end up shutting down ungracefully again.

Not RegisterOnShutdown()

While there actually is a function called RegisterOnShutdown() and it initially appeared to be the right place to release resources, there are two reasons why I avoid it here.

  1. The registered functions are called concurrently while connections are closing. Using the registered functions to release resources that the open connections depend on would result in ungraceful behavior.
  2. Shutdown() does not wait for registered functions to complete, so the program could exit before resources have been released.

The http package Godoc recommends using RegisterOnShutdown to notify connections to shut down.

The caller of Shutdown should separately notify such long-lived connections of shutdown and wait for them to close, if desired. See RegisterOnShutdown for a way to register shutdown notification functions.

Using contexts to ensure shutdown

A Go context is used to manage the lifetime of some goroutine. It’s part of the standard library, but not part of the language, in case you’re wondering why it can seem more explicit than necessary.

func main() {
    ctx, cancel := context.WithTimeout(context.TODO(), 5 * time.Second)

    // Ensure that we release any resources before exiting.
    defer cancel()

    // Blocks until the timeout occurs.
    <-ctx.Done()
}

You can derive a new context from an existing one, forming a tree in which canceling a context also cancels every context derived from it. This code sample behaves the same as the previous one:

func main() {
    ctx, cancel := context.WithCancel(context.TODO())
    defer cancel()

    go func() {
        time.Sleep(5 * time.Second)
        cancel()
    }()

    // Blocks until cancel is called.
    <-ctx.Done()
}

Request contexts, returned by Request.Context(), are derived from the server’s base context, which is supplied by the Server.BaseContext function. The default base context does not support cancellation or timeout, but we can supply a function that returns one that does, which affects all the request contexts for that server.

func main() {
    start := time.Now()
    defer func() {
        fmt.Printf("Completed in %v\n", time.Since(start))
    }()

    ctx, cancel := context.WithCancel(context.TODO())
    defer cancel()

    srv := http.Server{
        Addr:        ":8080",
        BaseContext: func(l net.Listener) context.Context { return ctx },
    }

    http.HandleFunc("/slowRequest", func(rw http.ResponseWriter, r *http.Request) {
        // This request's context is created by the server
        // base context constructor.
        select {
        case <-time.After(10 * time.Second):
        case <-r.Context().Done():
        }
    })

    http.HandleFunc("/cancel", func(rw http.ResponseWriter, r *http.Request) {
        // This cancels the server's base context, which then cancels all
        // the requests whose contexts derive from it, including the
        // slow request handler above.
        cancel()
    })

    go func() {
        srv.ListenAndServe()
    }()

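    go func() {
        time.Sleep(time.Second)
        http.Get("http://:8080/slowRequest")
    }()

    // Block until SIGINT so the server stays up long enough to
    // receive requests.
    quit := make(chan os.Signal, 1)
    signal.Notify(quit, os.Interrupt)
    <-quit

    srv.Shutdown(context.TODO())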
}

When we start the server and send a request to /slowRequest, we’ll notice that the request doesn’t return a response right away. curl is left waiting. A request to /cancel then cancels the base context from which the /slowRequest request context is derived, and the waiting request returns.

When we call /cancel first and then /slowRequest, the request context’s Done() channel is already closed, so the handler does not block and the server can proceed to shut down.
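The same behavior can be checked without curl. The sketch below uses the standard library’s httptest package to start a throwaway server whose base context we control; handlerSeesCancel is a hypothetical name, and the five-second fallback exists only so the demo cannot hang.

```go
package main

import (
	"context"
	"fmt"
	"net"
	"net/http"
	"net/http/httptest"
	"time"
)

// handlerSeesCancel (hypothetical helper) reports whether a handler's
// request context is canceled when the server's base context is canceled.
func handlerSeesCancel() bool {
	base, cancel := context.WithCancel(context.Background())
	defer cancel()

	started := make(chan struct{}, 1)
	canceled := make(chan bool, 1)
	handler := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		started <- struct{}{}
		select {
		case <-r.Context().Done():
			canceled <- true
		case <-time.After(5 * time.Second):
			canceled <- false
		}
	})

	ts := httptest.NewUnstartedServer(handler)
	ts.Config.BaseContext = func(net.Listener) context.Context { return base }
	ts.Start()
	defer ts.Close()

	go func() {
		if resp, err := http.Get(ts.URL); err == nil {
			resp.Body.Close()
		}
	}()

	<-started // the request is now in flight
	cancel()  // cancel the base context, not the request itself
	return <-canceled
}

func main() {
	fmt.Println("handler saw cancellation:", handlerSeesCancel())
}
```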

Using contexts to cancel things beyond this server

Many of the services that your server may depend on take a context to manage the lifetime of a request to the service. See examples:

Not only do these clients initiate a connection using a context, they tend to use contexts for managing the lifetime of all operations, including closing the connection.

Therefore, canceling a base context can propagate cancellation to the request, and then on to some external dependency through the client.

A real example is the MongoDB Go driver, which takes advantage of contexts by detecting a cancellation, and using it to immediately close this query’s database connection instead of waiting for the blocking query to return.

If we want to wait until the connections have all closed before releasing a resource, we should put the code that releases it after the call to Shutdown().

After Shutdown() returns

When Shutdown() returns, we can trust that all the connections have closed, so no connections would still be using that resource, and it can be safely freed.

type client struct{}

func (c client) Close() {
    time.Sleep(time.Second)
    fmt.Println("Client closed.")
}

func main() {
    srv := http.Server{Addr: ":8080"}
    resource := client{}

    go func() {
        srv.ListenAndServe()
        fmt.Println("Server stopped listening.")
    }()

    quit := make(chan os.Signal, 1)
    signal.Notify(quit, os.Interrupt)
    <-quit

    srv.Shutdown(context.TODO())
    resource.Close()

    fmt.Println("Server shutdown complete.")
}

Ensuring complete shutdown

What if the server is unresponsive and its connections never close? The previous code snippet does not release resources if Shutdown() does not return, which can happen. For example, Kubernetes will send SIGTERM and wait for some time before sending SIGKILL, as defined by terminationGracePeriodSeconds in the Kubernetes pod spec.

You can imagine the trouble if our server is restarted over and over, all the while reserving resources and not releasing them until they time out (instead of eager notification of cancellation).

How do we cancel the connections that refuse to close even though shutdown was initiated for some time? Let’s look at contexts.

How much grace is too much?

When given a context that never expires, such as context.TODO(), the Shutdown() method blocks indefinitely until all the server’s active connections become idle and close. Should we specify a timeout for how long to wait? Kubernetes already has a termination policy that we can specify outside of the program. If we rely on it alone, however, resources are not released eagerly; they linger until they time out on their own.

Recap

We want an HTTP server that can shut down gracefully.

  1. Shutdown should commence upon receiving SIGINT (using Control-C in a local development environment) or SIGTERM (used by Kubernetes and Docker Compose) to stop a container.
  2. Open connections should be allowed to complete within a certain amount of time before resources are released.
  3. Resources should be given a certain amount of time to be released before the process is killed.

type client struct{}

func (c client) SlowQuery(ctx context.Context) {
    select {
    case <-time.After(10 * time.Second):
        log.Println("query completed")
    case <-ctx.Done():
        log.Println("query canceled")
    }
    return
}

func (c client) Release() {
    log.Println("resource released.")
}

func main() {
    start := time.Now()
    defer func() {
        fmt.Printf("Completed in %v\n", time.Since(start))
    }()

    ctx, cancel := context.WithCancel(context.TODO())
    defer cancel()

    srv := http.Server{
        Addr:        ":8080",
        BaseContext: func(l net.Listener) context.Context { return ctx },
    }
    srv.RegisterOnShutdown(func() { time.Sleep(4 * time.Second); cancel() })

    http.HandleFunc("/slowOperation", func(rw http.ResponseWriter, r *http.Request) {
        // This request's context is created by the server
        // base context constructor.
        slowness := 10 * time.Second
        fmt.Printf("starting slow request. should take %v\n", slowness)
        select {
        case <-time.After(slowness):
        case <-r.Context().Done():
        }
    })

    http.HandleFunc("/cancel", func(rw http.ResponseWriter, r *http.Request) {
        // This cancels the server's base context, which then cancels all
        // the requests whose contexts derive from it, including the
        // slow operation handler above.
        cancel()
    })

    go func() {
        srv.ListenAndServe()
    }()

    // Wait to ensure the server is listening before sending a request.
    time.Sleep(time.Second)

    wg := sync.WaitGroup{}
    wg.Add(1)

    // Send a request that should keep the connection active for a while,
    // blocking graceful shutdown.
    go func() {
        http.Get("http://:8080/slowOperation")
        wg.Done()
    }()

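    // Give the request a moment to reach the handler, then begin
    // shutdown while it is still in flight.
    time.Sleep(time.Second)
    srv.Shutdown(context.TODO())
    wg.Wait()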
}

Goodbye

The amount of Go code required to add specific server features seems not much more than what a DSL might need to define that behavior. While it could be helpful to implement a robust server and share it as a reusable library or framework, that would still need documentation to explain how it works and seems quite brittle.

In the end, understanding the capabilities of the standard tools available will be more beneficial than introducing yet another dependency.