After a recent web service deployment, the server accumulated a large number of connections in the TIME_WAIT state, each tying up a TCP port. It took several days to track down the cause.

I concluded earlier that HTTP keep-alive is, at the application layer, a sliding renewal that multiplexes a single TCP connection. As long as the client and server keep renewing it in time, the connection becomes a genuinely long-lived one.

Today virtually all HTTP libraries (client and server alike) enable HTTP keep-alive by default, and use the Connection header of the request/response to negotiate connection reuse.

Short connections caused by an unconventional setup

In one of my projects, for historical reasons, the client disables keep-alive while the server has it enabled by default. As a result, the negotiation to reuse the connection fails.

The client forcibly disables keep-alive

```go
package main

import (
	"fmt"
	"io/ioutil"
	"log"
	"net/http"
	"time"
)

func main() {
	tr := http.Transport{
		DisableKeepAlives: true, // force short connections
	}
	client := &http.Client{
		Timeout:   10 * time.Second,
		Transport: &tr,
	}
	for {
		requestWithClose(client)
		time.Sleep(time.Second * 1)
	}
}

func requestWithClose(client *http.Client) {
	resp, err := client.Get("http://10.100.219.9:8081")
	if err != nil {
		fmt.Printf("error occurred while fetching page, error: %s", err.Error())
		return
	}
	defer resp.Body.Close()
	c, err := ioutil.ReadAll(resp.Body)
	if err != nil {
		log.Fatalf("Couldn't parse response body. %+v", err)
	}
	fmt.Println(string(c))
}
```

Keep-alive is enabled on the Web server by default

```go
package main

import (
	"fmt"
	"log"
	"net/http"
)

func IndexHandler(w http.ResponseWriter, r *http.Request) {
	fmt.Println("receive a request from:", r.RemoteAddr, r.Header)
	w.Write([]byte("ok"))
}

func main() {
	fmt.Printf("Starting server at port 8081\n")
	// net/http enables persistent connections (keep-alive) by default
	if err := http.ListenAndServe(":8081", http.HandlerFunc(IndexHandler)); err != nil {
		log.Fatal(err)
	}
}
```

The server log confirms that these are indeed short connections: every request arrives from a new client port and carries Connection: close.

```
receive a request from: 10.22.38.48:54722 map[Accept-Encoding:[gzip] Connection:[close] User-Agent:[Go-http-client/1.1]]
receive a request from: 10.22.38.48:54724 map[Accept-Encoding:[gzip] Connection:[close] User-Agent:[Go-http-client/1.1]]
receive a request from: 10.22.38.48:54726 map[Accept-Encoding:[gzip] Connection:[close] User-Agent:[Go-http-client/1.1]]
receive a request from: 10.22.38.48:54728 map[Accept-Encoding:[gzip] Connection:[close] User-Agent:[Go-http-client/1.1]]
receive a request from: 10.22.38.48:54731 map[Accept-Encoding:[gzip] Connection:[close] User-Agent:[Go-http-client/1.1]]
receive a request from: 10.22.38.48:54733 map[Accept-Encoding:[gzip] Connection:[close] User-Agent:[Go-http-client/1.1]]
receive a request from: 10.22.38.48:54734 map[Accept-Encoding:[gzip] Connection:[close] User-Agent:[Go-http-client/1.1]]
receive a request from: 10.22.38.48:54738 map[Accept-Encoding:[gzip] Connection:[close] User-Agent:[Go-http-client/1.1]]
receive a request from: 10.22.38.48:54740 map[Accept-Encoding:[gzip] Connection:[close] User-Agent:[Go-http-client/1.1]]
receive a request from: 10.22.38.48:54741 map[Accept-Encoding:[gzip] Connection:[close] User-Agent:[Go-http-client/1.1]]
receive a request from: 10.22.38.48:54743 map[Accept-Encoding:[gzip] Connection:[close] User-Agent:[Go-http-client/1.1]]
receive a request from: 10.22.38.48:54744 map[Accept-Encoding:[gzip] Connection:[close] User-Agent:[Go-http-client/1.1]]
receive a request from: 10.22.38.48:54746 map[Accept-Encoding:[gzip] Connection:[close] User-Agent:[Go-http-client/1.1]]
```

Which side initiates the close?

I took it for granted that the client was the one actively closing the connections, and reality slapped me in the face.

One day, an alert about more than 300 TIME_WAIT sockets on the server told me that it was in fact the server actively terminating the connections.

In a normal TCP four-way close, the side that actively disconnects enters the TIME_WAIT state and waits 2MSL before releasing the occupied socket.

The following TCP connection trace was captured on the server with tcpdump.

Red boxes 2 and 3 show a FIN segment sent from the Server. The Client replies with an ACK to confirm it received the shutdown notice, then sends its own FIN to indicate its side of the socket can be closed. The Server finally ACKs that FIN and, as the active closer, enters the TIME_WAIT state, holding the socket for 2MSL before releasing it.

Red box 1 shows both TCP endpoints closing at the same time (a simultaneous close); in that case, TIME_WAIT traces are left on both the Client and the Server.

Without the source code, it's all just talk

Since the server is the side doing the active close, let's go back to the source of Go's net/http server. The call chain is:

  • http.ListenAndServe(“:8081”)
  • server.ListenAndServe()
  • srv.Serve(ln)
  • go c.serve(connCtx), which handles each connection in its own goroutine

The abbreviated source of the server's connection-handling loop:

```go
func (c *conn) serve(ctx context.Context) {
	c.remoteAddr = c.rwc.RemoteAddr().String()
	ctx = context.WithValue(ctx, LocalAddrContextKey, c.rwc.LocalAddr())
	defer func() {
		if !c.hijacked() {
			c.close()
			c.setState(c.rwc, StateClosed, runHooks)
		}
	}()
	......
	// HTTP/1.x from here on.
	ctx, cancelCtx := context.WithCancel(ctx)
	c.cancelCtx = cancelCtx
	defer cancelCtx()

	c.r = &connReader{conn: c}
	c.bufr = newBufioReader(c.r)
	c.bufw = newBufioWriterSize(checkConnErrorWriter{c}, 4<<10)

	for {
		w, err := c.readRequest(ctx)
		if err != nil {
			switch {
			case err == errTooLarge:
				const publicErr = "431 Request Header Fields Too Large"
				fmt.Fprintf(c.rwc, "HTTP/1.1 "+publicErr+errorHeaders+publicErr)
				c.closeWriteAndWait()
				return
			case isUnsupportedTEError(err):
				code := StatusNotImplemented
				fmt.Fprintf(c.rwc, "HTTP/1.1 %d %s%sUnsupported transfer encoding", code, StatusText(code), errorHeaders)
				return
			case isCommonNetReadError(err):
				return // don't reply
			default:
				if v, ok := err.(statusError); ok {
					fmt.Fprintf(c.rwc, "HTTP/1.1 %d %s: %s%s%d %s: %s", v.code, StatusText(v.code), v.text, errorHeaders, v.code, StatusText(v.code), v.text)
					return
				}
				publicErr := "400 Bad Request"
				fmt.Fprintf(c.rwc, "HTTP/1.1 "+publicErr+errorHeaders+publicErr)
				return
			}
		}
		serverHandler{c.server}.ServeHTTP(w, w.req)
		w.cancelCtx()
		if c.hijacked() {
			return
		}
		w.finishRequest()
		if !w.shouldReuseConnection() {
			if w.requestBodyLimitHit || w.closedRequestBodyEarly() {
				c.closeWriteAndWait()
			}
			return
		}
		c.setState(c.rwc, StateIdle, runHooks)
		c.curReq.Store((*response)(nil))

		if !w.conn.server.doKeepAlives() {
			// We're in shutdown mode. We might've replied
			// to the user without "Connection: close" and
			// they might think they can send another
			// request, but such is life with HTTP/1.1.
			return
		}

		if d := c.server.idleTimeout(); d != 0 {
			c.rwc.SetReadDeadline(time.Now().Add(d))
			if _, err := c.bufr.Peek(4); err != nil {
				return
			}
		}
		c.rwc.SetReadDeadline(time.Time{})
	}
}
```

We need to focus on:

  1. The for loop, which tries to reuse the conn to handle requests arriving on the same connection
  2. When w.shouldReuseConnection() returns false (for example, the server read Connection: close from the Client, which sets closeAfterReply = true), the for loop exits, the goroutine is about to end, and the deferred function runs before it does, closing the connection:
```go
c.close()
......
// Close the connection.
func (c *conn) close() {
	c.finalFlush()
	c.rwc.Close()
}
```
  3. If w.shouldReuseConnection() returns true, the connection state is set to idle and the for loop continues, handling subsequent requests on the same connection.

What I learned

  1. The details of the TCP four-way close in practice
  2. The impact of short connections on the server: TIME_WAIT sockets occupy available ports; whether to switch to long connections should be decided by the actual workload
  3. A source-level analysis of how Go's net/http reuses TCP connections with keep-alive
  4. Packet-capture techniques with tcpdump