Saturday, August 16, 2008

Lots of TIME_WAIT in Your Netstat

My colleague manages a pretty busy web server on behalf of his customer. He noticed that a lot of connections in his "netstat" output are in the TIME_WAIT state. When you visit a web site with a browser, by default it will initiate 2 concurrent connections to the web server to download all the dependencies (images, css, js, swf, ...). The connections are also kept alive for a while (see the HTTP header 'Connection: Keep-Alive'). If you stop surfing the web site, the web server will eventually close the two connections. In TCP terminology, the web server is doing an "active close". In this case, the connection on the web server moves from ESTABLISHED through FIN_WAIT_1 and FIN_WAIT_2 to TIME_WAIT. By default, it stays in TIME_WAIT for 2*MSL (maximum segment lifetime) before the connection is released and the same pair of addresses and ports can be reused. In older versions of Solaris, 2*MSL is set to 240 seconds. In Solaris 10, it is set to 60 seconds.
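To see how many connections are sitting in each state, you can tally the last column of the netstat output. This is only a rough sketch: the function name count_tcp_states is my own, and it assumes the seven-column layout that 'netstat -n -P tcp' prints on Solaris 10, with the state as the last field.

```shell
# Tally TCP connections by state.
# Reads `netstat -n -P tcp` style output on stdin (assumed layout:
# local, remote, Swind, Send-Q, Rwind, Recv-Q, State).
count_tcp_states() {
    # Keep only data rows: at least 7 fields and an upper-case state name.
    # Headings, separator lines and the "TCP: IPv4" banner are skipped.
    LC_ALL=C awk 'NF >= 7 && $NF ~ /^[A-Z_0-9]+$/ { n[$NF]++ }
                  END { for (s in n) print s, n[s] }' | LC_ALL=C sort
}

# On the web server (not run here):
#   netstat -n -P tcp | count_tcp_states
```

A large TIME_WAIT count next to a modest ESTABLISHED count is what you would expect on a busy web server that closes idle keep-alive connections itself.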

If the environment is within a LAN (e.g., an app server or db server in a 3-tier architecture), you can set the time-wait interval to something lower than 60 seconds.

$ uname -a
SunOS your-web-server 5.10 Generic_118822-11 sun4u sparc SUNW,Sun-Fire-V440

$ ndd /dev/tcp tcp_time_wait_interval
60000
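The value is in milliseconds, so 60000 means 60 seconds. Lowering it is done with 'ndd -set' as root. Here is a small sketch that only prints the command for a given number of seconds so you can review it first. The helper name time_wait_cmd is made up for illustration, and as far as I know ndd settings do not survive a reboot, so you would need to reapply them from a startup script.

```shell
# Print (but do not run) the ndd command that sets the TIME_WAIT
# interval to the given number of seconds.
time_wait_cmd() {
    # ndd expects milliseconds; expr keeps this usable in older Bourne shells.
    ms=`expr "$1" \* 1000`
    echo "ndd -set /dev/tcp tcp_time_wait_interval $ms"
}

time_wait_cmd 30
# prints: ndd -set /dev/tcp tcp_time_wait_interval 30000
```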


[Figure: TCP state diagram, extracted from an external source]

You can simulate this from the command line too. With 'Connection: Keep-Alive', you will see that after the web server serves you the HTML file, it stays connected until it times out. If you go to the web server, you will see that the connection is in the ESTABLISHED state, followed by TIME_WAIT once it is closed. If you wait for a while (in my case, 60 seconds), the connection is gone.

client$ telnet your-web-server 80
Trying 203.166.136.32...
Connected to your-web-server.
Escape character is '^]'.
GET / HTTP/1.1
Host: your-web-server
Connection: Keep-Alive

HTTP/1.1 200 OK
...
...
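If you do not want to type the request by hand each time, you can generate it with printf and pipe it into telnet. This is a sketch under a few assumptions: http_keepalive_request is a name I made up, the sleep is there only to keep stdin open so telnet does not exit before the server replies, and telnet reading from a pipe may behave differently on your platform.

```shell
# Emit a minimal HTTP/1.1 keep-alive request for the given host,
# with the CRLF line endings that HTTP expects.
http_keepalive_request() {
    printf 'GET / HTTP/1.1\r\nHost: %s\r\nConnection: Keep-Alive\r\n\r\n' "$1"
}

# From the client (not run here):
#   ( http_keepalive_request your-web-server; sleep 90 ) | telnet your-web-server 80
```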


your-web-server$ netstat -n -P tcp

TCP: IPv4
   Local Address        Remote Address    Swind Send-Q Rwind Recv-Q  State
-------------------- -------------------- ----- ------ ----- ------ -------
...
10.0.11.195.80       xxx.xxx.xxx.xxx.60601   17640      0 50400      0 ESTABLISHED
...


your-web-server$ netstat -n -P tcp

TCP: IPv4
   Local Address        Remote Address    Swind Send-Q Rwind Recv-Q  State
-------------------- -------------------- ----- ------ ----- ------ -------
...
10.0.11.195.80       xxx.xxx.xxx.xxx.60601   17640      0 50400      0 TIME_WAIT
...


your-web-server$ netstat -n -P tcp

TCP: IPv4
   Local Address        Remote Address    Swind Send-Q Rwind Recv-Q  State
-------------------- -------------------- ----- ------ ----- ------ -------
...
...
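Instead of rerunning netstat by hand, you can watch a single connection move through the three snapshots above. A minimal sketch, assuming the Solaris 'address.port' notation: conn_state is a hypothetical helper, and 60601 is the client port from my example, so yours will differ. The match is a plain grep, so it can also pick up a local port with the same number.

```shell
# Print the state of every connection whose address ends in the given port.
# Reads `netstat -n -P tcp` style output on stdin.
conn_state() {
    # ".PORT " matches the end of an address field; the state is the last field.
    grep "\.$1 " | awk '{ print $NF }'
}

# On the web server (not run here), poll every 5 seconds:
#   while :; do netstat -n -P tcp | conn_state 60601; sleep 5; done
```

You should see ESTABLISHED, then TIME_WAIT, and finally no output once the time-wait interval has passed.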

Happy TIME_WAIT-ing.... :-)

4 Comments:

Blogger surajz said...

Thanks, excellent post. This was exactly what I was looking for. How do you set TIME_WAIT in SunOS?

12:02 AM  
Blogger chihungchan said...

To set it to 30 seconds (30,000 milliseconds), you need to log in as root and run this:

ndd -set /dev/tcp tcp_time_wait_interval 30000

This works for Solaris. I am not sure which version of SunOS you are talking about.

8:20 AM  
Blogger surajz said...

Thanks.

We are using Solaris 9.

Is there a way to limit the maximum time a socket is open? For example, I want to cut off JSP pages that take longer than 2 minutes to load. I have tried passing the flag -Dsun.net.client.defaultReadTimeout=120000 to the application server, but it does not work.

11:47 PM  
Blogger chihungchan said...

According to the documentation, "sun.net.client.defaultReadTimeout specifies the timeout (in milliseconds) when reading from input stream when a connection is established to a resource", so it should do the job. I don't think there is anything else in the OS that can control the timeout.

2:30 PM  
