Delayed ACK

A few weeks ago, during a technical group meeting I hold here in town, a question came up that is now the basis of our next meeting: does every TCP segment get an ACK returned?

My unqualified answer to this question was “No”. I had troubleshot some connections before and noticed that not every received TCP segment was answered with an ACK. When challenged on this, I couldn’t give the technical background, only that this was my experience. Of course others in the group had different experiences, so the only way to settle it was to do the research.

After some research and lots of time sidetracked on other TCP concepts I finally found what I was looking for.

RFC 1122 – Transport Layer – TCP, Section 4.2.3.2: When to Send an ACK Segment

http://tools.ietf.org/html/rfc1122

To summarize this section: to increase efficiency, it is recommended that a receiver not send an ACK for every data segment it receives. The technical name for this behavior is “delayed ACK”. It does come with some strings attached. An ACK must be delayed by less than 0.5 seconds, and in a stream of full-sized segments there should be an ACK for at least every second segment.
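To make the rule concrete, here is a minimal Python sketch of a receiver that follows it. This is only a model for reasoning about timing, not how any real TCP stack is implemented. The function name `delayed_ack_times` and its parameters `ack_every` and `max_delay` are my own, made up for illustration.

```python
def delayed_ack_times(arrivals, ack_every=2, max_delay=0.2):
    """Return the times (in seconds) at which a modeled receiver sends ACKs.

    arrivals:  sorted arrival times of data segments, in seconds.
    ack_every: send an ACK once this many segments are unacknowledged.
    max_delay: longest an ACK may be held back (RFC 1122: under 0.5 s).
    """
    if not 0 <= max_delay < 0.5:
        raise ValueError("RFC 1122 requires the ACK delay to be under 0.5 s")
    if ack_every < 1:
        raise ValueError("ack_every must be at least 1")

    acks = []
    pending = 0       # segments received but not yet acknowledged
    deadline = None   # time at which the delay timer would fire

    for t in arrivals:
        # If the delay timer fired before this segment arrived,
        # the held-back ACK went out at the deadline.
        if pending and deadline <= t:
            acks.append(deadline)
            pending = 0

        pending += 1
        if pending == 1:
            # First unacknowledged segment starts the delay timer.
            deadline = t + max_delay

        if pending >= ack_every:
            # Enough segments are waiting: ACK immediately.
            acks.append(t)
            pending = 0

    if pending:
        # Nothing else arrived, so the timer eventually fires.
        acks.append(deadline)
    return acks


if __name__ == "__main__":
    burst = [i * 0.001 for i in range(8)]  # eight back-to-back segments
    print(len(delayed_ack_times(burst, ack_every=2)))  # one ACK per two segments
    print(len(delayed_ack_times(burst, ack_every=1)))  # one ACK per segment
```

In this model a burst of eight segments produces four ACKs with `ack_every=2` and eight with `ack_every=1`, which is the traffic trade-off the RFC is aiming at.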

On large networks this can substantially reduce network traffic, but in my experience it can cause other issues. In one particular case I worked on, the web servers had to be configured to override the default two-segment delayed ACK to maintain their TCP sessions with the load balancers. The network environment was a SaaS server farm with Windows web servers behind two NetScaler load balancers. By default, Windows Server sends an ACK for every second data segment or after 200 milliseconds, whichever comes first. For some reason this was not quick enough for the load balancers: they would send several segments with the PSH flag set in rapid succession, then a FIN to close the connection.

We had two options available.

1. Find the configuration setting on the load balancers to adjust the time frame for this connection.

or

2. Find the setting on the webserver and adjust it to ACK each packet.

We chose number 2 and added the following registry key.

Subkey: HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters\Interfaces\<Interface GUID>\TcpAckFrequency

http://msdn.microsoft.com/en-us/library/aa505957.aspx

http://support.microsoft.com/kb/328890
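For anyone who would rather script the change than edit the registry by hand, here is a rough Python sketch using the standard `winreg` module. It is not the exact procedure we used. It assumes that `TcpAckFrequency` is a REG_DWORD value and that setting it to 1 makes the server ACK every segment, as described in the KB article above. The helper names `interface_subkey` and `set_tcp_ack_frequency` are hypothetical, and the interface GUID has to be supplied by you.

```python
# Path under HKEY_LOCAL_MACHINE that holds the per-interface TCP/IP settings.
TCPIP_INTERFACES = (
    r"SYSTEM\CurrentControlSet\Services\Tcpip\Parameters\Interfaces"
)


def interface_subkey(interface_guid):
    """Build the registry subkey path for one network interface."""
    return TCPIP_INTERFACES + "\\" + interface_guid


def set_tcp_ack_frequency(interface_guid, frequency=1):
    """Write TcpAckFrequency for the given interface (Windows only).

    Assumption: frequency=1 means "ACK every segment", per the KB article.
    """
    import winreg  # Windows-only standard library module, imported lazily

    with winreg.OpenKey(
        winreg.HKEY_LOCAL_MACHINE,
        interface_subkey(interface_guid),
        0,
        winreg.KEY_SET_VALUE,
    ) as key:
        winreg.SetValueEx(
            key, "TcpAckFrequency", 0, winreg.REG_DWORD, int(frequency)
        )


if __name__ == "__main__":
    # Placeholder only: replace with the real GUID of the interface.
    print(interface_subkey("<Interface GUID>"))
```

Writing the value needs administrator rights, and the setting generally only takes effect after a restart, so test it on a single server first.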

This corrected our problem with the load balancers but slightly increased our network traffic. Since this traffic was localized, we accepted that solution, with the intent to find the root cause on the load balancers and make the change at that layer.

 
