Mixing TCP and Legacy Protocols

There is no doubt that IP is the dominant networking protocol, and that TCP is the dominant transport protocol. That is how it should be. However, some networks still contain other protocols that can disrupt TCP data flows. If you do not use UDP, NetBIOS, IPX, SPX, Banyan Vines, AppleTalk, SNA or 802.2 protocols to carry application payloads on your network, you may want to skip this section. Most likely you have a policy that bars these protocols from your network unless they are encapsulated within TCP. This is a good thing.

Performance issues

If you are still hanging onto legacy protocols, no doubt you have been mixing them with TCP on the same network. This often causes performance issues that are difficult to diagnose. Packet loss is high, and some TCP sessions may be very slow or may even time out. Even your legacy protocols may be unreliable. The problem is that TCP is designed to tune itself against other TCP streams, not against other protocols. Consider the following diagram:

Diagram showing TCP and IPX stacks connected to the same NIC which is connected at 1Gbps to a switch and the switch connected to another station at 56Kbps.

Notice that we have an IP protocol stack and an IPX stack. Both stacks are connected via a NIC to a 1 Gbps network. A programmer has the option of writing to the TCP, UDP, raw IP, SPX or raw IPX APIs. If the target requests a large block of data, each protocol will transmit it differently. Raw IP, UDP and IPX make no adjustment for the fact that the request may have come from a slower downstream link. These protocols will send the data at 1 Gbps and expect the downstream devices to buffer all of it without overruns. The application feels back pressure only while it waits for each packet to be transmitted; as soon as a packet is on the 1 Gbps network, the stack sends the next one.
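The effect of sending with no flow control can be illustrated with a toy model (the numbers and function are invented for illustration, not measurements): a fast sender offers packets to a bounded downstream buffer that drains at the slow link's rate, and everything that does not fit is dropped.

```python
# Toy model (not real sockets): a sender with no flow control blasts
# packets onto a fast link, while a slow downstream hop drains a small
# bounded buffer. Packets that arrive when the buffer is full are lost.
def flood_slow_link(packets_sent, buffer_slots, drain_per_tick, send_per_tick):
    queued = 0
    delivered = dropped = 0
    remaining = packets_sent
    while remaining or queued:
        burst = min(send_per_tick, remaining)   # sender offers a burst
        remaining -= burst
        space = buffer_slots - queued
        accepted = min(burst, space)            # buffer takes what fits
        dropped += burst - accepted             # the rest is lost
        queued += accepted
        out = min(drain_per_tick, queued)       # slow link forwards slowly
        queued -= out
        delivered += out
    return delivered, dropped

# 1 Gbps vs 56 Kbps is roughly 18,000:1, so per "tick" the sender can
# offer thousands of packets while the slow hop forwards only one.
delivered, dropped = flood_slow_link(
    packets_sent=10_000, buffer_slots=64, drain_per_tick=1, send_per_tick=1_000)
print(delivered, dropped)  # the vast majority of packets are dropped
```

Almost all of the offered load is lost at the slow hop; only a receive or transmit window on the sender could have prevented this.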

TCP and SPX (and others) have a receive window which prevents the sender from flooding the receiving endpoint. The size of the receive window is typically between 8 Kbytes and 16 Kbytes. Once this amount of data is on the wire without being acknowledged, transmission is paused, which applies back pressure to the application. The receive window values and associated sequence numbers are visible in a protocol analyzer trace.
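A minimal sketch of that pause-and-resume behavior (the function, segment size and window value are illustrative assumptions, not taken from any particular stack): the sender stops whenever another segment would push the unacknowledged data past the receive window.

```python
# Toy sketch of receive-window flow control: the sender may keep at
# most `rwnd` unacknowledged bytes on the wire; once that limit would
# be exceeded, it must pause until acknowledgements arrive.
def window_limited_send(total_bytes, rwnd, segment=1460):
    in_flight = 0
    sent = 0
    pauses = 0
    while sent < total_bytes:
        if in_flight + segment > rwnd:
            pauses += 1       # back pressure: wait for an ACK
            in_flight = 0     # simplification: everything in flight ACKed at once
            continue
        sent += segment
        in_flight += segment
    return pauses

# Sending 45 segments against a 6-segment (8,760-byte) window forces a
# pause after every 6 segments.
print(window_limited_send(45 * 1460, 8760))
```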

Tuning networks based on receive windows does not scale, because only the receiving station controls how big the receive window is. If a router in the middle of the network is forwarding traffic for 1,000 sockets and is getting into trouble, it cannot adjust the receive windows to prevent saturation. In essence, the receive window mainly protects the interface between the transport API and the receiving application, but does little to protect the network.

TCP’s transmit window

TCP is unique in its implementation of an additional algorithm to protect the network. Commonly known as "slow start", this algorithm employs a transmit window (the congestion window) that is maintained on the sending station. Unlike the receive window, this window is not directly visible on a protocol analyzer. The transmit window starts off small, restricting the first transmission to a single packet. If that packet is acknowledged, the window grows to 2 packets, then 4, then 8 (depending on implementation), and continues to grow until one of two things happens. If the transmit window grows larger than the receive window, the socket becomes controlled by the receive window. If, instead, an acknowledgement is lost, the sender assumes there was congestion in the middle of the network and cuts the transmit window in half. By slowing down, the socket avoids flooding the network further.
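The growth-then-halving pattern can be sketched as a toy simulation (the function name, window cap and doubling-per-round-trip growth are simplifying assumptions; real stacks vary):

```python
# Toy slow-start sketch: the congestion (transmit) window starts at one
# segment, doubles each round trip while ACKs arrive, is halved when a
# loss is detected, and is always capped by the receive window.
def slow_start(rounds_with_loss, total_rounds, rwnd_segments=44):
    cwnd = 1
    history = []
    for rtt in range(total_rounds):
        history.append(cwnd)
        if rtt in rounds_with_loss:
            cwnd = max(1, cwnd // 2)             # lost ACK: assume congestion
        else:
            cwnd = min(cwnd * 2, rwnd_segments)  # growth, capped by rwnd
    return history

# One loss in round 5 cuts the window from 32 back to 16 segments.
print(slow_start(rounds_with_loss={5}, total_rounds=8))
```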

TCP’s implementation of the transmit window, first introduced in 1988, is very significant. Before then, networks seemed to break down when they grew too large; IPX networks, for example, rarely grew beyond 20,000 stations. Slow start is what allows millions of people to be connected via one network, the Internet.

Mixing multiple protocols

Now, what happens if we mix multiple protocols on the same network? For example, suppose we implemented NFS over UDP rather than TCP. When TCP and UDP meet on the wire, UDP will not be as courteous as TCP. UDP transmits regardless of the congestion in the middle of the network, while TCP holds back when it sees the congestion UDP causes. In effect, UDP blocks TCP. In a mixed protocol environment, we commonly observe traffic blocked for periods of 5 to 8 seconds. Packet loss is very high, and users complain. Stations with faster NIC cards tend to do the most damage.
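This asymmetry can be sketched with another toy model (the function and rate units are invented for illustration): UDP offers a fixed load no matter what, while the TCP flow halves its rate on any loss and grows gently otherwise, so TCP ends up starved on the shared link.

```python
# Toy model of a shared bottleneck: UDP offers a constant load, while
# the TCP flow halves its rate whenever the link is oversubscribed
# (loss) and grows by one unit per round otherwise (AIMD-style).
def share_link(capacity, udp_rate, rounds):
    tcp_rate = 1
    tcp_history = []
    for _ in range(rounds):
        tcp_history.append(tcp_rate)
        if udp_rate + tcp_rate > capacity:   # congestion: only TCP backs off
            tcp_rate = max(1, tcp_rate // 2)
        else:                                # headroom: TCP grows
            tcp_rate += 1
    return tcp_history

# On a 10-unit link, an 8-unit UDP flood pins TCP between 1 and 3 units
# forever: TCP keeps backing off while UDP never yields.
print(share_link(capacity=10, udp_rate=8, rounds=9))
```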

Here are some recommendations:

  • Encapsulate NetBIOS in TCP. On Microsoft platforms, this is done by installing only the TCP/IP protocol and removing any references to NetBIOS or NetBEUI.
  • If using UDP, send only single datagrams (such as DNS queries), or employ QoS to prevent flooding. Today’s VoIP implementations employ QoS.
  • Convert Novell networks from IPX to TCP.
  • Disable legacy protocols on workstations, routers and printers.
  • Monitor networks for rogue protocols. Often IPX is found when users attempt to play games across your network.
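On the QoS point, one common building block is marking UDP traffic with a DSCP value so that routers configured for QoS can prioritize it. A minimal sketch using the standard socket API (0xB8 is the Expedited Forwarding DSCP, 46, shifted into the IP TOS byte; whether the mark is honored depends entirely on how the network is configured, and some platforms may clamp or ignore the value):

```python
import socket

# Mark a UDP socket's traffic with the Expedited Forwarding DSCP (46),
# as VoIP stacks commonly do. 0xB8 == 46 << 2 in the IP TOS byte.
sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sock.setsockopt(socket.IPPROTO_IP, socket.IP_TOS, 0xB8)

# Read the option back to confirm it took effect (works on Linux).
tos = sock.getsockopt(socket.IPPROTO_IP, socket.IP_TOS)
print(hex(tos))
sock.close()
```

Marking packets does nothing by itself; the routers along the path must be configured to map that DSCP into a priority queue.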