Advanced Analysis "Excessive packet round-trip time (RTT) detected"

The "Excessive packet round-trip time (RTT) detected" diagnostic indicates that a packet has survived an unreasonably long time given the characteristics of the network path. For example, a packet that survives for 3 seconds on a small LAN would trigger this diagnostic message. This condition usually indicates intermittent media errors or an over-queued network. Symptoms are unnecessary traffic duplication and confusing TCP windowing behavior, which in turn result in slow response times for the end user.

Servers and queues

To understand how to tune a network that exhibits excessive RTT, one must understand the effects of queues on the network. To do this, we will look at a simple queuing model, using terms common in networking queuing theory textbooks. In its simplest form, a gateway, router, or switch port can be represented as two components: a server and a queue.

Diagram showing packets traversing a router. The router is made up of two components: queue and server.

The "server" is a processor that puts bits onto the downstream wire. The "queue" is the memory that holds packets waiting to be processed by the server. If the queue is full when a packet arrives, a packet is lost: depending on the device's drop policy, either the incoming packet is discarded or the packet at the front of the queue is dropped to accommodate it.
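The server-and-queue model can be sketched in a few lines of Python. This is only an illustrative toy, not a description of any real device: the `Port` class, its method names, and the capacity of three packets are all hypothetical, and it assumes the front-drop policy mentioned above.

```python
from collections import deque


class Port:
    """Toy model of one router/switch port: a bounded queue feeding a server."""

    def __init__(self, capacity_packets):
        self.capacity = capacity_packets  # assumed queue size, in packets
        self.queue = deque()
        self.dropped = []

    def arrive(self, packet):
        # Queue full: drop the packet at the front to make room
        # (front-drop is an assumption; other drop policies exist).
        if len(self.queue) >= self.capacity:
            self.dropped.append(self.queue.popleft())
        self.queue.append(packet)

    def serve(self):
        # The server transmits the packet at the head of the queue, if any.
        return self.queue.popleft() if self.queue else None


port = Port(capacity_packets=3)
for packet_id in range(5):  # five arrivals with no service in between
    port.arrive(packet_id)

print("waiting:", list(port.queue))
print("dropped:", port.dropped)
```

With five arrivals and room for only three, the two oldest packets are lost and the three newest remain queued for the server.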

In older switches and routers, queues have been known to be very large (we have measured up to 8 Mbit). However, TCP flows do not work well if queues are too large. In modern switches and routers, queuing systems vary greatly between manufacturers, but in general queue sizes are limited to about 8 to 64 Kbit, which results in proper TCP flow control.
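To see why queue size matters, it helps to convert queue size into time: a packet arriving at the back of a full queue waits roughly the queue size divided by the link rate. The sketch below does that arithmetic. The 10 Mbit/s link rate is an assumption chosen for illustration (the text gives no link speed), as is treating 1 Kbit as 1,000 bits; the function name is hypothetical.

```python
def max_queue_delay_seconds(queue_bits, link_bits_per_second):
    """Worst-case wait for a packet that arrives at the back of a full queue."""
    return queue_bits / link_bits_per_second


LINK_RATE = 10_000_000  # assumed 10 Mbit/s link; not a figure from the text

for label, queue_bits in [("8 Mbit", 8_000_000), ("64 Kbit", 64_000), ("8 Kbit", 8_000)]:
    delay = max_queue_delay_seconds(queue_bits, LINK_RATE)
    print(f"{label} queue: up to {delay * 1000:.1f} ms of queuing delay")
```

Under these assumptions a full 8 Mbit queue can hold a packet for most of a second at a single hop, while a full 64 Kbit queue adds only a few milliseconds.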

Inefficient traffic flows

In a properly tuned network, congestion results in packet loss. TCP flows will respond to packet loss by backing off transmission rates. However, packets that survive long after they have been given up for dead make a mess of TCP flows. As a network is driven close to a point of congestion, you may find that data is repeatedly retransmitted, and TCP windows will start to slide back and forth. The overall result is very inefficient traffic flows.
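The duplication effect can be illustrated with a deliberately simplified retransmission model. The sketch assumes a sender that starts with a 1-second retransmission timeout and doubles it after every retry, and a network that delays packets but never drops them. Real TCP stacks adapt the timeout to measured round-trip times, so treat the function name, parameters, and numbers as hypothetical.

```python
def copies_sent(round_trip_seconds, initial_rto_seconds=1.0, max_copies=6):
    """Count the copies of one segment sent before its first acknowledgement
    arrives, assuming the retransmission timeout doubles after each retry."""
    copies = 1
    timeout = initial_rto_seconds
    deadline = timeout  # time at which the first timer expires
    while deadline < round_trip_seconds and copies < max_copies:
        copies += 1     # timer fired before the ACK came back: retransmit
        timeout *= 2    # exponential backoff
        deadline += timeout
    return copies


for rtt in (0.002, 3.5):
    sent = copies_sent(rtt)
    # Nothing is lost in this model, so every extra copy is a duplicate.
    print(f"RTT {rtt} s: {sent} copies sent, {sent - 1} duplicate(s) delivered")
```

With a 2 ms round trip the segment is sent once. With a 3.5-second round trip the timer expires twice before the acknowledgement returns, so the receiver eventually gets three copies of the same data.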

The "Excessive packet round-trip time (RTT) detected" diagnostic indicates that your network is "queuing" (holding onto) packets for an unreasonable amount of time, and that this condition may be affecting the efficiency of your traffic flows. One of three network situations can lead to this condition: media errors that stop queues, router hangs that stop queues, or excessive queue sizes, where packets must pass through queues that are too big.

Diagram showing four routers connected sequentially. The second router has a stopped queue caused by intermittent media errors downstream.

Stopped queues

Normally, once a packet is in a queue, the queue moves at a steady rate as it gets serviced. However, a downstream intermittent media error can cause a queue to stop, delaying the packets in it. In the above scenario, Delivery monitoring tests to hops 1 and 2 will have normal results. However, tests to hop 3 or hop 4 must pass through the downstream media errors, and in that case "Excessive packet round-trip time (RTT) detected" may be triggered.

Less commonly, queues are stopped because of device failures, in which case a reboot of the router or switch may be in order. In the above example, you would reboot hop 2.
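The hop-by-hop reasoning above can be expressed as a small helper. This is a sketch of the reasoning only, not the product's detection logic: the function name, the 1-second threshold, and the RTT samples are made up for illustration.

```python
def first_slow_hop(hop_rtts_ms, threshold_ms):
    """Return the 1-based number of the first hop whose RTT exceeds the
    threshold, or None if every hop is within it."""
    for hop_number, rtt_ms in enumerate(hop_rtts_ms, start=1):
        if rtt_ms > threshold_ms:
            return hop_number
    return None


# Hypothetical RTTs (ms) measured to hops 1 through 4 of the example path.
rtts_ms = [2.0, 3.0, 3100.0, 3150.0]

slow_hop = first_slow_hop(rtts_ms, threshold_ms=1000.0)
if slow_hop is None:
    print("No hop exceeds the threshold")
elif slow_hop == 1:
    print("Delay appears before or at hop 1")
else:
    # Tests to earlier hops are normal, so the stalled queue is most likely
    # on the last healthy hop, whose output feeds the faulty link.
    print(f"First slow hop: {slow_hop}; suspect the queue on hop {slow_hop - 1}")
```

For the sample values, tests to hops 1 and 2 look normal and hop 3 is the first slow one, which points at the queue on hop 2, matching the example.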

Queues that are too big

This diagnostic can be triggered if queues are simply too big. For example, some old switches contained huge queues as illustrated below.

Diagram showing three routers connected sequentially. The second router has a very large filled queue resulting in a long delay for packets going through it.

Similarly, huge queues can be encountered if too many queues are used in series. For example, Frame Relay Vendor A has 4 switches in its cloud connection between New York and Chicago, while Frame Relay Vendor B may use 80 switches. Vendor B’s network may exhibit "Excessive packet round-trip time (RTT) detected" during busy periods, as illustrated below.

Diagram showing many routers connected sequentially. Delays are caused by too many hops between the source and destination.
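A rough calculation shows how modest per-hop queues add up in series. The sketch assumes every switch has a full 64 Kbit queue feeding a 1.544 Mbit/s (T1) link. Both figures are assumptions picked for illustration, not measurements of either vendor, and the function name is hypothetical.

```python
def worst_case_path_delay(hops, queue_bits_per_hop, link_bits_per_second):
    """Total queuing delay if a packet meets a full queue at every hop."""
    return hops * queue_bits_per_hop / link_bits_per_second


QUEUE_BITS = 64_000   # assumed per-switch queue size
LINK_RATE = 1_544_000  # assumed T1 link rate in bits per second

for vendor, hops in [("Vendor A", 4), ("Vendor B", 80)]:
    delay = worst_case_path_delay(hops, QUEUE_BITS, LINK_RATE)
    print(f"{vendor}: {hops} switches, up to {delay:.2f} s of queuing delay")
```

Under these assumptions the 4-switch path stays well under a quarter of a second even when every queue is full, while the 80-switch path can exceed 3 seconds during busy periods.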

In conclusion

Consider that TCP must recover from data that is damaged, lost, duplicated, or delivered out of order across the path. The least efficient of these is duplicated data. "Excessive packet round-trip time (RTT) detected" is a diagnostic that indicates that your network could be duplicating network traffic during busy periods, which in turn can lead to a traffic snowball effect. When queue time (and therefore round-trip time) is reduced, congestion produces packet loss instead of long delays, and traffic duplication is avoided. Simply put, when packets are dropped sooner, TCP flows work more efficiently.