Maximum Transmission Unit

One of the most powerful yet subtle capabilities of Delivery monitoring is its ability to discover Maximum Transmission Unit (MTU) conditions and problems on a link. MTU misalignments can result in network idiosyncrasies and degradations that are extremely difficult to diagnose. Delivery monitoring checks for several significant MTU error conditions for each link within a network path. Passing these checks ensures that MTU works properly along the entire path.

What is MTU?

When Internet Protocol is used to transfer data across a path, data is encapsulated into packets before it leaves the physical interface. If the application is sending a small chunk of data, for example 500 bytes, it will usually be sent within a single packet. However, when the application must send a larger chunk of data, the data must be distributed over several packets. The MTU of the medium determines the maximum size of packet that can be transmitted without fragmentation. Network links with properly configured MTUs are more efficient. The internet typically operates with an MTU of 1500 bytes, but other values are acceptable. Mixing different MTU values within one network path is also acceptable, provided that every device on the path follows the same rules for resolving the conflicting MTUs. As a general rule, fragmentation should be avoided.
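The packet arithmetic above can be sketched in a few lines of Python. This is an illustrative model, not a real IP stack; it assumes a 20-byte IPv4 header with no options, and the standard rule that non-final fragment payloads align to 8-byte boundaries:

```python
# Illustrative sketch (not a real network stack): how many IPv4 packets
# a payload needs at a given MTU, assuming a 20-byte header and no options.
IP_HEADER = 20

def fragments_needed(payload: int, mtu: int) -> int:
    """Count the packets required to carry `payload` bytes at `mtu`."""
    max_data = mtu - IP_HEADER      # data bytes per packet
    max_data -= max_data % 8        # non-final fragments align to 8 bytes
    return -(-payload // max_data)  # ceiling division

print(fragments_needed(500, 1500))   # a small write fits in one packet -> 1
print(fragments_needed(9000, 1500))  # a larger write spans several -> 7
```

At a 1500-byte MTU, each packet carries at most 1480 data bytes, so a 500-byte write fits in one packet while a 9000-byte write needs seven.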

To understand MTU, one must be aware of the difference between frames and packets. Frames are generated by Layer-2 devices and encapsulate Layer-3 packets. The following diagram shows a 1500-byte TCP/IP packet passing through Ethernet. Notice that although Ethernet supports 1518-byte frames, it is designed to carry at most 1500-byte packets; therefore, the MTU of Ethernet is 1500 bytes.

Diagram showing the breakdown of TCP/IP over Ethernet with the maximum supported frame size.

Several RFCs define MTU values. One significant RFC is RFC 1191, which defines Path MTU Discovery. Because different RFCs apply to different media, the maximum packet size allowed on a network without fragmentation depends on the types of network connections involved.

RFC #        Description                   MTU (bytes)
894          Minimally required            68
1051         ARCNet                        508
879, 1356    X.25, ISDN                    576
1055         Serial Line IP (SLIP)         1006
1042, 2516   IEEE 802.3/802.2, PPPoE       1492
894, 895     Ethernet                      1500
1390         FDDI                          4352
1042         4 Mbit Token Ring             4464
1042         802.4 Token Bus               8166
-            IBM 16 Mbit Token Ring        17914
1374         HIPPI                         65535
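The table above can be expressed as a simple lookup, which is handy for spotting a nonstandard MTU reported by a path test. This is an illustrative sketch; the dictionary below just mirrors the table:

```python
# Standard MTU values from the table above, keyed by size in bytes.
STANDARD_MTUS = {
    68: "Minimally required", 508: "ARCNet", 576: "X.25, ISDN",
    1006: "Serial Line IP (SLIP)", 1492: "IEEE 802.3/802.2, PPPoE",
    1500: "Ethernet", 4352: "FDDI", 4464: "4 Mbit Token Ring",
    8166: "802.4 Token Bus", 17914: "IBM 16 Mbit Token Ring",
    65535: "HIPPI",
}

def classify(mtu: int) -> str:
    """Name the medium for a standard MTU, or flag it as nonstandard."""
    return STANDARD_MTUS.get(mtu, "nonstandard MTU")

print(classify(1500))  # Ethernet
print(classify(9000))  # nonstandard MTU (9000 is a common jumbo-frame choice)
```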

Why Use Large MTUs?

Increasing frame size improves performance. To understand why, examine the following results from Advanced Test Engineering and Measurement: performance measurements taken across the Abilene and CA*net4 backbones using Delivery monitoring between various universities, at various MTU settings. In theory, increasing MTU should have only a minor effect on a network's maximum performance. In practice, however, most high-speed networks do not run at full capacity, and for these networks increasing MTU has been shown to greatly increase overall throughput.

Graph of GigE maximum achievable 2-way bandwidth (y-axis) versus frame size from Kansas City to various universities. In general, bandwidth increases as frame size increases but levels off short of the theoretical maximum.

Understanding MTU Conflicts

MTU negotiation errors are difficult to detect and manifest themselves in subtle but destructive behaviors. When an MTU conflict exists, MTU negotiation fails and packets are lost whenever they are too large to traverse the network. Because of the nature of MTU conflicts, traffic typically fails in one direction but not in the other. What is often misunderstood is that an MTU conflict typically results in slow network connections, not broken connections. Here are some real-life symptoms that were eventually attributed to MTU negotiation conflicts:

  • FTP upload of a large file takes a few minutes, but download takes hours, or even fails.
  • FTP downloads work only with small files.
  • Web pages load quickly, but .GIF files are very slow to load, and occasionally fail.
  • Email works, but discussion databases take forever to load.
  • Normal server to client traffic is quick, but client backups fail.

To understand why MTU conflicts often result in slow links, consider the following client-server TCP flow. It illustrates a typical environment where a Gigabit server configured with a 9000-byte MTU is incorrectly connected via a Layer 2 switch to a 10/100 client using a 1500-byte MTU. In this example, a black-hole condition exists for traffic flowing from the server to the client. The server receives a request from the client, and a 9000-byte packet is generated. The Layer 2 switch accepts the packet, but drops it once it discovers that the packet is too large to send to the client. Since Layer 2 switches have no knowledge of Layer 3 content, they cannot inspect the DF bit in the IP header, nor can they generate a response telling the server why the packet was dropped. As far as the server is concerned, the packet was lost due to congestion.

TCP contains a slow-start congestion avoidance algorithm that shrinks the TCP transmit window to half its size when a packet times out. In the case of a black-hole hop, the result is a retransmitted packet half the size of its predecessor. In this example, three packets are lost before the TCP transmit window produces a 1125-byte packet. This packet does survive, which produces a response from the client and in turn keeps the connection alive. Eventually the connection settles into a cycle of sending a 2250-byte packet that is lost, waiting a timeout period, and then sending a 1125-byte packet that is received. Overall, the connection does not die; rather, it runs very slowly.
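The halving cycle described above can be modeled in a few lines. This is a toy sketch under the assumption stated in the text (the send size halves on each timeout), not a faithful TCP implementation:

```python
# Toy model of a black-hole hop: a Layer 2 switch silently drops any packet
# larger than the client's 1500-byte path MTU, so the server halves its
# send size after each timeout until a packet fits.
PATH_MTU = 1500

def sizes_until_delivery(first_size: int) -> list:
    """Return the sequence of packet sizes tried, ending with the survivor."""
    sizes = [first_size]
    while sizes[-1] > PATH_MTU:       # too big -> silently dropped, timeout
        sizes.append(sizes[-1] // 2)  # retransmit at half the size
    return sizes

attempts = sizes_until_delivery(9000)
print(attempts)             # [9000, 4500, 2250, 1125]
print(len(attempts) - 1)    # three packets lost before one gets through
```

This reproduces the behavior in the example: the 9000-, 4500-, and 2250-byte packets vanish into the black hole, and only the 1125-byte retransmission survives.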

Diagram showing communications between a client and a server through a layer 2 switch over time trying various MTUs.

To avoid MTU conflicts, you must ensure that MTU boundaries fall on Layer 3 devices (routers) rather than Layer 2 devices, and that you do not filter out ICMP messages. This ensures that Path MTU Discovery (PMTU) works as described in RFC 1191. The following diagram illustrates how PMTU discovery works.

Diagram showing communications between a client and a server through a router over time.

Most modern operating systems and applications are PMTU enabled, and therefore set the "Don't Fragment" (DF) bit in all IP headers. When the 9000-byte packet reaches the router, fragmentation is not attempted. Instead, the router returns an ICMP "fragmentation required but DF set" message, also known as the "too big" message. When the server receives this ICMP message, it updates its routing table entry for the client with the MTU reported in the message, and remembers to send smaller packets to that client. Note that once the path MTU to the client has been discovered, it is not renegotiated on subsequent connections.
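The exchange above can be sketched as a simplified model of RFC 1191 Path MTU Discovery. The names here (`router_forward`, `discover_pmtu`) are illustrative, not a real API; the model has a single constricting hop that returns its next-hop MTU in the "too big" message:

```python
# Simplified model of RFC 1191 Path MTU Discovery: the sender transmits
# with DF set, and a router whose next-hop MTU is too small answers with
# an ICMP "fragmentation needed and DF set" message carrying that MTU.
ROUTER_NEXT_HOP_MTU = 1500      # the constricting hop on the path

def router_forward(packet_size: int):
    """Return None on success, or the MTU from an ICMP 'too big' message."""
    if packet_size > ROUTER_NEXT_HOP_MTU:
        return ROUTER_NEXT_HOP_MTU   # ICMP type 3, code 4, with next-hop MTU
    return None                      # packet forwarded normally

def discover_pmtu(initial_mtu: int) -> int:
    """Send DF packets, shrinking to each reported MTU until one fits."""
    mtu = initial_mtu
    reported = router_forward(mtu)
    while reported is not None:
        mtu = reported               # update the per-destination route cache
        reported = router_forward(mtu)
    return mtu

print(discover_pmtu(9000))   # the server learns to send 1500-byte packets
```

With a black-hole hop, the `router_forward` step would return nothing at all on oversized packets, and the loop above could never converge; that silence is exactly what breaks PMTU discovery.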

In complex environments it is easy to forget to place Layer 3 routers on MTU boundaries. To avoid problems, we suggest maintaining MTU values on logical diagrams outlining the rules for Layer 3 subnets. Doing so establishes rule sets for portions of networks regardless of how they are physically connected. Here is an example.

Diagram showing two systems connected via a Layer 3 router. One system has its MTU set to 9000; the other has its MTU set to 1500.

Another common cause of black-hole hops is confusion between frame size and MTU. The following diagram illustrates a breakdown of a typical Ethernet frame.

Diagram showing the breakdown of an Ethernet frame and how they relate to the OSI layers.

Notice that at Layer 2, a typical Ethernet frame has a maximum size of 1518 bytes. At Layer 3 we deal with packets, and in this example the MTU represents the maximum packet size of 1500 bytes. It is important to understand that the difference between the maximum packet size and the Ethernet frame size is 18 bytes. Therefore, if you want to set up a GigE connection with a 9000-byte MTU, you must set the frame size of the NIC to 9018 bytes. When the cause of a black-hole condition is a reduced MTU at the destination hop, Maximum Segment Size (MSS) negotiation can protect TCP from failure. However, many black-hole hops are caused by incorrectly configured mid-path Layer 2 devices, in which case MSS negotiation is ineffective. Furthermore, in many scenarios MSS negotiation is simply ignored.
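The 18-byte gap between frame size and MTU is simple arithmetic, sketched below. The constants follow the standard untagged Ethernet layout (a 14-byte header and a 4-byte frame check sequence around the Layer 3 packet); VLAN tagging, not modeled here, would add 4 more bytes:

```python
# Ethernet framing overhead around a Layer 3 packet (untagged frames).
ETH_HEADER = 14   # 6-byte dst MAC + 6-byte src MAC + 2-byte EtherType
ETH_FCS = 4       # frame check sequence (CRC)

def frame_size(mtu: int) -> int:
    """Maximum frame size a NIC needs to carry packets of `mtu` bytes."""
    return mtu + ETH_HEADER + ETH_FCS

print(frame_size(1500))  # 1518: standard Ethernet frame
print(frame_size(9000))  # 9018: NIC frame-size setting for a 9000-byte MTU
```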

Avoiding MTU Conflicts

To avoid MTU problems, consider the following:

  • Use one common MTU per subnet.
  • Separate different MTUs with Layer 3 routers. Avoid Layer 2 FDDI/Token Ring/Ethernet bridges.
  • Do not filter ICMP packets.
  • Use reputable VPN solutions.
  • Test network paths from high MTU to low MTU (appliance on the large-MTU end).
  • Maintain logical diagrams to establish rule sets for MTUs, and to ensure that routers separate MTU domains.
  • Avoid adjusting the MTU of clients to compensate for network issues.
  • Ensure server and network personnel have established common MTU policies.

The Delivery monitoring Approach

MTU negotiation is an important part of the overall health of a network. During a test, Delivery monitoring reports the following MTU conditions detected along the test path:

  • The (apparent measurable) PMTU.
  • Nonstandard MTUs in use.
  • Standard MTUs in use, other than 1500 bytes (Ethernet).
  • "Black-hole" hop, where a router fails to send the "fragmentation needed and DF set" ICMP message. In other words, it is unable to properly participate in MTU negotiation.
  • "Gray-hole" hop, where a router returns the wrong MTU value in the "fragmentation needed and DF set" ICMP message. In other words, it responds with an incorrect MTU value for the constricting hop.
  • MTU conflicts, where the network exhibits behavior, including packet loss, that corresponds to an MTU conflict with one or more devices on the network path.

Other resources

RFC 791, RFC 792, RFC 1191