There are a number of issues that can occur on a network path. This page describes some of those issues, things to consider, and the things that can be done about them.

APM and ISP capacity numbers differ

There are times when the network capacity numbers returned by APM do not match those from a speed test provided by your ISP. If total capacity measurements from APM are greater than you expect, this is usually because the link to your ISP is physically capable of greater speed, but your ISP has used a traffic engineering technique called ‘rate limiting’ to limit your bandwidth to the amount specified in your Service Level Agreement. Transactional and control data are usually allowed through at full capacity because they consist of short bursts of traffic, but sustained data transfers, such as streaming media, will trigger the rate limiter.
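
To see why short probes can pass untouched while sustained transfers get limited, consider a token bucket, one common rate-limiting mechanism. This is a hedged sketch only; your ISP may use a different policer or shaper, and the parameters here are illustrative:

```python
# Minimal token-bucket sketch (illustrative; real ISP shapers vary).
# Short bursts spend accumulated tokens and pass at line rate; sustained
# transfers drain the bucket and are held to the refill (SLA) rate.
import time

class TokenBucket:
    def __init__(self, rate_bps: float, burst_bits: float):
        self.rate = rate_bps        # refill rate = provisioned SLA bandwidth
        self.capacity = burst_bits  # bucket depth = tolerated burst size
        self.tokens = burst_bits
        self.last = time.monotonic()

    def allow(self, packet_bits: float) -> bool:
        now = time.monotonic()
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= packet_bits:
            self.tokens -= packet_bits
            return True   # forwarded at line rate
        return False      # dropped or queued by the rate limiter
```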

Because Continuous Path Analysis testing used by Delivery monitoring is extremely lightweight, it too might not trigger the rate limiter. As a result, you’ll see the entire capacity of the link rather than the amount your ISP has provisioned for you. To confirm this, try the following:

  1. Confirm that the speed test run by the ISP is effectively using the same source and target as your test.
  2. Use dual-ended monitoring (testing a path between two AppNeta monitoring points). Dual-ended monitoring measures network capacity in both directions (source to target and target to source), similar to speed tests. Testing each direction independently allows you to account for asymmetry in the network path. For example, upload and download rates may be different and may take different routes. Single-ended monitoring can only determine the capacity in the direction with the lowest capacity.
  3. Run PathTest. Rather than lightweight packet dispersion, PathTest generates sustained bursts of packets, which may trigger carrier shaping technologies (see the dispersion sketch following this list). For this test, set up PathTest as follows:

    1. In APM, navigate to Delivery > Path Plus.
    2. In the PathTest Settings pane:
      1. Set Protocol to UDP.
        • Network equipment treats UDP and ICMP packets differently: UDP is handled as data traffic, whereas ICMP is handled as control traffic.
      2. Set Direction to Both (Sequential).
      3. Set Duration as appropriate (default 5 seconds).
      4. Set Bandwidth to Max.
      5. Click Run Test.
  4. For cases where you need to measure capacity over time, you can use rate-limited monitoring. Rate-limited monitoring is similar to PathTest in that it loads the network while testing, but instead of a single measurement, it makes measurements at regular intervals over time. Contact AppNeta Support to enable rate-limited monitoring.
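
For illustration, the arithmetic behind lightweight dispersion-based measurement looks roughly like the following. This is a simplified sketch of the packet-pair idea; TruPath’s actual methodology is more involved than a single packet pair:

```python
# Packet-pair dispersion sketch: two back-to-back packets leave the
# bottleneck link spaced by the time it takes to serialize one packet,
# so the gap between arrivals implies the bottleneck capacity.
def capacity_from_dispersion(packet_bits: float, arrival_gap_s: float) -> float:
    """Estimate bottleneck capacity in bits per second."""
    return packet_bits / arrival_gap_s

# Example: 1500-byte packets arriving 0.12 ms apart imply ~100 Mbps.
print(capacity_from_dispersion(1500 * 8, 0.00012))
```

Because only a handful of probe packets are in flight at once, a shaper that tolerates short bursts never engages, which is why a burst-generating load test like PathTest can produce a different number.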

Capacity lower than expected

If the capacity of a network path measured by APM is lower than you expect, there are some things to keep in mind and some things you can do to verify the measurement.

Capacity and bandwidth are different

Bandwidth is the transmission rate of the physical media link between your site and your ISP; it is the number your ISP quotes you. Capacity is the end-to-end network layer measurement of a network path - from a source to a target. Link-layer headers and framing overhead reduce the rated bandwidth to a theoretical maximum capacity, and this maximum is different for every network technology. Capacity is further reduced by the fact that NICs, routers, and switches are sometimes unable to saturate the network path, so the theoretical maximum can’t be achieved. ‘Saturate’ means transmitting packets at line rate without any gaps between them. All switches can run at line rate for the length of time that a packet is being sent, but some are unable to send the next packet without a rest in between; this determines the ‘switch capacity’. APM provides a range for Total Capacity that you can expect given the physical medium and modern equipment with good switching capacity.
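
As a worked example of this arithmetic, using overhead percentages from the table below (the halving of half-duplex links is explained in the note following the table):

```python
# Worked example of the overhead math in the table below.
def theoretical_capacity(link_speed_bps: float, overhead_fraction: float) -> float:
    """Theoretical total capacity = rated link speed minus L1 + L2 overhead."""
    return link_speed_bps * (1 - overhead_fraction)

print(theoretical_capacity(1_544_000, 0.035))        # T1 (HDLC): ~1.49 Mbps
print(theoretical_capacity(100_000_000, 0.025))      # 100M full-duplex: 97.5 Mbps
print(theoretical_capacity(100_000_000, 0.025) / 2)  # 100M half-duplex: 48.75 Mbps
```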

The following table shows the expected capacity of various link types:

| Standard | Standard link speed | L1 + L2 overhead | Theoretical total capacity | Optimal total capacity |
|---|---|---|---|---|
| DS0 or ISDN | 64 Kbps | 3.9% | 61.5 Kbps | 61.5 Kbps |
| ISDN dual channel | 128 Kbps | 3.9% | 123 Kbps | 123 Kbps |
| T1 (HDLC+ATM) | 1.544 Mbps | 11.6% | 1.365 Mbps | 1.325-1.375 Mbps |
| T1 (HDLC) | 1.544 Mbps | 3.5% | 1.49 Mbps | 1.40-1.49 Mbps |
| E1 | 2.0 Mbps | 3.5% | 1.93 Mbps | 1.86-1.95 Mbps |
| T3 | 45 Mbps | 3.5% | 43.425 Mbps | 42.50-43.45 Mbps |
| 10M Ethernet half-duplex | 10 Mbps | 2.5% | 4.875 Mbps | 4.8-4.9 Mbps |
| 10M Ethernet full-duplex | 10 Mbps | 2.5% | 9.75 Mbps | 9.7-9.8 Mbps |
| 100M Ethernet half-duplex | 100 Mbps | 2.5% | 48.75 Mbps | 48.5-49.0 Mbps |
| 100M Ethernet full-duplex | 100 Mbps | 2.5% | 97.5 Mbps | 90-97.5 Mbps |
| Gigabit Ethernet full-duplex | 1 Gbps | 2.5% | 975 Mbps | 600-900 Mbps |

Note: Total capacity is based on the assumption that traffic will flow in both directions. Therefore, you can expect the total capacity for half-duplex links to be roughly half of what it would be with full-duplex.

Consider the target

Some devices make better targets than others. Choosing a good target is important in order to get good measurements.

Asymmetric links, if measured using single-ended paths, will show the capacity of the slowest of the uplink and downlink directions. This can be misleading. Measuring a link using a dual-ended path will show the capacity of each direction. If you are unsure whether you have an asymmetric link, setting up a dual-ended path (for example, to an AppNeta WAN Target) will allow you to determine this.

Persistent low capacity condition

When a low capacity condition is persistent rather than transient, it is caused by a network bottleneck, not by congestion. The bottleneck can be at any point on the path between the source and the target. To determine whether the link to your ISP is the bottleneck, create an additional network path to an AppNeta WAN Target. Using Route Analysis, confirm that the only segment the two paths have in common is the link to your ISP. If that is the case and both paths report the same capacity, the bottleneck is likely the link to your ISP. Otherwise, the bottleneck is somewhere else on the path.

Capacity chart shows no capacity

This can be due to sustained packet loss. See Packet loss is present.

Packet loss is present

Capacity is measured by sending multiple bursts of back-to-back packets every minute (as described in TruPath). To measure total capacity, at least one burst must come back with zero packet loss. If that is not the case, then the capacity measurement is skipped for that interval. If packet loss is intermittent, the result is a choppy Capacity chart. If packet loss is sustained, the Capacity chart will show no capacity while the packet loss is present.
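
The rule can be sketched as follows. This is a conceptual illustration only, with hypothetical field names; it is not TruPath’s actual implementation:

```python
# Conceptual sketch of the rule described above; burst field names and
# the aggregation choice are hypothetical, not APM internals.
def capacity_for_interval(bursts: list[dict]) -> float | None:
    """Return a total capacity figure for one measurement interval, or None.

    Each burst dict carries 'sent', 'received', and 'estimate_bps'.
    Only bursts that came back with zero loss are usable; if there are
    none, the measurement is skipped, leaving a gap in the Capacity chart.
    """
    clean = [b for b in bursts if b["received"] == b["sent"]]
    if not clean:
        return None  # sustained loss: no capacity point for this interval
    return max(b["estimate_bps"] for b in clean)
```

Intermittent loss means some intervals have a clean burst and some do not, which is what produces the choppy chart.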

Confirm with PathTest

If none of the previous subsections is applicable to your situation, you can use PathTest to corroborate the low capacity measurements. Remember that this is a load test and it measures bandwidth, not capacity.

  • If the PathTest result supports the capacity measurements, it is possible that you’re not getting the proper provisioning from your ISP.
  • If the PathTest result does not support the capacity measurements, contact AppNeta Support so we can help you investigate further.

Sustained packet loss

If a path shows sustained packet loss, review its latest diagnostic results to understand where the loss is occurring:

  • If the loss is occurring at the last hop, make sure that firewall/endpoint protection at the target allows ICMP.
  • If the loss is occurring mid-path, make sure routing policies are not de-prioritizing ICMP, and access control lists are not blocking ICMP.

You can also look at other diagnostics on the same path to check for consistency in the results - that is, whether they identify the same hop as a problem. Another option is to look at the diagnostics of other paths that use the same hop.

As issues with ICMP traffic may not be present in TCP or UDP traffic, you can set up a dual-ended path to test whether the other protocols are affected in the same way.

Oversubscription

Oversubscription is a technique your ISP uses to sell the full bandwidth of a link to multiple customers. It’s a common practice and is usually not problematic, but if it is impacting performance, you’ll see it first in your utilized capacity measurements on the Capacity chart.

The first thing you want to do is corroborate capacity measurements with round-trip time, loss, and jitter. If there are no corresponding anomalies, then whatever triggered the high utilization isn’t really impacting performance. If there are, you’ll then use Usage monitoring to check for an increase in network utilization.

High utilized capacity coupled with no increase in flow data is a classic sign of oversubscription. You should contact your ISP if this is the case.
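
That reasoning reduces to a simple heuristic, sketched below. The field names and thresholds are hypothetical, chosen for illustration rather than taken from APM:

```python
# Illustrative heuristic for the oversubscription signature described above.
def looks_like_oversubscription(utilized_capacity_pct: float,
                                flow_volume_change_pct: float) -> bool:
    """High utilized capacity with no matching rise in your own flow volume
    (from Usage monitoring) suggests contention from other subscribers."""
    return utilized_capacity_pct > 80 and flow_volume_change_pct < 10

print(looks_like_oversubscription(95, 2))   # True: contact your ISP
print(looks_like_oversubscription(95, 60))  # False: your own traffic grew
```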

Bottleneck vs. Congestion Point

The terms network bottleneck and network congestion point are often used interchangeably, but there is a distinction between them. A network bottleneck is the slowest point on a network path. Every network path has a bottleneck (for example, a low speed link). If the performance of the bottleneck is improved, the bottleneck moves to another point in the path. A congestion point, on the other hand, is a point in a network path where production traffic is backing up. Often there is congestion at the bottleneck, but not always, and there may be several congestion points on a network. A congestion point is usually a transient condition, whereas a bottleneck is not.

Mid-path device shows lower capacity than the target

When a monitoring point generates diagnostic tests on a network path, it targets each hop on the path, one at a time, to help determine whether any of the hops is performing poorly. One of the metrics calculated for each hop (and displayed on the Data Details tab of the Diagnostics page) is Total Capacity. You would expect the total end-to-end capacity to be no greater than that of the lowest capacity hop. This is true - the lowest capacity device is the bottleneck in the path - but you will often see hops showing a lower capacity number than the end-to-end capacity. The reason has to do with router architecture: traffic passing through a router is given higher priority than traffic destined for the router itself. In other words, the capacity numbers for the intermediate hops may not represent the true forwarding capacity of the device.

Determining how long a network path was in violation

To determine how long a network path was in a violation state you need to find the violation event and the corresponding clear event. You can do this in a number of ways. Some common methods are described below:

To determine how long a network path was in violation using the Events chart:

  1. Navigate to Delivery > Network Paths.
  2. Click the network path you are interested in.
  3. Click the Performance tab.
    • The network path performance page is displayed.
  4. Select the time period you are interested in.
  5. On the Events chart, hover over the violation event and then over the corresponding clear event to see the times at which they occurred.
    • The time between the two events is the time the path was in violation.

To determine how long a network path was in violation using the Events tab:

  1. Navigate to Delivery > Network Paths.
  2. Click the network path you are interested in.
  3. Click the Events tab.
    • The network path events page is displayed.
  4. Click the Event Time header to sort by event time.
  5. Find the violation and clear events you are interested in.
    • The time between the two events is the time the path was in violation.

Note: This method only allows you to look back up to 7 days.

Finally, if email notifications have been configured, you can review your email for the violation and clear events you are interested in. The time between the two events is the time the path was in violation.
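
If you script against exported event data, pairing violation and clear events is straightforward. A minimal sketch, assuming a simple list of timestamped events; the event shape here is hypothetical, not an APM API:

```python
# Pair each violation event with the next clear event and compute the
# time in violation. Event tuples are (ISO timestamp, kind), sorted by time.
from datetime import datetime

def violation_durations(events: list[tuple[str, str]]) -> list[float]:
    """Return the duration in seconds of each violation/clear pair."""
    durations, start = [], None
    for ts, kind in events:
        t = datetime.fromisoformat(ts)
        if kind == "violation" and start is None:
            start = t
        elif kind == "clear" and start is not None:
            durations.append((t - start).total_seconds())
            start = None
    return durations

print(violation_durations([
    ("2024-05-01T10:00:00", "violation"),
    ("2024-05-01T10:45:00", "clear"),
]))  # [2700.0] -> 45 minutes in violation
```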