Interpreting an MTR Trace

Note:  This is a very simplified process for MTR trace analysis that covers the most common issues and misconceptions.  We ask that you provide full MTR trace data in response to any support ticket in which it is requested.  Self-diagnosis using this article is a great first step, but our support team can often derive further information from the full data.

 

Issues on the 1st hop:  Keep in mind that, more often than not, issues appearing on the first hop are caused by link saturation.  If the site has an ADSL2 connection that is syncing at 11 Mbps and that capacity is fully used, this may manifest as high or fluctuating latency, or as packet loss on the connection.  Be sure to check the modem / router's throughput statistics before investigating other areas.
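As a rough illustration of that check, the sketch below compares a throughput reading against the sync rate.  The 10.4 Mbps reading and the 90% warning threshold are assumptions made for the example, not recommended values; read the real figure from the modem / router's statistics page.

```python
def link_utilisation(throughput_mbps: float, sync_rate_mbps: float) -> float:
    """Return the fraction of the link's sync rate currently in use."""
    if sync_rate_mbps <= 0:
        raise ValueError("sync rate must be positive")
    return throughput_mbps / sync_rate_mbps


# Assumed threshold: treat 90% or more of the sync rate as saturated.
SATURATION_THRESHOLD = 0.9

sync_rate_mbps = 11.0    # ADSL2 sync rate from the example above
throughput_mbps = 10.4   # hypothetical reading from the router

utilisation = link_utilisation(throughput_mbps, sync_rate_mbps)
print(f"Link utilisation: {utilisation:.0%}")

if utilisation >= SATURATION_THRESHOLD:
    print("Link is likely saturated; expect first-hop latency or loss.")
```

If the link turns out to be saturated, first-hop symptoms in the trace are more likely a capacity problem than a fault.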

 

 

Step 1 - Check for flow-on packet loss

 

Packet loss can appear in MTR results even when a connection is perfectly healthy.  This is particularly common on carrier hops.  Routers on the internet typically deal with anything up to millions of packets per second, so on occasion they need to prioritize how they use their resources.  ICMP packets directed at the router itself (such as the ones used in an MTR trace) are quite often treated with the lowest priority.  The router may be doing its job of forwarding traffic just fine, yet still drop direct ICMP requests, which an MTR trace records as packet loss.

 

Consider the following examples...

 

Fig 1.1 - A healthy example showing packet loss

Host                          Loss %   Snt   Last   Avg   Best   Wrst
Localrouter.localdomain       0.0%     100   0.5    0.4   0.5    0.6
Isplns.someisp                0.0%     100   40     45    40     50
Carrierhop1.carrier           0.0%     100   42     45    40     55
Carrierhop2.carrier           60.0%    100   44     45    40     55
Neuralrouter1.neural.net.au   0.0%     100   46     45    40     55
Awesomeserver.neural.net.au   0.0%     100   48     45    40     55

 

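In this example, the host Carrierhop2.carrier shows 60% packet loss, but all subsequent hops show 0% loss.  Traffic to those later hops has to pass through Carrierhop2.carrier, so the router is forwarding packets normally and is only dropping ICMP requests addressed to itself.  The connection is healthy and no action is required.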

 

A connection that does in fact have a faulty hop will look more like this...

 

Fig 1.2 - A connection with a real faulty hop

 

Host                          Loss %   Snt   Last   Avg   Best   Wrst
Localrouter.localdomain       0.0%     100   0.5    0.4   0.5    0.6
Isplns.someisp                0.0%     100   40     45    40     50
Carrierhop1.carrier           0.0%     100   42     45    40     55
Carrierhop2.carrier           60.0%    100   44     45    40     55
Neuralrouter1.neural.net.au   65.0%    100   46     45    40     55
Awesomeserver.neural.net.au   65.0%    100   48     45    40     55

 

Notice that in this example Carrierhop2.carrier shows loss, as does every hop after it, including the last hop.  By far the most important loss figure is the one on the last hop: even if multiple earlier hops show packet loss, if the last hop shows none then packet loss is probably not a problem on the connection.
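The rule above can be sketched in a few lines of Python.  This is only an illustration of the reasoning, not a tool we provide: the function name is hypothetical, and the hop data is typed in by hand from Fig 1.1 and Fig 1.2 rather than parsed from real MTR output.

```python
from typing import List, Optional, Tuple

Hop = Tuple[str, float]  # (host, loss percentage)


def first_persistent_loss_hop(hops: List[Hop], threshold: float = 0.0) -> Optional[str]:
    """Return the host where loss begins and continues to the last hop.

    Returns None when the last hop shows no loss, which is the case
    where loss on earlier hops can usually be ignored.
    """
    if not hops or hops[-1][1] <= threshold:
        return None
    # Walk backwards from the destination while earlier hops still show loss.
    start = len(hops) - 1
    while start > 0 and hops[start - 1][1] > threshold:
        start -= 1
    return hops[start][0]


fig_1_1 = [
    ("Localrouter.localdomain", 0.0),
    ("Isplns.someisp", 0.0),
    ("Carrierhop1.carrier", 0.0),
    ("Carrierhop2.carrier", 60.0),
    ("Neuralrouter1.neural.net.au", 0.0),
    ("Awesomeserver.neural.net.au", 0.0),
]
fig_1_2 = fig_1_1[:4] + [
    ("Neuralrouter1.neural.net.au", 65.0),
    ("Awesomeserver.neural.net.au", 65.0),
]

print(first_persistent_loss_hop(fig_1_1))  # None
print(first_persistent_loss_hop(fig_1_2))  # Carrierhop2.carrier
```

The check starts at the last hop on purpose: loss only matters if it reaches the destination, and walking backwards then identifies where that loss first appears.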

 

 

 

 

 

Step 2 - Check for latency fluctuations

 

Even on connections with no packet loss, latency fluctuations can cause interruptions to service, particularly on latency-sensitive services such as VoIP and faxing.

Although the average latency may be a healthy 40 to 80 ms, there may be peaks that are difficult to detect using a standard ping.  This is where the data from an MTR trace can be invaluable in identifying a periodic latency fluctuation issue.
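To see why an average can hide peaks, consider a made-up set of 100 replies in which 98 arrive in 40 ms and 2 spike to 700 ms.  The numbers are hypothetical and chosen only to show the arithmetic.

```python
from statistics import mean

# Hypothetical sample set: 98 replies at 40 ms and 2 spikes at 700 ms.
samples_ms = [40.0] * 98 + [700.0] * 2

average_ms = mean(samples_ms)  # (98 * 40 + 2 * 700) / 100
worst_ms = max(samples_ms)

print(f"Avg:  {average_ms:.1f} ms")
print(f"Wrst: {worst_ms:.1f} ms")
```

The average works out to 53.2 ms, comfortably inside the healthy range, while the worst value of 700 ms is what a VoIP call would actually notice.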

Consider the following examples...

 

 

Fig 2.1 - Fluctuation in latency (first-hop connection or contention issue)

Host                          Loss %   Snt   Last   Avg   Best   Wrst
Localrouter.localdomain       0.0%     100   0.5    0.4   0.5    0.6
Isplns.someisp                0.0%     100   40     45    40     700
Carrierhop1.carrier           0.0%     100   42     45    40     800
Carrierhop2.carrier           0.0%     100   44     45    40     800
Neuralrouter1.neural.net.au   0.0%     100   46     45    40     800
Awesomeserver.neural.net.au   0.0%     100   48     45    40     800

 

In this example, all hops beyond the first hop (the first hop being the router accessed via the LAN) have high values for "Wrst" (worst, i.e. highest, latency).  This indicates that latency on the internet connection at the site is spiking.  Regardless of the average, best and last values, if the worst value is consistently high across multiple MTR traces, there may be an issue to look into on the connection at the site itself.
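A minimal sketch of this check follows.  What counts as a "high" worst value is a judgment call; the sketch assumes a hop is spiking when its worst latency is more than five times its average, and both that ratio and the function name are illustrative assumptions.  The data is typed in by hand from Fig 2.1 and from the healthy latency figures in Fig 1.1.

```python
from typing import List, Tuple

Hop = Tuple[str, float, float]  # (host, average ms, worst ms)


def site_connection_is_spiking(hops: List[Hop], spike_ratio: float = 5.0) -> bool:
    """True when every hop beyond the LAN router shows a latency spike.

    spike_ratio is an assumed cut-off: a hop counts as spiking when its
    worst latency exceeds spike_ratio times its average latency.
    """
    beyond_lan = hops[1:]  # skip the first hop, the router on the LAN
    return bool(beyond_lan) and all(
        worst_ms > spike_ratio * avg_ms for _host, avg_ms, worst_ms in beyond_lan
    )


fig_2_1 = [
    ("Localrouter.localdomain", 0.4, 0.6),
    ("Isplns.someisp", 45, 700),
    ("Carrierhop1.carrier", 45, 800),
    ("Carrierhop2.carrier", 45, 800),
    ("Neuralrouter1.neural.net.au", 45, 800),
    ("Awesomeserver.neural.net.au", 45, 800),
]
healthy = [
    ("Localrouter.localdomain", 0.4, 0.6),
    ("Isplns.someisp", 45, 50),
    ("Carrierhop1.carrier", 45, 55),
    ("Carrierhop2.carrier", 45, 55),
    ("Neuralrouter1.neural.net.au", 45, 55),
    ("Awesomeserver.neural.net.au", 45, 55),
]

print(site_connection_is_spiking(fig_2_1))  # True
print(site_connection_is_spiking(healthy))  # False
```

A single trace is only a snapshot, so run the check against several traces before concluding that the site connection is at fault.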

 

 

Fig 2.2 - Fluctuation in latency (ISP, carrier, or Neural issue)

Host                          Loss %   Snt   Last   Avg   Best   Wrst
Localrouter.localdomain       0.0%     100   0.5    0.4   0.5    0.6
Isplns.someisp                0.0%     100   40     45    40     50
Carrierhop1.carrier           0.0%     100   42     45    40     55
Carrierhop2.carrier           0.0%     100   44     45    40     550
Neuralrouter1.neural.net.au   0.0%     100   46     45    40     550
Awesomeserver.neural.net.au   0.0%     100   48     45    40     550

 

In this example, only hops further away from the site connection show high worst-case latency.  In this instance you'll need to escalate the issue to whoever operates the first hop showing the high values.  For example, if the hop belongs to your ISP, they will need to be advised.  If it is further on (a carrier hop, or a hop on the Control Networks' network), you will need to advise us via our support channels.
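The escalation rule can be expressed as a simple lookup on the hostname of the first affected hop.  The suffixes below are the placeholder names used in the figures in this article and are assumptions for illustration only; real ISP and carrier hostnames will differ.

```python
from typing import List, Tuple

# Placeholder suffixes taken from the example figures, not real hostnames.
OWNERS: List[Tuple[str, str]] = [
    (".localdomain", "check the site's own network"),
    (".someisp", "advise your ISP"),
    (".carrier", "advise us via our support channels"),
    (".neural.net.au", "advise us via our support channels"),
]


def escalation_action(first_affected_host: str) -> str:
    """Map the first hop showing high worst latency to the party to contact."""
    host = first_affected_host.lower()
    for suffix, action in OWNERS:
        if host.endswith(suffix):
            return action
    return "owner unknown: send the full trace to support"


# In Fig 2.2 the first hop with a high Wrst value is Carrierhop2.carrier.
print(escalation_action("Carrierhop2.carrier"))
```

For Fig 2.2 this prints "advise us via our support channels", because the first affected hop sits on a carrier rather than on the ISP.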

 

 

These are the most common issues that can be identified with an MTR trace.  This is by no means an exhaustive list, nor will this guidance be accurate in every case.  If in doubt, or if requested by our support team, please be sure to send the MTR trace through in reply to your support ticket, or open a new one via ithelpdesk@controlnetworks.com.au.
