Using MRTG to Monitor and Graph Traffic Loads
[LEFT][CODE]http://www.cisco.com/en/US/docs/ios/internetwrk_solutions_guides/splob/guides/dial/dial_nms/mrtg.html[/CODE]
[B][SIZE=3]Task 3: Using MRTG to Monitor and Graph Traffic Loads[/SIZE][/B]
[B] About MRTG [/B]
Multi Router Traffic Grapher (MRTG) is a free performance management application for Unix; it monitors SNMP statistics from any SNMP-capable device on your network and:
• Captures, stores, and graphically presents SNMP data. By default, MRTG creates a web page with four graphs per MIB object (OID). The graphs show the variation of MIB data over time.
• Runs from the crontab. Every five minutes, a cron job runs MRTG to query a user-configured list of OIDs and network devices. After each data collection cycle, the MRTG Perl script posts updated graphs to a web page.
• Efficiently compresses and archives data samples to create graphs.
• Enables you to determine if trending data is useful for monitoring your environment before you invest in costly network performance software. If trending data is critical to managing your network, it may be necessary to purchase a commercial network performance package, such as Concord Network Health. However, you may find that MRTG is all you need.
• Is available from [URL="http://ee-staff.ethz.ch/%7Eoetiker/webtools/mrtg/mrtg.html"]the MRTG home page[/URL].
Figure 10
[LEFT][IMG]http://www.cisco.com/en/US/i/000001-100000/35001-40000/35001-35500/35193.jpg[/IMG][/LEFT]
MRTG Polls for OIDs; OID Values that Are Returned to MRTG
For each OID referenced in the configuration file, MRTG creates the following graphs:
• [B]Daily graph[/B]: 5-minute average data points with approximately 33 hours of data presented.
• [B]Weekly graph[/B]: 30-minute average data points with approximately 8 days of data presented.
• [B]Monthly graph[/B]: 2-hour average data points with approximately 5 weeks of data presented.
• [B]Yearly graph[/B]: 1-day average data points with approximately 1 year of data presented.
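The intervals and time spans in the list above imply that every graph is drawn from a similar number of averaged points. The following is only a quick sketch of that arithmetic. The names [ICODE]GRAPHS[/ICODE], [ICODE]points_per_graph[/ICODE], and [ICODE]POINTS[/ICODE] are made up for illustration, and the yearly span is assumed to be 365 days.

```python
# Approximate number of averaged data points behind each MRTG graph,
# derived only from the interval and span figures quoted in the list above.
GRAPHS = {
    # name: (averaging interval in minutes, approximate span in hours)
    "daily": (5, 33),
    "weekly": (30, 8 * 24),
    "monthly": (2 * 60, 5 * 7 * 24),
    "yearly": (24 * 60, 365 * 24),  # assumption: "1 year" taken as 365 days
}


def points_per_graph(interval_minutes: int, span_hours: int) -> int:
    """Return how many averaged samples fit in the given time span."""
    return (span_hours * 60) // interval_minutes


POINTS = {name: points_per_graph(*spec) for name, spec in GRAPHS.items()}

if __name__ == "__main__":
    for name, count in POINTS.items():
        print(f"{name}: about {count} points")
```

Under these assumptions each graph works out to somewhere between 365 and 420 points, no matter how long a period it covers.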
MRTG uses the GD graphics library to create the graph images quickly. The library is available from [URL="http://www.boutell.com/gd"]GD Graphics Library[/URL].
[B] About Selecting Dial OIDs [/B]
To select which dial OIDs to query when monitoring dial-up activity, see the OIDs listed in the following tables:
• Circuit utilization OIDs ([URL="http://www.cisco.com/en/US/docs/ios/internetwrk_solutions_guides/splob/guides/dial/dial_nms/mrtg.html#wp1038866"]Table 14[/URL])
• Modem information OIDs ([URL="http://www.cisco.com/en/US/docs/ios/internetwrk_solutions_guides/splob/guides/dial/dial_nms/mrtg.html#wp1038926"]Table 15[/URL])
• User information OIDs ([URL="http://www.cisco.com/en/US/docs/ios/internetwrk_solutions_guides/splob/guides/dial/dial_nms/mrtg.html#wp1038998"]Table 16[/URL])
[B]Caution:[/B] Be cautious when polling network elements. Polling OIDs that retrieve large amounts of data can cause CPU problems on a Cisco IOS device. For example, do not get the ARP table, walk large portions of a MIB tree, poll the wrong OID too frequently, or get statistics that have an entry for every interface. The cost of a per-interface poll depends heavily on the platform: a Cisco 7200 may have 10 interfaces, whereas a Cisco AS5800 may have 3,000.
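To put the interface counts in the caution into perspective, here is a rough back-of-the-envelope sketch. It assumes the five-minute MRTG polling cycle described earlier and one polled value per interface. The function name [ICODE]values_per_day[/ICODE] and its parameters are illustrative only.

```python
POLL_INTERVAL_MINUTES = 5  # the MRTG cron interval described in this document


def values_per_day(interfaces: int,
                   per_interface_oids: int = 1,
                   interval_minutes: int = POLL_INTERVAL_MINUTES) -> int:
    """Estimate how many SNMP values a device must return per day
    when every interface is polled on every cycle."""
    polls_per_day = (24 * 60) // interval_minutes
    return interfaces * per_interface_oids * polls_per_day


if __name__ == "__main__":
    print("10 interfaces:  ", values_per_day(10))
    print("3000 interfaces:", values_per_day(3000))
```

With a single per-interface OID, the 3,000-interface case returns 300 times as many values per day as the 10-interface case, which is why per-interface statistics deserve extra care on large access servers.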
In this case study, the tools UCD-SNMP and SNMP Commander were used to inspect and understand the MIBs. Based on this research, the network engineers at THEnet identified the OIDs in the following tables to program into MRTG.
To see the complete structure of the CISCO-POP-MGMT-MIB and CISCO-MODEM-MGMT-MIB, see the [URL="http://www.cisco.com/en/US/docs/ios/internetwrk_solutions_guides/splob/guides/dial/dial_nms/mrtg.html#wp1037978"]Appendix[/URL] at the end of this document. For further MIB information, refer to the following links:
• For descriptions of supported MIBs and how to use MIBs, see the Cisco MIB web site on Cisco.com at the following URL:
[URL="http://www.cisco.com/public/sw-center/netmgmt/cmtk/mibs.shtml"]Network Management Software[/URL].
• To obtain lists of MIBs supported by platform and Cisco IOS release and to download MIB modules, go to the Cisco MIB web site on Cisco.com at the following URL:
[URL="http://www.cisco.com/public/sw-center/netmgmt/cmtk/mibs.shtml"]Network Management Software[/URL].
[LEFT] Table 14 Circuit Utilization OIDs
[B]Variable | Base MIB | OID | Description[/B]
Analog calls | CISCO-POP-MGMT-MIB | 1.3.6.1.4.1.9.10.19.1.1.2 | The number of analog calls connected.
Active DS0s | CISCO-POP-MGMT-MIB | 1.3.6.1.4.1.9.10.19.1.1.4 | The total number of calls connected.
Call count | CISCO-POP-MGMT-MIB | 1.3.6.1.4.1.9.10.19.1.1.1.1.7 | The number of calls that have occupied a specific DS0.
Time in use | CISCO-POP-MGMT-MIB | 1.3.6.1.4.1.9.10.19.1.1.1.1.8 | The amount of time each DS0 has been in use.
PPP calls | CISCO-POP-MGMT-MIB | 1.3.6.1.4.1.9.10.19.1.1.5 | The number of active PPP calls.
DS0 high water mark | CISCO-POP-MGMT-MIB | 1.3.6.1.4.1.9.10.19.1.1.8 | The maximum number of DS0s ever used simultaneously.
[/LEFT]
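As a rough illustration of how two of the Table 14 OIDs could be programmed into MRTG, the sketch below builds one target stanza for an MRTG configuration file. It is not the configuration used in the case study. The target name, host name, community string, and maximum value are hypothetical. The [ICODE].0[/ICODE] instance suffix assumes that both objects are scalars, which you should confirm with an SNMP get against your own access server before relying on it.

```python
# OIDs copied from Table 14.
ACTIVE_DS0S = "1.3.6.1.4.1.9.10.19.1.1.4"
PPP_CALLS = "1.3.6.1.4.1.9.10.19.1.1.5"


def mrtg_target(name: str, host: str, community: str,
                oid_in: str, oid_out: str,
                max_value: int, title: str) -> str:
    """Build one MRTG target stanza that graphs two gauge-style OIDs.

    Assumption: both OIDs are scalar objects, so ".0" is appended
    as the instance identifier.
    """
    lines = [
        # MRTG's explicit-OID target form: OID_1&OID_2:community@host
        f"Target[{name}]: {oid_in}.0&{oid_out}.0:{community}@{host}",
        f"MaxBytes[{name}]: {max_value}",
        # gauge: values are current levels, not ever-increasing counters
        f"Options[{name}]: gauge, nopercent",
        f"Title[{name}]: {title}",
        f"PageTop[{name}]: <H1>{title}</H1>",
        f"YLegend[{name}]: Calls",
        f"LegendI[{name}]: DS0s in use",
        f"LegendO[{name}]: PPP sessions",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # All argument values below are placeholders, not values from the case study.
    print(mrtg_target("nas-1", "nas1.example.net", "public",
                      ACTIVE_DS0S, PPP_CALLS, 96,
                      "DS0s and PPP Sessions in Use"))
```

The printed stanza would be pasted into the configuration file that the cron job passes to MRTG. The gauge option is chosen here because both values are levels rather than cumulative counters.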
[LEFT] Table 15 Modem Information OIDs
[B]Variable | Base MIB | OID | Description[/B]
Modems available | CISCO-MODEM-MGMT-MIB | 1.3.6.1.4.1.9.9.47.1.1.7 | The number of modems currently available to take calls.
Average call duration | CISCO-MODEM-MGMT-MIB | 1.3.6.1.4.1.9.9.47.1.3.1.1.9 | The average call duration for each modem in the NAS.
No answers | CISCO-MODEM-MGMT-MIB | 1.3.6.1.4.1.9.9.47.1.3.3.1.1 | The number of calls not answered by a modem.
Failed train | CISCO-MODEM-MGMT-MIB | 1.3.6.1.4.1.9.9.47.1.3.3.1.2 | The number of modem calls that failed to train up. It's normal behavior for most modems to not have a 100 percent success rate.
Successful train | CISCO-MODEM-MGMT-MIB | 1.3.6.1.4.1.9.9.47.1.3.3.1.3 | The number of modem calls that successfully trained up. It's normal for most modems to not have a 100 percent success rate.
TX speed | CISCO-MODEM-MGMT-MIB | 1.3.6.1.4.1.9.9.47.1.3.1.1.14 | The current transmit speed (TX) of all the modems in the NAS. If a modem does not have an active call, zero is returned.
RX speed | CISCO-MODEM-MGMT-MIB | 1.3.6.1.4.1.9.9.47.1.3.1.1.15 | The current receive speed (RX) of all the modems in the NAS. If a modem does not have an active call, zero is returned.
[/LEFT]
[LEFT] Table 16 User Information OIDs
[B]Variable | Base MIB | OID | Description[/B]
Active user ID | CISCO-POP-MGMT-MIB | .1.3.6.1.4.1.9.10.19.1.3.1.1.3 | List of users currently connected and authenticated.
Active call duration | CISCO-POP-MGMT-MIB | .1.3.6.1.4.1.9.10.19.1.3.1.1.8 | Call durations for currently connected and authenticated users.
User CLID | CISCO-POP-MGMT-MIB | .1.3.6.1.4.1.9.10.19.1.3.1.1.2 | List of user Caller IDs (CLID).
DNIS phone number | CISCO-POP-MGMT-MIB | .1.3.6.1.4.1.9.10.19.1.3.1.1.13 | List of called Dialed Number Information Service (DNIS) phone numbers.
Active TTY | CISCO-POP-MGMT-MIB | .1.3.6.1.4.1.9.10.19.1.3.1.1.14 | List of asynchronous terminal lines (TTY) in use.
Active modem slot | CISCO-POP-MGMT-MIB | .1.3.6.1.4.1.9.10.19.1.3.1.1.6 | List of which user is using which modem slot.
Active modem port | CISCO-POP-MGMT-MIB | .1.3.6.1.4.1.9.10.19.1.3.1.1.7 | List of which user is using which modem port.
Active user IP | CISCO-POP-MGMT-MIB | .1.3.6.1.4.1.9.10.19.1.3.1.1.4 | List of which IP addresses are currently in use.
[/LEFT]
[B] How to Inspect and Interpret Data [/B]
Internet users spend approximately 80 percent of their time reading information, not downloading data. Because people cannot read as fast as modems can download, modem traffic is very limited on a per-user basis. Therefore, watch for the following types of trends and performance data on the access servers:
• PPP sessions in use.
• DS0s in use.
• Modem calls that have been rejected.
• The number of calls coming into the access server and at what time.
• Spikes or dips in total calls connected outside the normal call pattern.
• Long-term trends that may mean that you need to upgrade components in your network.
• Throughput that has been reduced to unacceptable levels (potential bottlenecks).
• Drops in the primary data path and jumps in the backup path when failover events and routing swaps occur (for disaster recovery purposes).
• The utilization of the IP backbone, such as a Frame Relay link or Ethernet campus.
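MRTG itself only draws the graphs, so spotting spikes or dips outside the normal call pattern is normally done by eye. If you export the samples, a very simple check along the following lines could flag them. This is only a sketch. The function name, the 25 percent tolerance, and the idea of comparing against a baseline taken from the same time slots on a typical day are all assumptions, not part of MRTG or of the case study.

```python
from typing import List, Sequence


def out_of_pattern(baseline: Sequence[float],
                   observed: Sequence[float],
                   tolerance: float = 0.25) -> List[int]:
    """Return the indexes of samples that deviate from the baseline
    by more than the given fraction of the baseline value.

    A floor of one call is applied so that quiet periods with a
    near-zero baseline do not flag every tiny change.
    """
    flagged = []
    for index, (expected, actual) in enumerate(zip(baseline, observed)):
        allowed = max(1.0, expected * tolerance)
        if abs(actual - expected) > allowed:
            flagged.append(index)
    return flagged


if __name__ == "__main__":
    # Hypothetical "calls connected" samples for four time slots.
    usual = [40, 60, 80, 60]
    today = [42, 20, 81, 95]
    print(out_of_pattern(usual, today))
```

In the sample data, the second slot is a dip and the fourth is a spike, while the other two stay within the tolerance.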
The Connection Success Rate (CSR) is an important metric for tracking and measuring the stability of a dial service. The CSR is defined by the number of modems that successfully train up and go into the connected state. In addition to the CSR, you must track and analyze additional areas. For example, SNMP MIBs can be used to measure the success rate for items such as PPP, AAA, and IP negotiation.
To collect the CSR service level counters, inspect the connection success and failure rate by using modem OIDs or the [B]show modem[/B] Cisco IOS command. SNMP, rather than the Cisco IOS CLI, is the preferred method to collect these counters. SNMP can scale to support large numbers of access servers.
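One way to turn the modem counters into a rate is sketched below. It assumes that the "Successful train" and "Failed Train" values from Table 15 are cumulative counters already summed across all modems in the access server, and that a connection attempt is counted as either a successful or a failed train. Both are assumptions for illustration, and the class and function names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TrainCounters:
    """One poll of the train counters, summed over all modems (assumed)."""
    successful: int  # "Successful train" from Table 15
    failed: int      # "Failed Train" from Table 15


def connection_success_rate(previous: TrainCounters,
                            current: TrainCounters) -> Optional[float]:
    """Return the fraction of calls that trained up between two polls.

    Returns None when there were no attempts in the interval or when a
    counter went backwards (for example, after a reload).
    """
    ok = current.successful - previous.successful
    bad = current.failed - previous.failed
    if ok < 0 or bad < 0:
        return None
    attempts = ok + bad
    if attempts == 0:
        return None
    return ok / attempts


if __name__ == "__main__":
    # Hypothetical counter values from two consecutive polls.
    earlier = TrainCounters(successful=900, failed=100)
    later = TrainCounters(successful=990, failed=110)
    print(connection_success_rate(earlier, later))
```

Working from the difference between two polls, rather than from the raw totals, gives the success rate for that interval only, which is what you would want to trend over time.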
The following graphs show the DS0s and PPP sessions in use for 70,000 modem users calling in to a dial-up service at a large university. The graphs are taken from one Cisco AS5300 in a large dial-up modem pool.
Figure 11
[LEFT][IMG]http://www.cisco.com/en/US/i/000001-100000/35001-40000/35001-35500/35106.jpg[/IMG][/LEFT]
Daily Graph: DS0s and PPP Sessions in Use
The jagged saw-tooth pattern at the top of the graph indicates a telephone-switch hunt group for the dial lines passing by the access servers. A "jump up" occurs each time the hunt group passes by a different T1 line. For a hunt group that rotates in a round-robin fashion, a jagged saw-tooth pattern is normal.
Figure 12
[LEFT][IMG]http://www.cisco.com/en/US/i/000001-100000/35001-40000/35001-35500/35107.jpg[/IMG][/LEFT]
Weekly Graph: DS0s and PPP Sessions in Use
Figure 13
[LEFT][IMG]http://www.cisco.com/en/US/i/000001-100000/35001-40000/35001-35500/35108.jpg[/IMG][/LEFT]
Monthly Graph: DS0s and PPP Sessions in Use
MRTG efficiently compresses and archives data to create graphs. For example, you can keep information for an entire year on a server without using much disk space.
Figure 14
[LEFT][IMG]http://www.cisco.com/en/US/i/000001-100000/35001-40000/35001-35500/35109.jpg[/IMG][/LEFT]
Yearly Graph: DS0s and PPP Sessions in Use [/LEFT]