Network Monitoring

Cornell Network Research Group - Cornell Computer Science

Motivation for this research:

Existing network management tools do not adequately monitor network components. Many events can wreak havoc on a network. For example, on election day, many web users visit election results sites, causing unusual traffic flows through the network. A one-time event like this could bring a network to its knees. The goal of this project is to monitor networks, detect when components are not running normally, and alert the network administrator. It is unreasonable to expect a network administrator to check every node on a network, so it is advantageous to alert the administrator to nodes that are acting abnormally.

This work will use scripts to log various network statistics and will develop heuristics to define network stability. Eventually, it will be tied into the work on automatic topology discovery to create tools for a next-generation network management system.

Design goals and implementation:

The network monitoring tools were designed with portability and modularity in mind. The scripts are written in Perl and run under UNIX and Windows NT. The data gathering script is separate from the presentation scripts, so new views of the data can be added easily. This separation also lets the tools be used both with simulators, such as Real, and with end nodes on the Internet, which are heterogeneous. This modularity allows for easy expansion in the future.

The data gathering script uses SNMP to query nodes on a network. Any node can be monitored as long as it supports SNMP. Nodes used in this project include routers and computers running Windows NT Server. The script reads a list of nodes to monitor and, for each node, an input file of MIB entries to monitor on that node. It then queries the node over SNMP for each of the MIB entries and stores all of the data in a file using a predefined format. The file is named in the format: switching element-monthdayyear-hourmin.dat. Initially, the nodes are queried at a predetermined interval. Currently, the default interval is 5 minutes.
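
The gathering step described above can be sketched roughly as follows. This is an illustration in Python, not the project's Perl scripts. The `query` callback stands in for the actual SNMP request, and the two-digit month, day, and year, the 24-hour clock, the tab-separated record layout, and every function name here are assumptions, since the page does not spell out the real format.

```python
from datetime import datetime


def data_file_name(node, when):
    """Build '<switching element>-<monthdayyear>-<hourmin>.dat'.

    Assumption: two-digit month, day, and year, and a 24-hour clock.
    """
    return f"{node}-{when:%m%d%y}-{when:%H%M}.dat"


def poll_node(node, mib_entries, query, when):
    """Query one node for each MIB entry and return the records to store.

    `query(node, entry)` is a hypothetical stand-in for the SNMP request.
    The record layout (timestamp, entry, value) is also an assumption.
    """
    records = []
    for entry in mib_entries:
        value = query(node, entry)
        records.append(f"{when:%Y-%m-%d %H:%M}\t{entry}\t{value}")
    return records
```

With a poll taken on March 5, 1998 at 14:30, `data_file_name("csgate4", ...)` would produce `csgate4-030598-1430.dat` under these assumptions. Keeping the SNMP call behind a callback mirrors the separation the page describes: the same loop could be fed by a real router or by a simulator.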

Once the data has been collected, other scripts parse and analyze it. For each hour, they compute the min, the max, the mean, and the variance of a given MIB entry. If the mean for a given hour differs from the average value over time by more than a given threshold (currently twice the standard deviation), a warning message is written to the log file and the script changes the interval at which the node is monitored.
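
The hourly statistics and the threshold test can be sketched as below. This is a hedged illustration in Python, not the project's analysis script. It assumes the population variance, and it assumes that "the average value over time" means the mean of the earlier hourly means. The page does not say which variance or baseline the scripts actually use, and the function names are hypothetical.

```python
from math import sqrt
from statistics import mean, pvariance


def hourly_summary(samples):
    """Return min, max, mean, and variance for one hour of samples."""
    return {
        "min": min(samples),
        "max": max(samples),
        "mean": mean(samples),
        "variance": pvariance(samples),  # assumption: population variance
    }


def is_abnormal(hour_mean, history_means, k=2.0):
    """Flag an hour whose mean is more than k standard deviations
    away from the long-run average of earlier hourly means."""
    baseline = mean(history_means)
    spread = sqrt(pvariance(history_means))
    return abs(hour_mean - baseline) > k * spread
```

For example, with earlier hourly means of 100, 110, 90, 105, and 95, the baseline is 100 and the standard deviation is about 7.07, so an hour averaging 130 would be flagged and an hour averaging 108 would not.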

Other scripts parse the data files for presentation. Currently, data can be converted to HTML or to gnuplot graphs stored as GIFs.

Data Gathered:

Below are daily HTML summaries of the in and out load, measured as the number of bytes coming into and going out of a router. The routers surveyed lie along the computer science department's route through the Cornell network to the Cornell gateway.

CSGate4.cs.cornell.edu

hol1-mss.cit.cornell.edu

core2-mss.cit.cornell.edu

upson2-np.cit.cornell.edu

cornellnet1.cit.cornell.edu

cornellnet2.cit.cornell.edu

The directory (\\thelonious\wwwroot\cnrg\netmon\old-data) contains tarred and gzipped data files collected since March.

Old Data Gathered:

The graphs below show various statistics plotted for different routers within the computer science department of Cornell and within the Cornell network. The graphs show data that was polled at 5-minute intervals using SNMP early in this project. These graphs were collected while the graphing tools were being created. The tools can still make this type of graph, but the current work above shows summaries of the bytes coming in and out of routers.

The router csgate2.cs.cornell.edu:

Number of bytes transmitted: (2/19-23/98)

Number of bytes received: (2/19-23/98)

Number of UDP datagrams received: (2/19-23/98)

Number of UDP datagrams transmitted: (2/19-23/98)

Number of incoming TCP segments: (2/19-23/98)

Number of outgoing TCP segments: (2/19-23/98)

The router csgate4.cs.cornell.edu:

Number of bytes transmitted: (2/7-12/98), (2/27-3/2/98), (3/5-9/98)

Number of bytes received: (2/7-12/98), (2/27-3/2/98), (3/5-9/98)

Number of UDP datagrams received: (2/27-3/2/98), (3/5-9/98)

Number of UDP datagrams transmitted: (2/27-3/2/98), (3/5-9/98)

The router core2-mss.cit.cornell.edu:

Number of bytes transmitted: (3/4-9/98)

Number of bytes received: (3/4-9/98)

Number of unicast packets received: (3/4-9/98)

Number of unicast packets transmitted: (3/4-9/98)

Number of incoming UDP packets: (3/4-9/98)

Number of outgoing UDP packets: (3/4-9/98)

The router hol1-mss.cit.cornell.edu:

Number of bytes transmitted: (3/4-9/98)

Number of bytes received: (3/4-9/98)

Number of unicast packets received: (3/4-9/98)

Number of unicast packets transmitted: (3/4-9/98)

Number of incoming UDP packets: (3/4-9/98)

Number of outgoing UDP packets: (3/4-9/98)

Conclusions reached from this work:

Traffic patterns on most routers are periodic: traffic drops off after midnight, picks up again in the morning, and dips at lunch and again at dinner.

When traffic gets heavy on a router, most of the SNMP requests are dropped, making it difficult to determine any problems with the router. When this occurs, other means of monitoring a network node are needed.
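
One way to at least notice this condition from the collected data is to measure how many scheduled polls went unanswered. The sketch below is a Python illustration and is not part of the project's scripts. It assumes each stored sample is stamped with its scheduled poll time in epoch seconds, and the function name and parameters are hypothetical.

```python
def poll_loss_fraction(received, start, end, interval=300):
    """Fraction of expected polls in [start, end) that got no reply.

    `received` holds the scheduled times (epoch seconds) of the polls
    that were answered. `interval` defaults to the 5-minute poll period.
    """
    expected = list(range(start, end, interval))
    if not expected:
        return 0.0
    answered = set(received)
    missing = [t for t in expected if t not in answered]
    return len(missing) / len(expected)
```

An hour with 12 scheduled polls and only 3 replies gives a loss fraction of 0.75, which could itself be logged as a sign that the router is overloaded.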

PowerPoint slides and other non-HTML work:

Some of the documents here were prepared for non-web purposes. Because of the move to the web, some of them might be viewable within the browser. This is true of PowerPoint slides in Internet Explorer.

Installation Guide/Readme in Word Format

PowerPoint slides used for a 10-minute talk on May 6, 1998

PowerPoint slides used in the Cornell CS fair

Visio Diagram of the CS Systems lab

Access Database of the switch connections in the CS Systems lab

Interesting links:

Here are some links that have been helpful in doing this project:

Internet Traffic Report for MCI's backbone.

A paper titled "Monitoring your network with freely available statistics reporting tools."

Cooperative Association For Internet Data Analysis.

The Perl Language Home Page.

People working on this project:

This project is being developed under the supervision of Dr. S. Keshav and Rosen Sharma. The people working on this project are:

Russell Schwager, russells@cs.cornell.edu

Credits:

Russell Schwager
russells@cs.cornell.edu
May 16, 1998