5.5. Analysis Tools
As
previously noted, one reason for using 
tcpdump
is the wide variety of support tools that are available for use with
tcpdump or files created with
tcpdump. There are tools for sanitizing the
data, tools for reformatting the data, and tools for presenting and
analyzing the data.
5.5.1. sanitize
If you are particularly sensitive to
privacy or security concerns, you may want to consider
sanitize, a collection of five Bourne shell
scripts that reduce or condense 
tcpdump trace
files and eliminate confidential information. The scripts renumber
host entries and select classes of packets, eliminating all others.
This has two primary uses. First, it reduces the size of the files
you must deal with, hopefully focusing your attention on a subset of
the original traffic that still contains the traffic of interest.
Second, it gives you data that can be distributed or made public (for
debugging or network analysis) without compromising individual
privacy or revealing too much specific information about your
network. Clearly, these scripts won't be useful for everyone.
But if internal policies constrain what you can reveal, these scripts
are worth looking into.
The five scripts included in
sanitize are 
sanitize-tcp,
sanitize-syn-fin,
sanitize-udp,
sanitize-encap, and
sanitize-other. Each script filters out
inappropriate traffic and reduces the remaining traffic. For example,
all non-TCP packets are removed by 
sanitize-tcp
and the remaining TCP traffic is reduced to six fields -- an
unformatted timestamp, a renumbered source address, a renumbered
destination address, the source port, the destination port, and the
number of data bytes in the packet. The packet
934303014.772066 205.153.63.30.1174 > 205.153.63.238.23: . ack 3259091394 win 8647 (DF)
                         4500 0028 b30c 4000 8006 2d84 cd99 3f1e
                         cd99 3fee 0496 0017 00ff f9b3 c241 c9c2
                         5010 21c7 e869 0000 0000 0000 0000
would be reduced to:
934303014.772066 1 2 1174 23 0
Notice that the IP numbers have been replaced with
1 and 
2, respectively. This
will be done in a consistent manner with multiple packets so you will
still be able to compare addresses within a single trace. The actual
data reported varies from script to script. Here is an example of the
syntax:
bsd1# sanitize-tcp tracefile
This runs 
sanitize-tcp over the
tcpdump trace file
tracefile. The script takes no arguments other than the name of the trace file.
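The consistent renumbering described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the way the sanitize scripts are actually written. The function names and the tuple layout are assumptions made for the example, and the second record is a hypothetical reply packet.

```python
def make_renumberer():
    """Return a function that maps each address to a small integer,
    assigned in order of first appearance (1, 2, 3, ...)."""
    table = {}

    def renumber(address):
        if address not in table:
            table[address] = len(table) + 1
        return table[address]

    return renumber


def reduce_tcp(records):
    """Reduce (timestamp, src, sport, dst, dport, nbytes) tuples to
    six-field lines with renumbered addresses."""
    renumber = make_renumberer()
    lines = []
    for ts, src, sport, dst, dport, nbytes in records:
        lines.append(
            f"{ts} {renumber(src)} {renumber(dst)} {sport} {dport} {nbytes}"
        )
    return lines


# The first record is the packet shown above; the second is hypothetical.
records = [
    ("934303014.772066", "205.153.63.30", 1174, "205.153.63.238", 23, 0),
    ("934303014.900000", "205.153.63.238", 23, "205.153.63.30", 1174, 12),
]
for line in reduce_tcp(records):
    print(line)
```

Because the lookup table persists across packets, the same host always receives the same number, which is what makes comparisons within a single trace possible.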
 
5.5.2. tcpdpriv
The
program 
tcpdpriv is another program for removing
sensitive information from 
tcpdump files. There
are several major differences between 
tcpdpriv
and 
sanitize. First, as a collection of shell scripts,
sanitize should run on almost any Unix system.
Because it is a compiled program, the same cannot be said of
tcpdpriv. On the other hand,
tcpdpriv supports the direct capture of data as
well as the analysis of existing files. The captured packets are
written as a 
tcpdump file, which can be
subsequently processed.
Also,
tcpdpriv allows you some degree of control over
how much of the original data is removed or scrambled. For example,
it is possible to have an IP address scrambled but retain its class
designation. If the 
-C4 option is chosen, an IP
address such as 
205.153.63.238 might be replaced
with 
193.0.0.2. Notice that address classes are
preserved -- a class C address is replaced with a class C address.
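To check that a scrambled address keeps its class, you only need the first octet. The following Python sketch applies the classful address ranges. The function name is an assumption made for the example, and it says nothing about how tcpdpriv itself chooses replacement addresses.

```python
def address_class(ip):
    """Return the classful designation (A through E) of a dotted-quad
    IPv4 address, based on the value of its first octet."""
    first = int(ip.split(".")[0])
    if first < 128:
        return "A"
    if first < 192:
        return "B"
    if first < 224:
        return "C"
    if first < 240:
        return "D"  # multicast
    return "E"      # reserved


# The original and replacement addresses from the text are both class C.
print(address_class("205.153.63.238"), address_class("193.0.0.2"))
```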
There are a variety of command-line options that control how data is
rewritten, several of which are mandatory. Many of the command-line
options will look familiar to 
tcpdump users. The
program does not allow output to be written to a terminal, so it must
be written directly to a file or redirected. While a useful program,
the number of required command-line options can be annoying. There is
some concern that if the options are not selected properly, it may be
possible to reconstruct the original data from the scrambled data. In
practice, this should be a minor concern.
As an example of using 
tcpdpriv, the following
command will scramble the file 
tracefile:
bsd1# tcpdpriv -P99 -C4 -M20 -r tracefile -w outfile
The 
-P99
option preserves (doesn't scramble) the port numbers,
-C4 preserves the class identity of the IP
addresses, and 
-M20 preserves multicast
addresses. If you want the data output to your terminal, you can pipe
the output to 
tcpdump:
bsd1# tcpdpriv -P99 -C4 -M20 -r tracefile -w- | tcpdump -r-
The last options look a little strange, but they will work: -w- sends the output to standard output, and -r- tells tcpdump to read from standard input.
 
5.5.3. tcpflow
Another useful tool is
tcpflow, written by Jeremy Elson. This program
allows you to capture individual TCP flows or sessions. If the
traffic you are looking at includes, say, three different Telnet
sessions, 
tcpflow will separate the traffic into
three different files so you can examine each individually. The
program can reconstruct data streams regardless of out-of-order
packets or retransmissions but does not understand fragmentation.
tcpflow
stores each flow in a separate file with names based on the source
and destination addresses and ports. For example, SSH traffic (port
22) between 
172.16.2.210 and
205.153.63.30 might have the filename
172.016.002.210.00022-205.153.063.030.01071,
where 1071 is the ephemeral port created for the session.
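The naming pattern in this example can be reproduced with a short Python sketch. It is inferred from the single filename shown above (three-digit octets and five-digit ports), so treat it as an approximation rather than a statement of tcpflow's rules. The function name is hypothetical.

```python
def flow_filename(src_ip, src_port, dst_ip, dst_port):
    """Build a flow filename in the style shown above: zero-padded
    octets and ports, source first, separated by a hyphen."""
    def pad_ip(ip):
        return ".".join(f"{int(octet):03d}" for octet in ip.split("."))

    return f"{pad_ip(src_ip)}.{src_port:05d}-{pad_ip(dst_ip)}.{dst_port:05d}"


print(flow_filename("172.16.2.210", 22, "205.153.63.30", 1071))
```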
Since 
tcpflow uses
libpcap, the same packet capture library
tcpdump uses, capture filters are constructed in
exactly the same way and with the same syntax. It can be used in a
number of ways. For example, you could see what cookies are being
sent during an HTTP session. Or you might use it to see if SSH is
really encrypting your data. Of course, you could also use it to
capture passwords or read email, so be sure to set permissions
correctly.
 
5.5.4. tcp-reduce
The program 
tcp-reduce
invokes a collection of shell scripts to reduce the packet capture
information in a 
tcpdump trace file to one-line
summaries for each connection. That is, an entire Telnet session
would be summarized by a single line. This could be extremely useful
in getting an overall picture of how the traffic over a link breaks
down or for looking quickly at very large files.
The syntax is quite simple.
bsd1# tcp-reduce tracefile > outfile
will reduce 
tracefile,
putting the output in 
outfile. The program
tcp-summary, which comes with
tcp-reduce, will further summarize the results.
For example, I briefly captured traffic on my system with
tcpdump. This process collected 741 packets.
When processed with 
tcp-reduce, this revealed 58
TCP connections. Here is an example when results were passed to
tcp-summary:
bsd1# tcp-reduce out-file | tcp-summary
This example produced the following five-line summary:
proto        # conn   KBytes    % SF % loc % ngh
-----        ------   ------    ---- ----- -----
www              56       35      25     0     0
telnet            1        1     100     0     0
pop-3             1        0     100     0     0
In this instance, this clearly shows that the HTTP traffic dominated
the local network traffic.
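The kind of per-protocol tally shown above is easy to sketch in Python. The input here is simply a list of protocol names, one per connection, built from the connection counts in the table. The real output format of tcp-reduce is not shown in this section, so the input format and the function name are assumptions.

```python
from collections import Counter


def summarize(connections):
    """Count connections per protocol. `connections` is an iterable of
    protocol names, one entry per connection (a hypothetical format)."""
    return Counter(connections)


# Connection counts taken from the summary table above.
conns = ["www"] * 56 + ["telnet"] + ["pop-3"]
summary = summarize(conns)
total = sum(summary.values())

for proto, count in summary.most_common():
    share = 100 * count / total
    print(f"{proto:<10}{count:>6}{share:>8.1f}%")
```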
 
5.5.5. tcpshow
The program 
tcpshow
decodes a 
tcpdump trace file. It represents an
alternative to using 
tcpdump to decode data. The
primary advantage of 
tcpshow is much nicer
formatting for output. For example, here is the
tcpdump output for a packet:
12:36:54.772066 sloan.lander.edu.1174 > 205.153.63.238.telnet: . ack 
3259091394 win 8647 (DF)
Here is corresponding output from 
tcpshow for
the same packet:
-----------------------------------------------------------------------
Packet 1
TIME:   12:36:54.772066
LINK:   00:10:5A:A1:E9:08 -> 00:10:5A:E3:37:0C type=IP
  IP:   sloan -> 205.153.63.238 hlen=20 TOS=00 dgramlen=40 id=B30C
        MF/DF=0/1 frag=0 TTL=128 proto=TCP cksum=2D84
 TCP:   port 1174 -> telnet seq=0016775603 ack=3259091394
        hlen=20 (data=0) UAPRSF=010000 wnd=8647 cksum=E869 urg=0
DATA:   <No data>
-----------------------------------------------------------------------
The syntax is:
bsd1# tcpshow < trace-file
There are numerous options.
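To see where the fields in this display come from, the following Python sketch decodes the raw bytes of the same packet, using the hex dump from the sanitize example earlier in this section. It is a minimal illustration that assumes a 20-byte TCP header without options. It is not how tcpshow is implemented, and the function and key names are chosen for the example.

```python
import struct

# Hex dump of the packet: IP header, TCP header, then padding.
HEX = (
    "4500 0028 b30c 4000 8006 2d84 cd99 3f1e "
    "cd99 3fee 0496 0017 00ff f9b3 c241 c9c2 "
    "5010 21c7 e869 0000 0000 0000 0000"
)
raw = bytes.fromhex(HEX.replace(" ", ""))


def decode_ip(data):
    """Decode the fixed 20-byte part of an IPv4 header."""
    (ver_ihl, tos, total_len, ident, flags_frag,
     ttl, proto, cksum, src, dst) = struct.unpack("!BBHHHBBH4s4s", data[:20])
    return {
        "hlen": (ver_ihl & 0x0F) * 4,
        "tos": tos,
        "dgramlen": total_len,
        "id": ident,
        "df": (flags_frag >> 14) & 1,
        "mf": (flags_frag >> 13) & 1,
        "frag": flags_frag & 0x1FFF,
        "ttl": ttl,
        "proto": proto,
        "cksum": cksum,
        "src": ".".join(str(b) for b in src),
        "dst": ".".join(str(b) for b in dst),
    }


def decode_tcp(data, ip_hlen):
    """Decode the fixed 20-byte part of a TCP header."""
    (sport, dport, seq, ack, offset, flags,
     window, cksum, urg) = struct.unpack("!HHIIBBHHH", data[ip_hlen:ip_hlen + 20])
    return {
        "sport": sport,
        "dport": dport,
        "seq": seq,
        "ack": ack,
        "hlen": (offset >> 4) * 4,
        "flags": format(flags & 0x3F, "06b"),  # bit order: U A P R S F
        "window": window,
        "cksum": cksum,
        "urg": urg,
    }


ip = decode_ip(raw)
tcp = decode_tcp(raw, ip["hlen"])
data_len = ip["dgramlen"] - ip["hlen"] - tcp["hlen"]

print(f"IP:  {ip['src']} -> {ip['dst']} hlen={ip['hlen']} "
      f"dgramlen={ip['dgramlen']} id={ip['id']:04X} TTL={ip['ttl']}")
print(f"TCP: port {tcp['sport']} -> {tcp['dport']} seq={tcp['seq']} "
      f"ack={tcp['ack']} UAPRSF={tcp['flags']} wnd={tcp['window']}")
print(f"DATA: {data_len} bytes")
```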
 
5.5.6. tcpslice
The program
tcpslice is a simple but useful program for
extracting pieces or merging 
tcpdump files. This
is a useful utility for managing larger 
tcpdump
files. You specify a starting time and optionally an ending time for
a file, and it extracts the corresponding records from the source
file. If multiple files are specified, it extracts packets from the
first file and then continues extracting only those packets from the
next file that have a later timestamp. This prevents duplicate
packets if you have overlapping trace files.
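The extraction rule just described can be modeled in Python as follows. This sketch works on in-memory lists of (timestamp, packet) pairs rather than real trace files, and it assumes each list is already in time order. The function name and the data are hypothetical, and the treatment of packets that fall exactly on the start or end time is a guess.

```python
def slice_and_merge(files, start, end=None):
    """Keep packets between start and end, taking from each later file
    only packets newer than the last packet already extracted."""
    out = []
    last = None  # timestamp of the last packet taken from earlier files
    for packets in files:
        for ts, pkt in packets:
            if ts < start or (end is not None and ts > end):
                continue  # outside the requested time window
            if last is not None and ts <= last:
                continue  # already covered by an earlier file
            out.append((ts, pkt))
        if out:
            last = out[-1][0]
    return out


# Two overlapping traces (hypothetical data).
first = [(1.0, "a"), (2.0, "b"), (3.0, "c")]
second = [(2.0, "b"), (3.0, "c"), (4.0, "d")]
print(slice_and_merge([first, second], start=2.0))
```

The packets at times 2.0 and 3.0 appear in both traces but are emitted only once.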
While there are a few options, the basic syntax is quite simple. For
example, consider the command:
bsd1# tcpslice 934224220.0000 in-file > out-file
This will extract all packets with
timestamps after 
934224220.0000. Note the use of
an unformatted timestamp. This is the same format displayed with the
-tt option with 
tcpdump.
Note also the use of redirection. Because it works with binary files,
tcpslice will not allow you to send output to
your terminal. See the manpage for additional options.
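Unformatted timestamps are seconds since the Unix epoch, so they are easy to convert when you need to work out which value to pass. The Python sketch below prints the equivalent UTC time. The function name is an assumption for the example, and because the result is in UTC, it may differ from the local times tcpdump displays.

```python
from datetime import datetime, timezone


def to_utc(stamp):
    """Convert an unformatted 'seconds.fraction' timestamp string to a
    timezone-aware UTC datetime, without floating-point rounding."""
    secs, _, frac = stamp.partition(".")
    micros = int((frac + "000000")[:6])  # pad or trim to microseconds
    moment = datetime.fromtimestamp(int(secs), tz=timezone.utc)
    return moment.replace(microsecond=micros)


print(to_utc("934224220.0000").isoformat())
print(to_utc("934303014.772066").isoformat())
```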
 
5.5.7. tcptrace
This program is an extremely
powerful 
tcpdump file analysis tool. The program
tcptrace is strictly an analysis tool, not a
capture program, but it works with a variety of capture file formats.
The tool's primary focus is the analysis of TCP connections. As
such, it is more of a network management tool than a packet analysis
tool. The program provides several levels of output or analysis
ranging from very brief to very detailed.
While for most purposes
tcptrace is used as a command-line tool,
it can also produce several types
of output files for plotting with the X Window program
xplot. These include 
time sequence
graphs, 
throughput graphs, and graphs
of 
round-trip times. Time sequence graphs
(
-S option) are plots of sequence numbers over
time that give a picture of the activity on the network. Throughput
graphs (
-T option), as the name implies, plot
throughput in bytes per second against time. While throughput gives a
picture of the volume of traffic on the network, round-trip times
give a better picture of the delays seen by individual connections.
Round-trip time plots (
-R option) display
individual round-trip times over time. For other graphs and graphing
options, consult the documentation.
For normal text-based operations,
there are an overwhelming number of options and possibilities. One of
the most useful is the 
-l option. This produces
a long listing of summary statistics on a connection-by-connection
basis. What follows is an example of the information provided for a
single brief Telnet connection:
TCP connection 2:
        host c:        sloan.lander.edu:1230
        host d:        205.153.63.238:23
        complete conn: yes
        first packet:  Wed Aug 11 11:23:25.151274 1999
        last packet:   Wed Aug 11 11:23:53.638124 1999
        elapsed time:  0:00:28.486850
        total packets: 160
        filename:      telnet.trace
   c->d:                              d->c:
     total packets:            96           total packets:            64
     ack pkts sent:            95           ack pkts sent:            64
     pure acks sent:           39           pure acks sent:           10
     unique bytes sent:       119           unique bytes sent:      1197
     actual data pkts:         55           actual data pkts:         52
     actual data bytes:       119           actual data bytes:      1197
     rexmt data pkts:           0           rexmt data pkts:           0
     rexmt data bytes:          0           rexmt data bytes:          0
     outoforder pkts:           0           outoforder pkts:           0
     pushed data pkts:         55           pushed data pkts:         52
     SYN/FIN pkts sent:       1/1           SYN/FIN pkts sent:       1/1
     mss requested:          1460 bytes     mss requested:          1460 bytes
     max segm size:            15 bytes     max segm size:           959 bytes
     min segm size:             1 bytes     min segm size:             1 bytes
     avg segm size:             2 bytes     avg segm size:            23 bytes
     max win adv:            8760 bytes     max win adv:           17520 bytes
     min win adv:            7563 bytes     min win adv:           17505 bytes
     zero win adv:              0 times     zero win adv:              0 times
     avg win adv:            7953 bytes     avg win adv:           17519 bytes
     initial window:           15 bytes     initial window:            3 bytes
     initial window:            1 pkts      initial window:            1 pkts
     ttl stream length:       119 bytes     ttl stream length:      1197 bytes
     missed data:               0 bytes     missed data:               0 bytes
     truncated data:            1 bytes     truncated data:         1013 bytes
     truncated packets:         1 pkts      truncated packets:         7 pkts
     data xmit time:       28.479 secs      data xmit time:       27.446 secs
     idletime max:         6508.6 ms        idletime max:         6709.0 ms
     throughput:                4 Bps       throughput:               42 Bps
This was produced by using 
tcpdump to capture
all traffic into the file 
telnet.trace and then
executing 
tcptrace to process the data. Here is
the syntax required to produce this output:
bsd1# tcptrace -l telnet.trace
Similar output is produced for each TCP connection recorded in the
trace file. Obviously, a protocol (like HTTP) that uses many
different sessions may overwhelm you with output.
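Several of the derived figures in this listing can be checked by hand. The numbers shown appear consistent with throughput being the unique bytes sent divided by the connection's elapsed time, and with average segment size being the data bytes divided by the data packets. That relationship is inferred from this one listing, not taken from the tcptrace documentation. A quick Python check:

```python
# Values copied from the listing above.
elapsed = 28.486850  # elapsed time in seconds

c2d = {"unique_bytes": 119, "data_pkts": 55}
d2c = {"unique_bytes": 1197, "data_pkts": 52}


def throughput_bps(direction, seconds):
    """Bytes per second, assuming unique bytes over elapsed time."""
    return direction["unique_bytes"] / seconds


def avg_segment(direction):
    """Average data bytes carried per data packet."""
    return direction["unique_bytes"] / direction["data_pkts"]


for name, d in (("c->d", c2d), ("d->c", d2c)):
    print(f"{name}: throughput {round(throughput_bps(d, elapsed))} Bps, "
          f"avg segment {round(avg_segment(d))} bytes")
```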
There is a lot more to this program than covered in this brief
discussion. If your primary goal is analysis of network performance
and related problems rather than individual packet analysis, this is
a very useful tool.
 
 
5.5.8. trafshow
The program 
trafshow
is a packet capture program of a different sort. It provides a
continuous display of traffic over the network, giving repeated
snapshots of traffic. It displays the source address, destination
address, protocol, and number of bytes. This program would be most
useful in looking for suspicious traffic or just getting a general
idea of network traffic.
While 
trafshow can be run on a text-based
terminal, it effectively takes over the display. It is best used in a
separate window of a windowing system. There are a number of options,
including support for packet filtering using the same filter format
as 
tcpdump.
 
5.5.9. xplot
The 
xplot program is
an X Window plotting program. While it is a general purpose plotting
program, it was written as part of a thesis project for TCP analysis
by Tim Shepard. As a result, some support for plotting TCP data
(oriented toward network analysis) is included with the package. It
is also used by 
tcptrace. While a powerful and
useful program, it is not for the faint of heart. Due to the lack of
documentation, the program is easiest to use with
tcptrace rather than as a standalone
program.
 
 
5.5.10. Other Packet Capture Programs
We have
discussed 
tcpdump in detail because it is the
most widely available packet capture program for Unix. Many
implementations of Unix have proprietary packet capture programs that
are comparable to 
tcpdump. For example, Sun
Microsystems' Solaris provides 
snoop.
(This is a replacement for 
etherfind, which was
supplied with earlier versions of the Sun operating system.)
Here is an example of using 
snoop to capture
five packets:
sol1> snoop -c5
Using device /dev/elxl (promiscuous mode)
172.16.2.210 -> sol1         TELNET C port=28863
        sol1 -> 172.16.2.210 TELNET R port=28863 /dev/elxl (promiscuo
172.16.2.210 -> sol1         TELNET C port=28863
172.16.2.210 -> sloan.lander.edu TCP D=1071 S=22     Ack=143990 Seq=3737542069 Len=60 Win=17520
sloan.lander.edu -> 172.16.2.210 TCP D=22 S=1071     Ack=3737542129 Seq=143990 Len=0 Win=7908
snoop: 5 packets captured
As you can see, it is used pretty much the same way as
tcpdump. (Actually, the output has a slightly
more readable format.) 
snoop, like
tcpdump, supports a wide range of options and
filters. You should have no trouble learning
snoop if you have ever used
tcpdump.
Other systems will provide their own
equivalents (for example, AIX provides 
iptrace
). While the syntax is different, these tools are used in much the
same way.
 