Question
I have a very large tcpdump file that I split into 1-minute intervals. I am able to use tshark in a loop to extract TCP statistics for each of the 1-minute files and save the results as a CSV file so I can perform further analysis in Excel. Now I want to count the number of TCP flows in each of the 1-minute files and save the data in a CSV file. A TCP flow here represents a group of packets going from a specific source to a specific destination. Each flow has statistics such as source IP, dest IP, #packets from A->B, #bytes from A->B, #packets from B->A, #bytes from B->A, total packets, total bytes, etc. I just want to count the number of TCP flows in each of the 1-minute files. From what I've read so far, it seems I need to create a dissector to do that. Can anyone give me pointers or code on how to get started? Thanks.
Answer 1:
TShark has a command to dump all of the necessary information: tshark -qz conv,tcp -r FILE. This writes one line per flow (plus a header and footer), so to count the flows just count the lines and subtract the header/footer.
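The counting step can be sketched in Python as follows. This is only an illustration, not part of TShark: the helper name count_tcp_flows and the sample text are made up, and the sketch assumes that every flow row contains "<->" between its two endpoints while header and footer lines do not. Check that assumption against the output of your own tshark version.

```python
def count_tcp_flows(conv_output: str) -> int:
    """Count conversation rows in the text printed by `tshark -qz conv,tcp`.

    Assumption: each flow row contains "<->" between the two endpoints,
    and no header or footer line does.
    """
    return sum(1 for line in conv_output.splitlines() if "<->" in line)


# Illustrative sample only; the real column layout may differ by version.
SAMPLE = """\
================================================================================
TCP Conversations
Filter:<No Filter>
                                |       <-      | |       ->      | |     Total     |
                                | Frames  Bytes | | Frames  Bytes | | Frames  Bytes |
10.0.0.1:50000 <-> 10.0.0.2:80       10    1000        12    2000        22    3000
10.0.0.1:50001 <-> 10.0.0.2:443       3     300         4     400         7     700
================================================================================
"""

if __name__ == "__main__":
    print(count_tcp_flows(SAMPLE))
```

To run this over the 1-minute files, one option is to capture the output of each tshark run with subprocess.run([...], capture_output=True, text=True) and write one (file name, flow count) row per file with the csv module.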
Answer 2:
Not a dissector, but a tap. See the Wireshark README.tapping document, and see the TShark iousers tap for a, sadly, not at all simple example in C.
It's also possible to write taps in Lua; see, for example, the Lua/Taps page in the Wireshark Wiki and the Lua Support in Wireshark section of the Wireshark User's Manual.
The C structure passed to TCP taps for each packet is:
    /* the tcp header structure, passed to tap listeners */
    typedef struct tcpheader {
        guint32 th_seq;
        guint32 th_ack;
        gboolean th_have_seglen; /* TRUE if th_seglen is valid */
        guint32 th_seglen;
        guint32 th_win; /* make it 32 bits so we can handle some scaling */
        guint16 th_sport;
        guint16 th_dport;
        guint8 th_hlen;
        guint16 th_flags;
        guint32 th_stream; /* this stream index field is included to help differentiate when address/port pairs are reused */
        address ip_src;
        address ip_dst;
        /* This is the absolute maximum we could find in TCP options (RFC2018, section 3) */
    #define MAX_TCP_SACK_RANGES 4
        guint8 num_sack_ranges;
        guint32 sack_left_edge[MAX_TCP_SACK_RANGES];
        guint32 sack_right_edge[MAX_TCP_SACK_RANGES];
    } tcp_info_t;
So, for C-language taps, the "data" argument to the tap listener's "packet" routine points to a structure of that sort.
For Lua taps, the "tapinfo" table passed as the third argument to the tap listener's "packet" routine is described as "a table of info based on the Listener's type, or nil.". For a TCP tap, the entries in the table include all the fields in that structure except for sack_left_edge and sack_right_edge; the keys in the table are the structure member names.
The th_stream field identifies the connection; each time the TCP dissector finds a new connection, it assigns a new value. As the comment indicates, "this stream index field is included to help differentiate when address/port pairs are reused", so that if a given connection is closed, and a later connection uses the same endpoints, the two connections have different th_stream values even though they have the same endpoints.
So you'd have a table using the th_stream value as a key. The table would store the endpoints (addresses and ports) and counts of packets and bytes in each direction. For each packet passed to the listener's "packet" routine, you'd look up the th_stream value in the table and, if you don't find it, you'd create a new entry, starting the counts off at zero, and use that new entry; otherwise, you'd use the entry you found. You'd then figure out whether the packet was going from A to B or B to A, and increase the appropriate packet count and byte count.
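The table logic described above can be sketched in plain Python, independent of Wireshark. This is not a C or Lua tap; it only shows the bookkeeping. FlowStats, update_flows and the length argument are hypothetical names, and the sketch assumes that endpoint A is whichever side sent the first packet seen for a stream and that the caller decides which byte length to count.

```python
from dataclasses import dataclass


@dataclass
class FlowStats:
    # Assumption: endpoint A is the sender of the first packet seen.
    addr_a: str
    port_a: int
    addr_b: str
    port_b: int
    packets_a_to_b: int = 0
    bytes_a_to_b: int = 0
    packets_b_to_a: int = 0
    bytes_b_to_a: int = 0


def update_flows(flows, stream, src, sport, dst, dport, length):
    """Add one packet to the table keyed by stream index (th_stream)."""
    flow = flows.get(stream)
    if flow is None:
        # First packet of this stream: create an entry with zeroed counts.
        flow = FlowStats(src, sport, dst, dport)
        flows[stream] = flow
    if (src, sport) == (flow.addr_a, flow.port_a):
        flow.packets_a_to_b += 1
        flow.bytes_a_to_b += length
    else:
        flow.packets_b_to_a += 1
        flow.bytes_b_to_a += length
    return flow


flows = {}
update_flows(flows, 0, "10.0.0.1", 50000, "10.0.0.2", 80, 60)
update_flows(flows, 0, "10.0.0.2", 80, "10.0.0.1", 50000, 1500)
# Same endpoints reused later, but a new stream index: a separate flow.
update_flows(flows, 1, "10.0.0.1", 50000, "10.0.0.2", 80, 60)

if __name__ == "__main__":
    print(len(flows))
```

The number of flows is simply the number of keys in the table, so counting flows does not require the per-direction statistics; they are kept here only because the question lists them.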
You'd also keep track of the time stamp. For the first packet, you'd store the time stamp for that packet. For each packet, you'd look at the time stamp and, if it's one minute or more later than the stored time stamp, you'd:
- dump out the statistics from the table of connections;
- empty out the table of connections;
- store the new packet's time stamp, replacing the previous stored time stamp.
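The three steps above can be sketched as a small Python class, again independent of Wireshark. MinuteFlowCounter, on_packet and flush are hypothetical names; the table here stores only a packet count per stream index, and the final flush for the last partial window is an addition not spelled out in the steps.

```python
WINDOW_SECONDS = 60.0


class MinuteFlowCounter:
    """Counts distinct stream indexes per window of at least one minute."""

    def __init__(self):
        self.window_start = None  # time stamp stored for the current window
        self.streams = {}         # stream index -> packets seen in this window
        self.rows = []            # dumped (window_start, flow_count) rows

    def on_packet(self, timestamp, stream):
        if self.window_start is None:
            # First packet: store its time stamp.
            self.window_start = timestamp
        elif timestamp - self.window_start >= WINDOW_SECONDS:
            # One minute or more has passed: dump, empty, restart the window.
            self.flush()
            self.window_start = timestamp
        self.streams[stream] = self.streams.get(stream, 0) + 1

    def flush(self):
        """Dump the current table as one row, then empty it."""
        if self.streams:
            self.rows.append((self.window_start, len(self.streams)))
        self.streams = {}


counter = MinuteFlowCounter()
for ts, stream in [(0.0, 0), (10.0, 1), (59.9, 0), (61.0, 2), (130.0, 2)]:
    counter.on_packet(ts, stream)
counter.flush()  # dump the last, partial window at end of capture

if __name__ == "__main__":
    print(counter.rows)
```

With this scheme a connection that stays active across a window boundary is counted once in every window in which it has packets, and each window starts at the time stamp of the packet that triggered the rollover rather than on a fixed minute boundary.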
Source: https://stackoverflow.com/questions/23877516/writing-a-wireshark-dissector-to-count-number-of-tcp-flows