Question
I am trying to analyse a file containing packets captured with tcpdump. I first want to categorize the packets into flows by 5-tuple. Then I need the size and inter-arrival time of each packet in each flow. I tried the Conversations list in Wireshark, but it gives only the number of packets in each flow, not information about each packet in the flow. Can anyone suggest code (C++ or a shell script) that can do the job? Thank you.
Answer 1:
UmNyobe,
If you haven't heard of Scapy yet, I believe what you are trying to do would be a near-perfect fit for it. For example, I wrote this little snippet with Scapy to parse a pcap file and give me something like what you are talking about.
#!/usr/bin/env python3
'''Parse PCAP files into easy to read NETFLOW like output

Usage:
    python cap2netflow.py <[ pcap filename or -l ]>
    -l is live capture switch
ICMP packets print as source ip, type --> dest ip, code'''
import sys
from datetime import datetime

# The usual Scapy names are exported by scapy.all, not by the package root
from scapy.all import ICMP, IP, TCP, UDP, rdpcap, sniff

PROTO_NAMES = {6: 'TCP', 17: 'UDP', 1: 'ICMP'}


def parse_netflow(pkt):
    # grabs 'netflow-esque' fields from packets in a PCAP file
    ip = pkt.getlayer(IP)
    if ip is None:
        return  # not an IPv4 packet (ARP, IPv6, ...), nothing to print
    proto = PROTO_NAMES.get(ip.proto)
    # float() because recent Scapy versions store pkt.time as a Decimal-like value
    snifftime = datetime.fromtimestamp(float(pkt.time)).strftime('%H:%M:%S')
    if proto in ('TCP', 'UDP'):
        l4 = pkt.getlayer(TCP if proto == 'TCP' else UDP)
        if l4 is None:
            return  # e.g. an IP fragment without a transport header
        print(' '.join([snifftime, proto.rjust(4), str(ip.src).rjust(15),
                        str(l4.sport).rjust(5), '-->',
                        str(ip.dst).rjust(15), str(l4.dport).rjust(5)]))
    elif proto == 'ICMP':
        icmp = pkt.getlayer(ICMP)
        if icmp is None:
            return
        print(' '.join([snifftime, 'ICMP', str(ip.src).rjust(15),
                        ('t: ' + str(icmp.type)).rjust(5), '-->',
                        str(ip.dst).rjust(15), ('c: ' + str(icmp.code)).rjust(5)]))


if '-l' in sys.argv:
    sniff(prn=parse_netflow)  # live capture, usually needs root privileges
else:
    pkts = rdpcap(sys.argv[1])
    print(' '.join(['Date: ', datetime.fromtimestamp(float(pkts[0].time)).strftime('%Y-%m-%d')]))
    for pkt in pkts:
        parse_netflow(pkt)
Install Python and Scapy, then use this to get you started. Let me know if you need any assistance figuring it all out. If you know C++, chances are this will already make a lot of sense to you.
Get Scapy here
http://www.secdev.org/projects/scapy/
There are tons of links on this page to helpful tutorials. Keep in mind that Scapy does a lot more, but home in on the areas that talk about pcap parsing.
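The script above prints one line per packet but does not yet group packets into flows. Here is a minimal sketch of that grouping step in plain Python. It assumes each packet has already been reduced to a `(timestamp, src, sport, dst, dport, proto, size)` record; with Scapy, values such as `float(pkt.time)`, `pkt[IP].src` and `len(pkt)` could supply them. The function name `flow_stats`, the record layout and the sample data are made up for illustration, and a flow here means one direction of a 5-tuple.

```python
from collections import defaultdict


def flow_stats(records):
    """Group packet records by 5-tuple.

    Returns {flow_key: [(size, inter_arrival), ...]}; the first packet of
    each flow has an inter-arrival time of None.
    """
    flows = defaultdict(list)
    last_seen = {}
    # Sort by timestamp so gaps are computed in arrival order.
    for ts, src, sport, dst, dport, proto, size in sorted(records, key=lambda r: r[0]):
        key = (src, sport, dst, dport, proto)
        gap = ts - last_seen[key] if key in last_seen else None
        last_seen[key] = ts
        flows[key].append((size, gap))
    return dict(flows)


# Hypothetical sample records: (timestamp, src, sport, dst, dport, proto, size)
sample = [
    (10.0, "10.0.0.1", 1234, "10.0.0.2", 80, "TCP", 60),
    (10.5, "10.0.0.1", 1234, "10.0.0.2", 80, "TCP", 1500),
    (10.7, "10.0.0.3", 5353, "10.0.0.2", 53, "UDP", 90),
    (12.0, "10.0.0.1", 1234, "10.0.0.2", 80, "TCP", 40),
]

for key, packets in flow_stats(sample).items():
    print(key, packets)
```

If both directions of a conversation should count as one flow, the key could be built from the sorted pair of endpoints instead.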
I hope this helps!
dc
Answer 2:
I worked on a library for analyzing tcpdump captures, but it was written for a business, so I cannot simply give it to you. If you don't find what you are looking for, this answer may still help. A tcpdump capture is just nested network data, like Matryoshka dolls, where the outermost pcap layer is added by tcpdump.
If you only want to work on the captures themselves, the format of a dump is specified in Libpcap File Format. To get the size and time of arrival of each packet, you need to process the dump according to this specification.
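As a rough sketch of what processing the dump by hand can look like, the Python below reads the classic libpcap layout: a 24-byte global header followed, for each packet, by a 16-byte record header (seconds, fractional seconds, captured length, original length) and the captured bytes. It does not handle pcapng. The function name `read_pcap_records` and the in-memory demo capture are invented for illustration.

```python
import io
import struct


def read_pcap_records(stream):
    """Yield (timestamp, captured_len, original_len, data) per packet
    from a classic libpcap stream opened in binary mode."""
    header = stream.read(24)
    if len(header) < 24:
        raise ValueError("truncated pcap global header")
    # The magic number reveals the byte order and the timestamp resolution.
    magics = {
        b"\xd4\xc3\xb2\xa1": ("<", 1_000_000),      # little-endian, microseconds
        b"\xa1\xb2\xc3\xd4": (">", 1_000_000),      # big-endian, microseconds
        b"\x4d\x3c\xb2\xa1": ("<", 1_000_000_000),  # little-endian, nanoseconds
        b"\xa1\xb2\x3c\x4d": (">", 1_000_000_000),  # big-endian, nanoseconds
    }
    if header[:4] not in magics:
        raise ValueError("not a classic pcap file")
    endian, divisor = magics[header[:4]]
    while True:
        rec = stream.read(16)
        if len(rec) < 16:
            return  # end of file
        ts_sec, ts_frac, incl_len, orig_len = struct.unpack(endian + "IIII", rec)
        data = stream.read(incl_len)
        yield ts_sec + ts_frac / divisor, incl_len, orig_len, data


# Build a tiny two-packet capture in memory (payload bytes are placeholders).
demo = struct.pack("<IHHiIII", 0xA1B2C3D4, 2, 4, 0, 0, 65535, 1)
demo += struct.pack("<IIII", 100, 250000, 4, 60) + b"\x00" * 4
demo += struct.pack("<IIII", 101, 0, 2, 2) + b"\x00" * 2

prev = None
for ts, incl_len, orig_len, _data in read_pcap_records(io.BytesIO(demo)):
    gap = None if prev is None else ts - prev
    print(orig_len, gap)  # size on the wire and time since the previous packet
    prev = ts
```

For a real file, the stream would come from `open(path, "rb")`. The original length is the size on the wire, while the captured length can be smaller if tcpdump was run with a snapshot length.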
If you have to go deeper in the analysis, these are the layers, in order:
- The link layer
- The internet layer
- The transport layer
- The application layer
Each layer has a header definition, so you need to find out which protocol stack your pcap data contains and parse the headers to get the information.
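To make the header walk concrete, here is a hedged sketch for one common stack only: Ethernet II, then IPv4, then TCP or UDP. It returns the 5-tuple or `None` for anything else. VLAN tags, IPv6 and other link types are deliberately not handled, and the function name `five_tuple` and the test frame are made up for illustration.

```python
import socket
import struct


def five_tuple(frame):
    """Return (src_ip, src_port, dst_ip, dst_port, proto) for an
    Ethernet II frame carrying IPv4 with TCP or UDP, else None."""
    if len(frame) < 14 + 20:
        return None
    # Link layer: 6-byte dst MAC, 6-byte src MAC, 2-byte EtherType.
    (ethertype,) = struct.unpack("!H", frame[12:14])
    if ethertype != 0x0800:  # not IPv4
        return None
    ip = frame[14:]
    # Internet layer: the low nibble of byte 0 is the header length in 32-bit words.
    ihl = (ip[0] & 0x0F) * 4
    (flags_frag,) = struct.unpack("!H", ip[6:8])
    proto = ip[9]
    if proto not in (6, 17):  # only TCP (6) and UDP (17) carry ports here
        return None
    if flags_frag & 0x1FFF:  # later fragments have no transport header
        return None
    src = socket.inet_ntoa(ip[12:16])
    dst = socket.inet_ntoa(ip[16:20])
    # Transport layer: TCP and UDP both start with source port, destination port.
    if ihl < 20 or len(ip) < ihl + 4:
        return None
    sport, dport = struct.unpack("!HH", ip[ihl:ihl + 4])
    return (src, sport, dst, dport, proto)


# Hand-built UDP frame used only to exercise the parser.
eth = b"\x00" * 12 + b"\x08\x00"
ip_hdr = struct.pack("!BBHHHBBH4s4s", 0x45, 0, 28, 0, 0, 64, 17, 0,
                     socket.inet_aton("192.168.0.1"),
                     socket.inet_aton("192.168.0.2"))
udp_hdr = struct.pack("!HHHH", 5000, 53, 8, 0)
print(five_tuple(eth + ip_hdr + udp_hdr))
```

The link type stored in the pcap global header says which link-layer header to expect, so a capture taken on something other than Ethernet would need a different first step.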
Answer 3:
What are the members of the 5-tuple? If the flows are TCP or UDP, the source and destination IP addresses and port numbers, plus, perhaps, a number to distinguish multiple flows over time between the two endpoints would work; for SCTP, it would be similar, although if a flow is a stream, you might need more.
If the members of the 5-tuple are all "named fields" in Wireshark, you could use TShark with the -T fields option, and use the -e option to specify which fields to print. Select a field with the time stamp (frame.time_epoch would give you the time as seconds and fractions of a second since the UN*X epoch), a field with the appropriate size (frame.len gives you the raw number of bytes in the link-layer packet PLUS any meta-data such as a radiotap header for 802.11 radio information), and the other fields, and then feed the output of TShark to a script or program that does the processing you want to do. That lets TShark do the processing of the protocol layers, so that your program only needs to process the resulting data.
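A sketch of the downstream script, assuming TShark was run along these lines (the file name capture.pcap is a placeholder, and the choice and order of fields is an assumption that the code depends on):

tshark -r capture.pcap -T fields -E occurrence=f -e frame.time_epoch -e frame.len -e ip.src -e ip.dst -e ip.proto -e tcp.srcport -e tcp.dstport -e udp.srcport -e udp.dstport

With -T fields the columns are tab-separated by default and a field that is absent from a packet is left empty. The function name `per_packet_gaps` and the sample lines are invented for illustration.

```python
def per_packet_gaps(lines):
    """Yield (flow, size, gap) for each IP packet line.

    gap is the time since the previous packet of the same flow, or None
    for the first packet of a flow. A flow is one direction of a 5-tuple.
    """
    last_seen = {}
    for line in lines:
        cols = line.rstrip("\r\n").split("\t")
        if len(cols) != 9 or not cols[2]:
            continue  # skip non-IP packets and malformed lines
        ts, size = float(cols[0]), int(cols[1])
        src, dst, proto = cols[2], cols[3], cols[4]
        sport = cols[5] or cols[7]  # TCP port if present, else UDP port
        dport = cols[6] or cols[8]
        flow = (src, sport, dst, dport, proto)
        gap = ts - last_seen[flow] if flow in last_seen else None
        last_seen[flow] = ts
        yield flow, size, gap


# Hypothetical TShark output lines in the column order given above.
sample = [
    "1.000000000\t74\t10.0.0.1\t10.0.0.2\t6\t40000\t80\t\t\n",
    "1.250000000\t1514\t10.0.0.1\t10.0.0.2\t6\t40000\t80\t\t\n",
    "1.500000000\t90\t10.0.0.3\t10.0.0.2\t17\t\t\t5353\t53\n",
    "2.000000000\t60\t\t\t\t\t\t\t\n",  # e.g. an ARP frame: no IP fields
]

for flow, size, gap in per_packet_gaps(sample):
    print(flow, size, gap)
```

To process a real capture, pipe the TShark output into the script and pass `sys.stdin` to `per_packet_gaps` instead of the sample list.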
Source: https://stackoverflow.com/questions/10207423/code-to-analyze-pcap-file