Pyshark: can only get the first field value when the same key name (field name) has multiple entries with different values

好久不见. Posted on 2019-12-23 18:11:10

Question


I am using Pyshark to parse a Wireshark sniffer log, and I use the exported JSON file (based on a pcapng file) to find the field names to pass to the 'get_field_value' function when retrieving field values.

For example, in order to get BSSID value:

  • In the JSON file, this info is displayed as:

    "wlan.bssid": "11:22:33:44:55:66"

  • Then I can use:

    value = packet['wlan'].get_field_value('bssid')

  • The result is as expected:

    value == '11:22:33:44:55:66'

  • This case works fine.

But I run into an issue when I move to the 'wlan_mgt' section of a beacon packet, as in the example below.

  • In the JSON file, it shows:

      "wlan_mgt.tagged.all": {
        "wlan_mgt.tag": {
          "wlan_mgt.tag.number": "0",
          "wlan_mgt.tag.length": "5",
          "wlan_mgt.ssid": "MWIFI"
        },
        "wlan_mgt.tag": {
          "wlan_mgt.tag.number": "1",
          "wlan_mgt.tag.length": "6",
          "wlan_mgt.supported_rates": "24",
          "wlan_mgt.supported_rates": "164",
          "wlan_mgt.supported_rates": "48",
          "wlan_mgt.supported_rates": "72",
          "wlan_mgt.supported_rates": "96",
          "wlan_mgt.supported_rates": "108"
        },
        "wlan_mgt.tag": {
          "wlan_mgt.tag.number": "5",
          "wlan_mgt.tag.length": "7",
          "wlan_mgt.tim.dtim_count": "0",
          "wlan_mgt.tim.dtim_period": "1",
          "wlan_mgt.tim.bmapctl": "0x00000000",
          "wlan_mgt.tim.bmapctl_tree": {
            "wlan_mgt.tim.bmapctl.multicast": "0",
            "wlan_mgt.tim.bmapctl.offset": "0x00000000"
          },
          "wlan_mgt.tim.partial_virtual_bitmap": "00:10:00:00",
          "wlan.tim.aid": "0x0000000c"
        },

As we can see, there are multiple entries for "wlan_mgt.supported_rates": the field name (key) is the same, but the value of each entry is different, and I need to get all of them.

  • If I use:

    value = packet['wlan_mgt'].get_field_value('supported_rates')

  • Then it only gives me the value '24', which is the value of the first entry. I have no idea how to retrieve the other entries' values, since the key name is the same.

Shouldn't it return a list of all values, like ['24', '164', '48', '72', '96', '108'], rather than only the first entry's value? In the sniffer log (JSON format) there are many other entries that share a field name, for example 'wlan_mgt.tag.number', but have different values, so this issue is a blocker for me.

Please advise how to get all the data. Thanks a lot in advance!

BR,
Alex


Answer 1:


First off, you don't have to use item access and get_field_value to get field values. So instead of

    value = packet['wlan_mgt'].get_field_value('supported_rates')

you can use:

    value = packet.wlan_mgt.supported_rates

To get the tags of a WiFi packet in JSON mode, you can use packet.wlan_mgt.tagged.all.tag. That gives you a list of all tags, which you can filter in Python to find only the supported rates tag. I was planning on making an extension specifically for WiFi stuff like this, since it's cumbersome, but I haven't had the chance yet. If you look at the field in Wireshark you can see the category is tagged.all.

Also, when looking for fields and the like, I recommend using an interpreter with autocomplete (such as IPython) so you can see which fields are available, or just use packet_layer.field_names to list all available fields.
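The filtering step mentioned above could look roughly like the sketch below. It is not taken from pyshark itself: find_tags is a hypothetical helper, and it assumes that each element of the tag list exposes its tag number as a `number` attribute, which may not hold for every pyshark version. In the JSON shown in the question, the supported rates tag has tag number 1.

```python
def find_tags(tags, number):
    """Return the tags whose tag number equals `number`.

    Assumption: every tag object exposes a `number` attribute
    (for example parsed from "wlan_mgt.tag.number"). Tags without
    that attribute are skipped.
    """
    wanted = str(number)
    matches = []
    for tag in tags:
        tag_number = getattr(tag, "number", None)
        # Compare as strings, because parsed field values are text.
        if tag_number is not None and str(tag_number) == wanted:
            matches.append(tag)
    return matches


# Hypothetical usage with a packet read in JSON mode:
#   rate_tags = find_tags(packet.wlan_mgt.tagged.all.tag, 1)
```

Comparing as strings avoids a mismatch between the integer 1 and the text "1" that appears in the export.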




Answer 2:


I faced a similar problem: I was checking the option_len field and only got one value instead of an array, and I could not find an easy answer. The solution I finally used was to access the alternative fields stored inside the field, as in the following code:

    ol_arr = []
    # Walk every field of the TCP layer of the fourth packet.
    for x in cap[3].tcp._all_fields.values():
        if x.name == 'tcp.option_len':
            print(x.all_fields)
            # all_fields holds one entry per occurrence of the field.
            for k in x.all_fields:
                print(k.get_default_value())
                ol_arr.append(k.get_default_value())
            break
    print(ol_arr)

I hope that this helps
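The same loop can be wrapped in a small reusable function. This is only a sketch: all_field_values is a hypothetical name, and it relies on the same attributes used in the snippet above (_all_fields, name, all_fields, get_default_value), where _all_fields is a private attribute that may change between pyshark releases.

```python
def all_field_values(layer, full_name):
    """Return every value recorded for the field `full_name` in `layer`.

    `layer` is expected to behave like a pyshark layer as used in the
    answer above; an empty list is returned when the field is absent.
    """
    for field in layer._all_fields.values():
        if field.name == full_name:
            # One entry per occurrence of the field in the packet.
            return [alt.get_default_value() for alt in field.all_fields]
    return []


# Hypothetical usage, mirroring the loop above:
#   option_lengths = all_field_values(cap[3].tcp, 'tcp.option_len')
```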




Answer 3:


This is a serious problem, and it shows up in other Wireshark tools as well.

For example, when using tshark to read a pcap file:

    tshark -r some_file.pcap -T json

it also returns JSON that contains duplicate keys.

This was also reported on Wireshark-dev and someone wrote a fix, but the code has not been merged yet.

You can work around it with this code:

    import json

    def parse_object_pairs(pairs):
        """
        Take a list of (key, value) tuples and check it for duplicate keys.
        If there are duplicates, return the list of pairs itself;
        otherwise return a dict built from the pairs.

        >>> parse_object_pairs([("color", "red"), ("size", 3)])
        {'color': 'red', 'size': 3}

        >>> parse_object_pairs([("color", "red"), ("size", 3), ("color", "blue")])
        [('color', 'red'), ('size', 3), ('color', 'blue')]

        :param pairs: list of tuples.
        :return: dict or list that contains the pairs.
        """
        dict_without_duplicate = dict()
        for k, v in pairs:
            if k in dict_without_duplicate:
                # Duplicate key found: keep the raw pairs so nothing is lost.
                return pairs
            else:
                dict_without_duplicate[k] = v

        return dict_without_duplicate

    decoder = json.JSONDecoder(object_pairs_hook=parse_object_pairs)

    str_json_can_be_with_duplicate_keys = '{"color": "red", "size": 3, "color": "red"}'

    # "color" appears twice, so the result is a list of pairs, not a dict.
    data_after_decode = decoder.decode(str_json_can_be_with_duplicate_keys)
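A variation on the same idea, closer to what the question asks for, is to gather the values of a repeated key into a list instead of returning the raw pairs. The sketch below uses only the standard json module; merge_duplicate_keys is a hypothetical name, and the sample is the supported rates tag copied from the question.

```python
import json


def merge_duplicate_keys(pairs):
    """Build a dict from (key, value) pairs; repeated keys get a list of values."""
    merged = {}
    repeated = set()  # keys already turned into a list of values
    for key, value in pairs:
        if key not in merged:
            merged[key] = value
        elif key in repeated:
            merged[key].append(value)
        else:
            # Second occurrence: start a list with the first value.
            merged[key] = [merged[key], value]
            repeated.add(key)
    return merged


sample = '''
{
  "wlan_mgt.tag": {
    "wlan_mgt.tag.number": "1",
    "wlan_mgt.tag.length": "6",
    "wlan_mgt.supported_rates": "24",
    "wlan_mgt.supported_rates": "164",
    "wlan_mgt.supported_rates": "48",
    "wlan_mgt.supported_rates": "72",
    "wlan_mgt.supported_rates": "96",
    "wlan_mgt.supported_rates": "108"
  }
}
'''

tag = json.loads(sample, object_pairs_hook=merge_duplicate_keys)["wlan_mgt.tag"]
rates = tag["wlan_mgt.supported_rates"]
print(rates)  # ['24', '164', '48', '72', '96', '108']
```

Tracking repeated keys in a separate set keeps a value that is already a JSON array from being mistaken for a list of merged duplicates.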


Source: https://stackoverflow.com/questions/43670808/pyshark-can-only-get-first-field-value-if-same-key-name-field-name-show-multi
