Handling and working with binary (hex) data in Python

终归单人心 2021-01-17 02:47

I'm trying to compare some byte values. Source A comes from a file that is being 'read':

f = open(fname, "rb")
f_data = f.read()
f.close()


        
2 answers
  • 2021-01-17 03:30

    I'm not sure exactly what you're trying to do, but I ran this code in Python 3.2.3.

    #f = open(fname, "rb")
    #f_data = f.read()
    #f.close()
    f_data = b'\x12\x43\xff\xd9\x00\x23'
    eof_markers = {
        'jpg':b'\xff\xd9',
        'pdf':b'\x25\x25\x45\x4f\x46',
        }
    
    for counter in range(-4, 0):
      for name, marker in eof_markers.items():
        print(counter, ('' if marker in f_data[counter:] else '!') + name)
    

    I'm using a hardcoded f_data, but you can undo that by uncommenting lines 1-3 and commenting out line 4.

    Here's the output:

    -4 !pdf
    -4 jpg
    -3 !pdf
    -3 !jpg
    -2 !pdf
    -2 !jpg
    -1 !pdf
    -1 !jpg
    

    Is there something this isn't doing that you need to do?
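
    For a real file, the same membership test can be run on just the last few bytes instead of the whole contents. This is only a sketch: read_tail, markers_in_tail and the temporary demo file are hypothetical names and data introduced for illustration, not part of the original question.

```python
import os
import tempfile

EOF_MARKERS = {
    'jpg': b'\xff\xd9',
    'pdf': b'\x25\x25\x45\x4f\x46',  # the ASCII text %%EOF
}


def read_tail(path, nbytes):
    """Return up to the last nbytes bytes of the file at path (hypothetical helper)."""
    with open(path, 'rb') as f:
        f.seek(0, os.SEEK_END)         # jump to the end to learn the file size
        size = f.tell()
        f.seek(max(0, size - nbytes))  # step back, but never before the start
        return f.read()


def markers_in_tail(path, nbytes=16):
    """Return the sorted names of the markers that occur in the file's tail."""
    tail = read_tail(path, nbytes)
    return sorted(name for name, marker in EOF_MARKERS.items() if marker in tail)


# Demo: a temporary file holding the same bytes as the hardcoded f_data.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b'\x12\x43\xff\xd9\x00\x23')
demo_found = markers_in_tail(tmp.name, nbytes=4)
os.remove(tmp.name)
print(demo_found)  # ['jpg']
```

    Opening the file in a with block also closes it automatically, so the explicit f.close() is not needed.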

  • 2021-01-17 03:50

    I can't figure out how to comment on your main post instead of making a subpost. Anyway, I have answers to some of your questions.

    • int(v) converts a formatted number (e.g. '599') to an integer, not a character (e.g. "!") to its integer value. For that you would want ord(). However, I see no reason to use either in this situation.

    • Hex != binary. Hex is just a numeric base. Binary data is raw byte values, which may not be printable depending on their value. That is why they show up as escape codes like "\xfd": that's how Python represents unprintable characters to you, as hex codes. However, they are still single characters with no special status, so they don't need conversion. It's perfectly valid to compare 'A' with '\xfd'. Hence, you should be able to do the comparison without any conversion at all.

    • Changing 'u' to 'b' will only have any real effect if you're running Python 3.x.
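
    The first two points can be checked with a few lines in a Python 3 interpreter. This is a small sketch with made-up sample values (data, parsed and the other names are only for illustration). One Python 3 detail worth knowing: indexing a bytes object gives an int, while slicing gives bytes.

```python
# int() parses the text of a number; ord() gives one character's code point.
parsed = int('599')
code = ord('!')
print(parsed, code)  # 599 33

data = b'A\xfd'  # sample bytes, assumed for illustration

# A slice of a bytes object is still bytes, so it compares directly
# with a bytes literal; no conversion is needed.
slice_matches = data[1:2] == b'\xfd'

# In Python 3, indexing a bytes object yields an int, so compare it
# against a number instead.
index_matches = data[1] == 0xfd

# The escape code is only how the byte is displayed; it is still one byte.
length = len(b'\xfd')
print(slice_matches, index_matches, length)  # True True 1
```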

    As for directly solving the problem, I feel that while it's clear what you want to do, it's not clear why you have chosen to do things in this way. To get a better answer, you will need to ask a clearer question.

    Here's an example of an alternative approach:

    # convert each EOF marker to a list of its individual bytes
    eof_markers = {k: list(v) for k, v in eof_markers.items()}
    
    # assuming that the bytes you have read in are being added to a list
    # (bytes_buffer), we can then check for the entire EOF string by:
    
    # outer loop reading the next byte from f, etc., omitted.
    for mname, marker in eof_markers.items():
        nmarkerbytes = len(marker)
        enoughbytes = len(bytes_buffer) >= nmarkerbytes
        # compare the last nmarkerbytes items of the buffer with the marker
        if enoughbytes and bytes_buffer[-nmarkerbytes:] == marker:
            location = f.tell()
            print('%s marker found at %d' % (mname, location))
    
    

    There are other, faster approaches using bytes or bytearray (for example, using the 'rfind' method), but this is the simplest approach to explain.
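
    As a rough sketch of the rfind idea, assuming the whole file has already been read into one bytes object (last_marker_offsets and sample are hypothetical names used only for this example):

```python
eof_markers = {
    'jpg': b'\xff\xd9',
    'pdf': b'\x25\x25\x45\x4f\x46',
}


def last_marker_offsets(data, markers):
    """Map each marker name to the offset of its last occurrence in data.

    Markers that do not occur in data are left out of the result.
    """
    offsets = {}
    for name, marker in markers.items():
        pos = data.rfind(marker)  # rfind returns -1 when the marker is absent
        if pos != -1:
            offsets[name] = pos
    return offsets


sample = b'\x12\x43\xff\xd9\x00\x23'
found = last_marker_offsets(sample, eof_markers)
print(found)  # {'jpg': 2}

# If the marker has to be the very last bytes, endswith is the direct test.
ends_with_jpg = sample.endswith(eof_markers['jpg'])
print(ends_with_jpg)  # False, because two more bytes follow the marker
```

    Both rfind and endswith work the same way on bytearray objects.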
