Find HEX value in file and grep the following value

Posted by 大城市里の小女人 on 2019-12-11 08:58:59

Question


I have a 2GB file in raw format. I want to find every occurrence of a specific HEX value, "355A3C2F74696D653E", and collect the 28 characters that follow it.

Example: 355A3C2F74696D653E323031312D30342D32365431343A34373A30322D31343A34373A3135

In this case I want the output: "323031312D30342D32365431343A34373A30322D31343A34373A3135" or better: 2011-04-26T14:47:02-14:47:15
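For reference, the hex strings above decode to plain ASCII. A minimal Python sketch of that decoding (the names NEEDLE_HEX, PAYLOAD_HEX and hex_to_text are made up for illustration):

```python
# Hex strings copied from the question.
NEEDLE_HEX = "355A3C2F74696D653E"
PAYLOAD_HEX = "323031312D30342D32365431343A34373A30322D31343A34373A3135"


def hex_to_text(hex_string):
    """Decode a string of hex digits into ASCII text."""
    return bytes.fromhex(hex_string).decode("ascii")


print(hex_to_text(NEEDLE_HEX))   # the marker being searched for
print(hex_to_text(PAYLOAD_HEX))  # the 28 characters that follow it
```

So the marker is the ASCII text 5Z</time>, and the 56 hex digits after it are the 28 characters of the timestamp.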

I have tried with

xxd -u InputFile | grep '355A3C2F74696D653E' | cut -c 1-28 > OutputFile.txt

and

xxd -u -ps -c 4000000 InputFile | grep '355A3C2F74696D653E' | cut -b 1-28 > OutputFile.txt

But I can't get it working.

Can anybody give me a hint?


Answer 1:


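If your grep supports the -P option (Perl-compatible regular expressions), you can use the command below. The \K discards everything matched so far, so -o prints only the characters after the marker.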

$ echo '355A3C2F74696D653E323031312D30342D32365431343A34373A30322D31343A34373A3135' | grep -oP '355A3C2F74696D653E\K.{28}'
323031312D30342D32365431343A

Each byte is two hex digits, so the 28 characters you want are 56 hex digits:

$ echo '355A3C2F74696D653E323031312D30342D32365431343A34373A30322D31343A34373A3135' | grep -oP '355A3C2F74696D653E\K.{56}'
323031312D30342D32365431343A34373A30322D31343A34373A3135
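One caveat when searching a hex dump as text: the pattern can also match starting on an odd hex digit, that is, in the middle of a byte, which is a false positive. Here is a rough Python sketch that only accepts matches on byte boundaries. It assumes the dump is one continuous string of hex digits with no line breaks, and find_after_hex is a hypothetical helper name, not part of any tool.

```python
def find_after_hex(hex_dump, needle_hex, n_bytes):
    """Return the hex digits of the n_bytes bytes after each byte-aligned match."""
    results = []
    pos = hex_dump.find(needle_hex)
    while pos >= 0:
        # An even offset means the match starts on a byte boundary.
        if pos % 2 == 0:
            start = pos + len(needle_hex)
            results.append(hex_dump[start:start + 2 * n_bytes])
        # Step one digit forward so no candidate position is skipped.
        pos = hex_dump.find(needle_hex, pos + 1)
    return results


needle = "355A3C2F74696D653E"
dump = needle + "323031312D30342D32365431343A34373A30322D31343A34373A3135"
print(find_after_hex(dump, needle, 28))
```

For a 2GB input the hex dump is about 4GB of text, so this is meant to illustrate the alignment check rather than to be run on the whole file at once.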



Answer 2:


As you are using xxd it seems to me that you want to search the file as if it were binary data. I'd recommend using a more powerful programming language for this; the Unix shell tools assume there are line endings and that the text is mostly 7-bit ASCII. Consider using Python:

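#!/usr/bin/env python3
import mmap

# The bytes 35 5A 3C 2F 74 69 6D 65 3E, i.e. b"5Z</time>".
# This must be a bytes literal: in Python 3, mmap.find() does not accept str.
needle = b"\x35\x5A\x3C\x2F\x74\x69\x6D\x65\x3E"

with open("file_to_search", "rb") as fd:
    # Map the file read-only so it can be searched without reading it all in.
    haystack = mmap.mmap(fd.fileno(), length=0, access=mmap.ACCESS_READ)
    i = haystack.find(needle)
    while i >= 0:
        i += len(needle)  # skip past the marker
        print(haystack[i : i + 28].decode("ascii", errors="replace"))
        i = haystack.find(needle, i)  # look for the next occurrence
    haystack.close()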
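The same search can also be written with the re module, which accepts a memory-mapped file as its input. This is only a sketch under a few assumptions: the function name values_after and its parameters are mine, and the marker is given as the ASCII bytes 5Z</time> rather than as hex escapes.

```python
import mmap
import re

NEEDLE = b"5Z</time>"  # same bytes as 35 5A 3C 2F 74 69 6D 65 3E


def values_after(path, needle=NEEDLE, count=28):
    """Return the `count` bytes following each occurrence of needle in the file."""
    # re.escape keeps characters such as "/" and "<" literal; DOTALL lets "."
    # match newline bytes too, which matters for binary data.
    pattern = re.compile(re.escape(needle) + b"(.{%d})" % count, re.DOTALL)
    with open(path, "rb") as fd:
        with mmap.mmap(fd.fileno(), 0, access=mmap.ACCESS_READ) as haystack:
            return [m.group(1) for m in pattern.finditer(haystack)]
```

Calling values_after("file_to_search") would return a list of bytes objects. Note that this behaves slightly differently from the loop above: a marker that starts inside the 28 bytes of a previous match is skipped, and a match with fewer than 28 bytes left before the end of the file is not reported.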



Answer 3:


Why convert to hex first? See if this awk script works for you. It looks for the string you want to match on, then prints the next 28 characters. Special characters are escaped with a backslash in the pattern.

Adapted from this post: Grep characters before and after match?

I added some blank lines for readability.

VirtualBox:~$ cat data.dat

Thisis a test of somerandom characters before thestringI want5Z</time>2011-04-26T14:47:02-14:47:15plus somemoredata

VirtualBox:~$ cat test.sh

awk '/5Z<\/time>/ {
  # < and > are left unescaped: in gawk, \< and \> mean word boundaries.
  match($0, /5Z<\/time>/); print substr($0, RSTART + RLENGTH, 28);
}' data.dat

VirtualBox:~$ ./test.sh

2011-04-26T14:47:02-14:47:15

VirtualBox:~$ 

EDIT: I just realized something. The regular expression may need tweaking (to be non-greedy, etc.), and the awk script needs to be changed to handle multiple occurrences on a line, as you need. Perhaps someone more up to date on awk can chime in with improvements, as I am quite rusty. It is an approach to consider, anyway.
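For what it's worth, here is how the "multiple occurrences on one line" part could look in Python instead of awk. It is a sketch, not a drop-in replacement for the script above: values_in_line is a made-up helper, and it uses plain string search, so there is no regular expression to escape or make non-greedy.

```python
MARKER = "5Z</time>"


def values_in_line(line, marker=MARKER, count=28):
    """Return the `count` characters after every occurrence of marker in line."""
    values = []
    start = line.find(marker)
    while start != -1:
        begin = start + len(marker)
        values.append(line[begin:begin + count])
        # Resume searching right after this marker.
        start = line.find(marker, begin)
    return values


sample = ("Thisis a test of somerandom characters before thestringI want"
          "5Z</time>2011-04-26T14:47:02-14:47:15plus somemoredata")
for value in values_in_line(sample):
    print(value)
```

Run on each line of data.dat, this would print one value per occurrence rather than only the first one on the line.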



Source: https://stackoverflow.com/questions/29972507/find-hex-value-in-file-and-grep-the-following-value
