I am trying to find the position of the first Central Directory file header in a Zip file.
I'm reading these: http://en.wikipedia.org/wiki/Zip_(file_format) http://www.
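Start at the end of the file and scan towards the beginning, looking for the end of central directory signature (0x06054b50) and counting the number of bytes you have scanned. When you find a candidate, read the comment length (L) from the 2-byte field at offset 20 of the record. Check that L + 22 matches your current count, measured from the first byte of the signature to the end of the file: the fixed part of the record is 22 bytes and the comment follows it. Then check that the start of the central directory (pointed to by the 4-byte field at offset 16; the field at offset 12 holds the directory's size) has an appropriate signature.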
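The scan described above could be sketched in Python roughly as follows. This is only a minimal sketch, not a complete reader: the function name find_central_directory is made up for illustration, the whole file is assumed to fit in memory as bytes, and ZIP64 archives and archives with data prepended (e.g. self-extractors) are not handled.

```python
import struct

EOCD_SIG = b"PK\x05\x06"   # end of central directory record (0x06054b50)
CDFH_SIG = b"PK\x01\x02"   # central directory file header (0x02014b50)
EOCD_FIXED = 22            # size of the record without the trailing comment
MAX_COMMENT = 0xFFFF       # the comment length is a 16-bit field


def find_central_directory(data: bytes) -> int:
    """Return the offset of the start of the central directory.

    Hypothetical helper: raises ValueError if no plausible end of
    central directory record is found.
    """
    lowest = max(0, len(data) - EOCD_FIXED - MAX_COMMENT)
    pos = len(data) - EOCD_FIXED
    while pos >= lowest:
        if data[pos:pos + 4] == EOCD_SIG:
            # Offsets 10, 12, 16, 20: entry count, directory size,
            # directory offset, comment length (all little-endian).
            total, cd_size, cd_offset, comment_len = struct.unpack_from(
                "<HIIH", data, pos + 10)
            length_ok = comment_len + EOCD_FIXED == len(data) - pos
            range_ok = cd_offset + cd_size <= pos
            if length_ok and range_ok:
                # An empty archive has no header to check.
                if total == 0 or data[cd_offset:cd_offset + 4] == CDFH_SIG:
                    return cd_offset
        pos -= 1
    raise ValueError("no end of central directory record found")
```

For a file on disk, only the last 22 + 65535 bytes need to be read before calling this, provided the returned offset is then interpreted relative to the whole file rather than to that tail.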
If you assume the bytes are essentially random whenever a signature check is really a wild guess (e.g. the offset points into a data segment), the probability of all 32 signature bits matching is about 1 in 2^32. You could refine this by working out the chance of landing in a data segment versus the chance of hitting a legitimate header (as a function of the number of such headers), but that already sounds like a low likelihood to me. You can increase your confidence further by also checking the signature of the first file record listed, but be sure to handle the boundary case of an empty zip file, which has no file records at all.
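As a rough illustration of that extra check, here is a small Python sketch. It assumes the central directory offset and the entry count have already been read from the end of central directory record; the function name first_record_checks_out is hypothetical, and the empty-archive case is treated as trivially passing.

```python
import struct

CDFH_SIG = b"PK\x01\x02"   # central directory file header
LFH_SIG = b"PK\x03\x04"    # local file header
CDFH_FIXED = 46            # fixed part of a central directory header


def first_record_checks_out(data: bytes, cd_offset: int, total_entries: int) -> bool:
    """Follow the first central directory entry to its local file header."""
    if total_entries == 0:
        # Empty zip file: there is no first record to verify.
        return True
    if cd_offset + CDFH_FIXED > len(data):
        return False
    if data[cd_offset:cd_offset + 4] != CDFH_SIG:
        return False
    # Offset 42 of the central header holds the local header's file offset.
    (local_offset,) = struct.unpack_from("<I", data, cd_offset + 42)
    return data[local_offset:local_offset + 4] == LFH_SIG
```

Each additional 32-bit signature that matches makes an accidental hit correspondingly less likely under the same randomness assumption.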