Deterministic builds under Windows

北恋 2020-12-02 11:11

The ultimate goal is to compare two binaries built from exactly the same source in exactly the same environment and to be able to tell that they are indeed functionally equivalent.

4 Answers
  • 2020-12-02 11:54

    Standardise Build Paths

    A simple solution would be to standardise on your build paths, so they are always of the form, for example:

    c:\buildXXXX
    

    Then, when you compare, say, build0434 to build0398, just preprocess the binary to change all occurrences of build0434 to build0398. Choose a pattern you know is unlikely to show up in your actual source or data, except in those strings the compiler or linker embeds into the PE.

    Then you can just do your normal difference analysis. By using the same length pathnames, you won't shift any data around and cause false positives.
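
    The preprocessing step could be sketched in Python as below. This is a minimal sketch, not a finished tool: it assumes the build folders are named "build" plus exactly four digits, and the function name and the canonical name build0000 are made up for illustration. It rewrites both narrow and UTF-16 occurrences, because the path can be embedded in either form.

```python
import re

# Assumed naming scheme: "build" followed by exactly four ASCII digits.
NARROW = re.compile(rb"build\d{4}")
# The same token as a UTF-16LE string (each ASCII byte followed by 0x00).
WIDE = re.compile(rb"b\x00u\x00i\x00l\x00d\x00(?:\d\x00){4}")


def normalise_build_dir(data: bytes, canonical: str = "build0000") -> bytes:
    """Rewrite every buildNNNN token to one canonical name of equal length.

    Because the replacement has the same length as the original token,
    no bytes shift and the file offsets stay comparable.
    """
    if not re.fullmatch(r"build\d{4}", canonical, flags=re.ASCII):
        raise ValueError("canonical name must look like buildNNNN")
    narrow = canonical.encode("ascii")
    wide = canonical.encode("utf-16-le")
    # Callables are used as replacements so the bytes are inserted verbatim.
    data = NARROW.sub(lambda match: narrow, data)
    return WIDE.sub(lambda match: wide, data)
```

    Running both binaries through the function before comparing them removes the path differences without changing either file's length.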

    Dumpbin utility

    Another tip is to use dumpbin.exe (ships with MSVC). Use dumpbin /all to dump all details of a binary to a text/hex dump. This makes it easier to see what is changing and where.

    For example:

    dumpbin /all program1.exe > program1.txt
    dumpbin /all program2.exe > program2.txt
    windiff program1.txt program2.txt
    

    Or use your favourite text diffing tool, instead of Windiff.
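
    If no diff tool is at hand, the two dumps can also be compared with a few lines of Python. The sketch below assumes the dumps were already written to program1.txt and program2.txt as in the commands above; the function name is made up for illustration.

```python
import difflib


def changed_lines(old_text: str, new_text: str) -> list[str]:
    """Return only the lines that differ between two dumpbin text dumps.

    Removed lines are prefixed with "-", added lines with "+".
    """
    diff = list(
        difflib.unified_diff(
            old_text.splitlines(),
            new_text.splitlines(),
            fromfile="program1.txt",
            tofile="program2.txt",
            lineterm="",
            n=0,  # no context lines, only the changes themselves
        )
    )
    # The first two entries are the "---"/"+++" file headers; "@@" lines
    # are hunk markers. Everything else is an actual change.
    return [line for line in diff[2:] if not line.startswith("@@")]


# Example use (reads the files produced by the dumpbin commands above):
#   from pathlib import Path
#   old = Path("program1.txt").read_text(errors="replace")
#   new = Path("program2.txt").read_text(errors="replace")
#   print("\n".join(changed_lines(old, new)))
```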

    Bindiff utility

    You may find Microsoft's bindiff.exe tool useful, which can be obtained here:

    Windows XP Service Pack 2 Support Tools

    It has a /v option, to instruct it to ignore certain binary fields, such as timestamps, checksums, etc.:

    "BinDiff uses a special compare routine for Win32 executable files that masks out various build time stamp fields in both files when performing the compare. This allows two executable files to be marked as "Near Identical" when the files are truely identical, except for the time they were built."

    However, it sounds like you may be already doing a superset of what bindiff.exe does.

  • 2020-12-02 12:12

    I solved this to an extent.

    Currently we have a build system that makes sure all new builds are at paths of constant length (builds/001, builds/002, etc.), thus avoiding shifts in the PE layout. After each build, a tool compares the old and new binaries, ignoring relevant PE fields and other locations with known superficial changes. It also runs some simple heuristics to detect dynamic ignorable changes. Here is the full list of things to ignore:

    • PE timestamp and checksum
    • Digital signature directory entry
    • Export table timestamp
    • Debugger section timestamp
    • PDB signature, age and file path
    • Resources timestamp
    • All file/product versions in VS_VERSION_INFO resource
    • Digital signature section
    • MIDL vanity stub for embedded type libraries (contains timestamp string)
    • __FILE__, __DATE__ and __TIME__ macros when they are used as literal strings (can be wide or narrow char)
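
    To illustrate the first item in the list, the PE timestamp and checksum sit at fixed positions relative to the PE signature, so they can be zeroed before a byte-for-byte comparison. The Python sketch below is not the tool described in this answer; it only shows the idea for those two fields, and the function name is made up. The offsets follow the published PE/COFF layout.

```python
import struct


def mask_pe_header_fields(image: bytes) -> bytes:
    """Zero the COFF TimeDateStamp and the optional-header CheckSum."""
    buf = bytearray(image)
    if buf[:2] != b"MZ":
        raise ValueError("not an MZ executable")
    # e_lfanew: offset of the "PE\0\0" signature, stored at 0x3C.
    (pe_offset,) = struct.unpack_from("<I", buf, 0x3C)
    if buf[pe_offset:pe_offset + 4] != b"PE\x00\x00":
        raise ValueError("PE signature not found")
    coff = pe_offset + 4
    # COFF header: Machine(2), NumberOfSections(2), TimeDateStamp(4), ...
    buf[coff + 4:coff + 8] = b"\x00" * 4
    # SizeOfOptionalHeader is at offset 16 of the 20-byte COFF header.
    (optional_size,) = struct.unpack_from("<H", buf, coff + 16)
    optional = coff + 20
    # CheckSum is at offset 64 of the optional header for PE32 and PE32+.
    if optional_size >= 68:
        buf[optional + 64:optional + 68] = b"\x00" * 4
    return bytes(buf)
```

    Two builds can then be compared with mask_pe_header_fields(a) == mask_pe_header_fields(b). The remaining items in the list need the same treatment at their own locations.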

    Once in a while the linker makes some PE sections bigger without throwing anything else out of alignment. It appears to move the section boundary within the padding, which is all zeros anyway, but as a result I get binaries that differ by 1 byte.

    UPDATE: we recently open-sourced the tool on GitHub. See the Compare section in the documentation.

  • 2020-12-02 12:14

    Have you tried disassembling the executable and comparing the disassembly? That should remove a lot of the distracting details you mention, and make removing others a lot easier.

  • 2020-12-02 12:14

    Is there a way to either force compiler to use relative paths, or to fool it into thinking the path is not what it is?

    You have two ways to do this:

    1. Use the subst.exe command and map a drive letter to the build folder (this may not be reliable).
    2. If subst.exe doesn't work, then create shares for each of your build folders and use the "net use" command. This one almost certainly should work.

    In either case, you're going to map and reuse the same drive letter for a folder before you start a particular build, so that the path appears identical to the compiler.
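
    The map, build, unmap sequence for the subst.exe variant could be wrapped like this in a Python build script. It is only a sketch: the drive letter, the folder and the function name are hypothetical, and the run parameter exists so the commands can be checked without actually executing them.

```python
import subprocess
from contextlib import contextmanager


@contextmanager
def mapped_drive(letter: str, folder: str, run=subprocess.run):
    """Map `folder` to a drive letter with subst for the duration of a build."""
    drive = f"{letter}:"
    run(["subst", drive, folder], check=True)  # subst B: C:\builds\0434
    try:
        yield drive + "\\"
    finally:
        run(["subst", drive, "/D"], check=True)  # remove the mapping again


# Example use (hypothetical folder and build command):
#   with mapped_drive("B", r"C:\builds\0434") as root:
#       subprocess.run(["nmake"], cwd=root, check=True)
```

    Because the mapping is removed in the finally clause, the same drive letter is free again for the next build even when the build fails.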
