I'll give you a little bit of background first as to why I'm asking this question:
I am currently working in a strictly regulated industry and as such our code is quite
Take a look at the answers to this question, especially the external link provided in the third one.
EDIT:
I actually wanted to link to this article.
Update: Roslyn seems to have a /feature:deterministic compiler flag for reproducible builds, although it's not 100% working yet.
You should be able to get rid of the debug GUID by disabling PDB generation. If not, setting the GUID to zeroes is fine - only debuggers look at that section (you won't be able to debug the assembly anymore, but it should still run fine).
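As a rough illustration of zeroing the GUID, here is a minimal Python sketch. It assumes the debug information is a CodeView record in the "RSDS" layout (4-byte signature, 16-byte GUID, 4-byte age, then the PDB path), and it locates the record with a naive byte search, which could in principle match unrelated data. A more careful tool would follow the PE debug directory instead. The function name zero_codeview_guid is made up for this example.

```python
def zero_codeview_guid(image: bytes) -> bytes:
    """Return a copy of the image with the GUID of the first RSDS record zeroed.

    Assumption: the first occurrence of b"RSDS" really is the CodeView
    record. This is a shortcut for illustration, not a robust lookup.
    """
    marker = image.find(b"RSDS")
    if marker < 0:
        return image  # no CodeView record found, nothing to patch

    patched = bytearray(image)
    # The 16-byte GUID sits directly after the 4-byte "RSDS" signature.
    patched[marker + 4:marker + 20] = bytes(16)
    return bytes(patched)
```

The file length and every other byte stay unchanged, so nothing else in the image has to be adjusted.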
The PrivateImplementationDetails are a bit more difficult - these are internal helper classes generated by the compiler for certain language constructs (array initializers, switch statements using strings, etc.). Because they are only used internally, the class name doesn't really matter, so you could just assign a running number to them.
I would do this by going through the #Strings metadata stream and replacing all strings of the form "<PrivateImplementationDetails>{GUID}" with "<PrivateImplementationDetails>{running number, padded to same length as a GUID}".
The #Strings metadata stream is simply the list of strings used by the metadata, encoded in UTF-8 and separated by \0; so finding and replacing the names should be easy once you know where the #Strings stream is inside the executable file.
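The renaming itself could look roughly like the Python sketch below. It assumes you already have the raw bytes of the #Strings stream, that the GUID appears in the usual 8-4-4-4-12 hex form inside braces, and that a zero-padded counter of the same length is an acceptable replacement. Keeping the length identical matters because the metadata tables refer to strings by their offset into this stream. The function name is hypothetical.

```python
import re

# "<PrivateImplementationDetails>{GUID}" as raw bytes; the GUID text is 36 characters.
PID_NAME = re.compile(
    rb"<PrivateImplementationDetails>\{"
    rb"[0-9A-Fa-f]{8}(?:-[0-9A-Fa-f]{4}){3}-[0-9A-Fa-f]{12}\}"
)


def normalize_private_impl_names(strings_heap: bytes) -> bytes:
    """Replace each distinct GUID-suffixed name with a same-length running number."""
    assigned = {}  # original name -> replacement, so repeated names map consistently

    def replace(match):
        original = match.group(0)
        if original not in assigned:
            # Running number, zero-padded to the 36 characters a GUID occupies.
            number = str(len(assigned) + 1).zfill(36).encode("ascii")
            assigned[original] = b"<PrivateImplementationDetails>{" + number + b"}"
        return assigned[original]

    return PID_NAME.sub(replace, strings_heap)
```

The result has to be written back at the same file position the stream was read from.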
Unfortunately the "metadata stream headers" containing this information are quite buried inside the file format. You'll have to start at the NT Optional Header, find the pointer to the CLI Runtime Header, resolve it to a file position using the PE section table (it's an RVA, but you need a position inside the file), then go to the metadata root and read the stream headers.
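To make that walk concrete, here is a hedged Python sketch of the lookup. It assumes a well-formed PE32 or PE32+ image, takes the CLI header from data directory entry 14, and does no validation beyond the "PE" and "BSJB" signatures. The function names are my own, not part of any library.

```python
import struct


def rva_to_offset(data, sections_start, section_count, rva):
    """Translate an RVA to a file position using the PE section table."""
    for i in range(section_count):
        entry = sections_start + 40 * i  # each section header is 40 bytes
        virt_size, virt_addr, raw_size, raw_ptr = struct.unpack_from("<IIII", data, entry + 8)
        if virt_addr <= rva < virt_addr + max(virt_size, raw_size):
            return rva - virt_addr + raw_ptr
    raise ValueError("RVA 0x%X is not inside any section" % rva)


def find_strings_stream(data):
    """Return (file_offset, size) of the #Strings stream in a managed PE image."""
    pe = struct.unpack_from("<I", data, 0x3C)[0]  # e_lfanew in the DOS header
    if data[pe:pe + 4] != b"PE\0\0":
        raise ValueError("not a PE file")
    section_count = struct.unpack_from("<H", data, pe + 6)[0]
    opt_size = struct.unpack_from("<H", data, pe + 20)[0]
    opt = pe + 24  # NT Optional Header
    magic = struct.unpack_from("<H", data, opt)[0]
    directories = opt + (112 if magic == 0x20B else 96)  # PE32+ vs. PE32
    cli_rva = struct.unpack_from("<I", data, directories + 14 * 8)[0]
    if cli_rva == 0:
        raise ValueError("no CLI Runtime Header, not a managed assembly")
    sections = opt + opt_size

    cli = rva_to_offset(data, sections, section_count, cli_rva)
    meta_rva = struct.unpack_from("<I", data, cli + 8)[0]  # MetaData directory RVA
    root = rva_to_offset(data, sections, section_count, meta_rva)
    if data[root:root + 4] != b"BSJB":
        raise ValueError("metadata root signature not found")

    version_len = struct.unpack_from("<I", data, root + 12)[0]
    pos = root + 16 + version_len  # flags (2 bytes), then stream count (2 bytes)
    stream_count = struct.unpack_from("<H", data, pos + 2)[0]
    pos += 4
    for _ in range(stream_count):
        offset, size = struct.unpack_from("<II", data, pos)
        name_end = data.index(b"\0", pos + 8)
        if data[pos + 8:name_end] == b"#Strings":
            return root + offset, size  # stream offsets are relative to the root
        # The name is null-terminated and padded to a 4-byte boundary.
        pos += 8 + ((name_end - (pos + 8) + 1 + 3) & ~3)
    raise ValueError("#Strings stream not found")
```

With the file contents read into data, the stream bytes are then data[offset:offset + size].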
I'm not sure about this, but just a thought: are you using any anonymous types for which the compiler might generate names behind the scenes, which might be different each time the compiler runs? Just a possibility which occurred to me. Probably one for Jon Skeet ;-)
Update: You could perhaps also use Reflector addins for comparison and disassembly.
Use ildasm.exe to fully disassemble both programs and compare the IL. Then you can "clean" the code using text-based methods and (predictably) recompile it again.
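A text-based cleanup and comparison might look like the Python sketch below. It assumes both assemblies have already been dumped to IL text, and that the only lines differing between builds are GUIDs in braces (the MVID comment, the PrivateImplementationDetails names) and the "// Image base:" comment. Check that assumption against your own dumps. The function names are hypothetical.

```python
import difflib
import re

# Any GUID written in braces, e.g. the MVID comment or generated class names.
GUID = re.compile(r"\{[0-9A-Fa-f]{8}(?:-[0-9A-Fa-f]{4}){3}-[0-9A-Fa-f]{12}\}")


def normalize_il(text):
    """Return the IL dump as a list of lines with build-specific noise removed."""
    lines = []
    for line in text.splitlines():
        if line.lstrip().startswith("// Image base:"):
            continue  # assumed to vary between ildasm runs
        lines.append(GUID.sub("{GUID}", line).rstrip())
    return lines


def il_diff(il_a, il_b):
    """Unified diff of two normalized IL dumps; an empty list means no difference."""
    return list(difflib.unified_diff(
        normalize_il(il_a), normalize_il(il_b),
        "build_a.il", "build_b.il", lineterm="",
    ))
```

An empty result only says the normalized IL text matches. It does not prove the two binaries are byte-identical.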
You said that after a few project tweaks you were able to get C++ apps to compile repeatably to the same SHA1/MD5 values. I'm in the same boat as you in being in an industry with a third party test lab that needs to rebuild exactly the same executables repeatably.
In researching how to make this happen in VS2005, I came across your post here. Could you share the project tweaks you made to get the C++ apps to build to the same SHA1/MD5 values consistently? It would be of great help to me and perhaps to others who share this requirement.
Regarding the PDB GUID problem, if you specify that a PDB shouldn't be generated at compilation for Release builds, does the binary still contain the PDB's file system GUID?
To disable PDB generation:
If you're building from the console, use /debug- to get the same result.