For a data recovery program I need to be able to extract the values and types from files written by NSArchiver, without having access to Apple's CF / NS frameworks.
The OS
It seems to be part of the GNU Objective-C runtime, even though it's not exactly runtime stuff (see the discussion at http://gcc.gnu.org/ml/gcc-patches/2010-09/msg00495.html).
This file may implement that stuff: https://github.com/gnustep/libobjc/blob/master/archive.c
While I don't know of any documentation of the format, you may find the information you are looking for by checking the public source code from older Darwin (or maybe OpenStep) versions.
For example, have a look at the implementation of typedstream in the file typedstream.m in objc-1.tar.gz, available at this mirror of an old Darwin distribution. This source code should be able to read/write typedstream. Just be sure to conform to Apple's license when using it.
Part of the issue here is that each class in Cocoa/NeXTSTEP/OPENSTEP knows how to archive itself. Each class has an initWithCoder:/encodeWithCoder: method pair, and inside them there is one section for typedstream and another section for keyed archives. Keyed archives are more modern and are usually expressed as XML plists. These can be encoded in binary form, but, make no mistake, this binary form is NOT the same as a typedstream archive. Further, they are keyed so that it's easy to pull out individual pieces of data without having to read all of the data which came before. Typedstream archives don't work this way. They are order-based, which means that each element in each object is written one after the other: first the class name, then the version, then each of the pieces of data. The reason GNUstep never implemented this is that the order of encoding is nearly impossible to discover.
When you archive the root object of an object graph it calls the encodeWithCoder: method on that object which in turn calls the encodeWithCoder: methods on each of the objects it contains and so on recursively until the entire object graph is archived. When this is done using keyed archives (NSKeyedArchiver) the archive is built and keyed appropriately. When it is done with a typed stream archive (NSArchiver) the same recursion happens but each time an object is encoded it just dumps each element out into the archive in whatever order the developer deemed appropriate at the time.
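To make the difference between order-based and keyed archives concrete, here is a small sketch in Python. It is a toy encoding invented for illustration only, not the real typedstream byte layout; the `Person` class, its fields, and all function names are hypothetical. It only shows why a reader of an order-based stream has to know the exact order in which the writer emitted its fields.

```python
import io
import struct


def write_string(out, text):
    """Write a UTF-8 string with a 2-byte little-endian length prefix (toy format)."""
    data = text.encode("utf-8")
    out.write(struct.pack("<H", len(data)))
    out.write(data)


def read_string(src):
    """Read a string written by write_string."""
    (length,) = struct.unpack("<H", src.read(2))
    return src.read(length).decode("utf-8")


def encode_person_ordered(name, age):
    """Order-based: class name, version, then each field, one after the other."""
    out = io.BytesIO()
    write_string(out, "Person")        # class name
    out.write(struct.pack("<i", 1))    # class version
    write_string(out, name)            # first field
    out.write(struct.pack("<i", age))  # second field
    return out.getvalue()


def decode_person_ordered(blob):
    """The reader must repeat the writer's order exactly; nothing in the
    stream says which value is the name and which is the age."""
    src = io.BytesIO(blob)
    class_name = read_string(src)
    (version,) = struct.unpack("<i", src.read(4))
    name = read_string(src)
    (age,) = struct.unpack("<i", src.read(4))
    return {"class": class_name, "version": version, "name": name, "age": age}


def encode_person_keyed(name, age):
    """Keyed: every value is stored under a name, so it can be fetched directly."""
    return {"$class": "Person", "name": name, "age": age}


blob = encode_person_ordered("Ada", 36)
person = decode_person_ordered(blob)

keyed = encode_person_keyed("Ada", 36)
age_only = keyed["age"]  # no need to read the fields that came before
```

If the decoder read the integer before the string, it would misinterpret the bytes. That is the situation a typedstream reader is in for every class whose encodeWithCoder: order it does not know.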
I hope this explanation clears things up a little. You have a hard road ahead of you. There were reasons doing this was avoided in GNUstep. If we had, we would STILL be trying to figure it out.
Frank Illenberger wrote an NSUnarchiver replacement called MEUnarchiver, based on the 1999 typedstream.m source code: https://github.com/depth42/MEUnarchiver
It has been extended to support newer types that are not known to the original source code. It still relies on the ObjC runtime to provide NSCoding decoder implementations for all the standard types such as NSString etc, but otherwise it is pretty self-contained and allows me to prevent crashes that occur with Apple's NSUnarchiver code when passing damaged data.
Take a look at Cocotron's open source implementation of NSArchiver and NSUnarchiver:
https://code.google.com/p/cocotron/source/browse/Foundation/NSArchiver.m
https://code.google.com/p/cocotron/source/browse/Foundation/NSUnarchiver.m
First, please see "Is there a way to read in files in TypedStream format" for some interesting info.
Very probably, the format can be converted to something more readable using the plutil tool. This tool is also available for Windows (it comes with iTunes for Windows). I'm not sure about its license, though.
The problematic part is the fact that the files contain object instances converted to binary. It's not enough to understand the file format; it's also necessary to understand how every type is stored.
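Since plutil works on property lists, it may be worth checking which container format a file uses before trying it. The sketch below is a rough heuristic, not a parser. The `bplist` magic for binary plists is well known. The `streamtyped` / `typedstream` signature near the start of a typedstream file is an assumption based on commonly seen files and should be verified against typedstream.m. The function name is hypothetical.

```python
def sniff_archive_format(data):
    """Guess the container format from the first bytes of a file (heuristic)."""
    head = data[:64]

    # Binary property lists start with the magic "bplist" plus a version.
    if head.startswith(b"bplist"):
        return "binary plist"

    # XML property lists are plain text; allow leading whitespace.
    text = head.lstrip()
    if text.startswith((b"<?xml", b"<!DOCTYPE plist", b"<plist")):
        return "xml plist"

    # ASSUMPTION: typedstream files carry the signature string "streamtyped"
    # or "typedstream" within the first few bytes of the header.
    if b"streamtyped" in head[:16] or b"typedstream" in head[:16]:
        return "typedstream"

    return "unknown"


examples = {
    "binary plist": b"bplist00\x00\x01",
    "xml plist": b'<?xml version="1.0" encoding="UTF-8"?>',
    "typedstream": b"\x04\x0bstreamtyped\x81\xe8\x03",
    "unknown": b"random bytes",
}
results = {label: sniff_archive_format(raw) for label, raw in examples.items()}
```

A result of "typedstream" suggests the file came from NSArchiver rather than NSKeyedArchiver, in which case the per-class encoding order still has to be known to interpret the contents.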