What's the best way to unit test large data sets? Some legacy code that I'm maintaining has structures of a hundred members or more; other parts of the code that we're worki
If what you are trying to achieve is, in fact, a unit test, you should mock out the underlying data structures and simulate the data. This gives you complete control over the inputs: each test can exercise a single data point, leaving you with a concise set of tests for each condition. There are several open-source mocking frameworks out there; I personally recommend Rhino Mocks (http://ayende.com/projects/rhino-mocks/downloads.aspx) or NMock (http://www.nmock.org).
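Those frameworks are .NET tools, but the idea is language-neutral. Here's a minimal sketch using Python's built-in `unittest.mock` instead; the function under test and the field names are invented for illustration:

```python
from unittest.mock import Mock

# Hypothetical code under test: it reads one field from a large record
# object and doesn't care how that record was built.
def risk_flag(record):
    """Return True when the record's balance is negative."""
    return record.balance < 0

# Each test simulates exactly the data point it needs -- no need to
# construct the full hundred-member structure.
def test_negative_balance_is_flagged():
    record = Mock()
    record.balance = -125.0  # the only field this test cares about
    assert risk_flag(record) is True

def test_positive_balance_is_not_flagged():
    record = Mock()
    record.balance = 300.0
    assert risk_flag(record) is False

test_negative_balance_is_flagged()
test_positive_balance_is_not_flagged()
```

The point is the shape, not the library: one tiny simulated input per test, so each condition is covered in isolation.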
If it is not possible for you to mock out the data structures, I recommend refactoring so that you can :-) It's worth it! Alternatively, you may want to try TypeMock (http://www.typemock.com/), which allows mocking of concrete classes.
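One common shape for that refactoring is to put a narrow view in front of the huge legacy structure, so tests only have to build the handful of fields the code actually uses. A sketch in Python, with all names invented for illustration:

```python
# Hypothetical legacy structure: imagine ~100 fields populated from the
# database, which makes it painful to construct in a test.
class LegacyOrder:
    def __init__(self):
        # ... dozens of other fields would be filled in here ...
        self.total = 0.0
        self.currency = "USD"

# The seam: a small view exposing only what the code under test needs.
class OrderView:
    def __init__(self, total, currency):
        self.total = total
        self.currency = currency

    @classmethod
    def from_legacy(cls, order):
        # Production code adapts the big structure once, at the boundary.
        return cls(order.total, order.currency)

# Code under test now depends on two fields instead of a hundred.
def format_total(view):
    return f"{view.total:.2f} {view.currency}"

# Tests construct the small view directly; no legacy object required.
assert format_total(OrderView(12.5, "EUR")) == "12.50 EUR"
```

Once the seam exists, the mocking approach above becomes trivial, and the legacy structure is only touched in one adapter method.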
If, however, you're doing tests against large data sets, you're really running functional tests, not unit tests. In that case, loading data from a database or from disk is a typical operation. Rather than avoid it, work on getting it running in parallel with the rest of your automated build process, so the performance impact isn't holding any of your developers up.