I am about to start a project that will take blocks of text and parse a lot of data out of them into some sort of object, which can then be serialized, stored, and mined for statistics.
Google recently announced its internal text-processing language (which seems like a Python/Perl subset made for heavily parallel processing).
http://code.google.com/p/szl/ - Sawzall.
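
To make the pipeline described above concrete, here is a minimal sketch in plain Python (not Sawzall). Everything specific in it is an assumption made for illustration: the "key: value" line format of the input blocks, the `Record` class, and the function names `parse_block`, `serialize`, and `key_frequencies` are all hypothetical.

```python
import json
from collections import Counter
from dataclasses import dataclass, asdict


@dataclass
class Record:
    # Hypothetical schema; the real fields depend on the text being parsed.
    fields: dict
    word_count: int


def parse_block(block: str) -> Record:
    """Pull 'key: value' lines out of one block of text (assumed format)."""
    fields = {}
    for line in block.splitlines():
        key, sep, value = line.partition(":")
        if sep:  # skip lines that contain no colon
            fields[key.strip().lower()] = value.strip()
    return Record(fields=fields, word_count=len(block.split()))


def serialize(record: Record) -> str:
    """Turn a parsed record into a JSON string ready to be stored."""
    return json.dumps(asdict(record), sort_keys=True)


def key_frequencies(records) -> Counter:
    """Count how often each field name occurs across all records."""
    counts = Counter()
    for record in records:
        counts.update(record.fields.keys())
    return counts
```

Each block is parsed independently of the others, so this step could be spread across many processes or machines, with only the final counting needing to combine results.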