Is there any order to the data in the files? I ask because, although a naive line-by-line comparison (checking every line of one file against every line of the other) would take an eternity, going through one file line by line while doing a binary search in the other would be much quicker: roughly O(n log m) comparisons instead of O(n·m). That only works if the file being searched is sorted on whatever key you are comparing, though.
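To make the idea concrete, here is a minimal Python sketch. It assumes the second file's lines fit in memory and are already sorted in plain string order; the function names `contains_sorted` and `missing_lines` are made up for illustration and are not from any library.

```python
from bisect import bisect_left


def contains_sorted(sorted_lines, target):
    """Return True if target occurs in sorted_lines (must be sorted ascending)."""
    i = bisect_left(sorted_lines, target)  # binary search: O(log m)
    return i < len(sorted_lines) and sorted_lines[i] == target


def missing_lines(lines_a, sorted_lines_b):
    """Return the lines of lines_a that do not appear in sorted_lines_b.

    lines_a may be in any order; only sorted_lines_b has to be sorted.
    """
    return [line for line in lines_a if not contains_sorted(sorted_lines_b, line)]


if __name__ == "__main__":
    # Hypothetical in-memory data standing in for the two files.
    file_a = ["pear", "apple", "kiwi"]
    file_b = ["apple", "banana", "pear"]  # already sorted
    print(missing_lines(file_a, file_b))
```

If the data turns out not to be sorted, sorting one file once up front (or loading it into a set, memory permitting) would give a similar speed-up.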