I have a 100 GB text file, which is a BCP dump from a database. When I try to import it with BULK INSERT, I get a cryptic error on line number 219506324. Before sol
Here's my elegant version in C#:
Console.Write(File.ReadLines(@"s:\source\transactions.dat").ElementAt(219506323));
or more general:
Console.Write(File.ReadLines(filename).ElementAt(linenumber - 1));
Of course, you may want to show some context before and after the given line:
Console.Write(string.Join("\n",
File.ReadLines(filename).Skip(linenumber - 5).Take(10)));
or more fluently:
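// Requires Rx (System.Reactive): using System; using System.Reactive.Linq;
File
.ReadLines(filename)
.Skip(linenumber - 5)
.Take(10)
.ToObservable()                 // converts the IEnumerable<string> to an IObservable<string>
.Subscribe(Console.WriteLine);  // Do() alone is lazy; nothing prints until you subscribe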
BTW, the linecache module does not do anything clever with large files. It just reads the whole thing in, keeping it all in memory. The only exceptions it catches are I/O-related (can't access file, file not found, etc.). Here's the important part of the code:
fp = open(fullname, 'rU')
lines = fp.readlines()
fp.close()
In other words, it's trying to fit the whole 100GB file into 6GB of RAM! What the manual should say is maybe "This function will never throw an exception if it can't access the file."
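If you want to stay in Python, you can avoid linecache and stream the file instead. Here's a minimal sketch of that idea using itertools.islice, which walks the file iterator lazily so only the current line (plus the normal read buffer) is in memory. The names read_line and read_context, the UTF-8 default, and the 4-before/5-after window are my own assumptions for illustration, not anything from linecache or the original post:

```python
from itertools import islice


def read_line(path, line_number, encoding="utf-8"):
    """Return the 1-based line `line_number`, or None if the file is shorter.

    Hypothetical helper: the encoding default is an assumption; use whatever
    code page your BCP dump was actually written in.
    """
    with open(path, "r", encoding=encoding, errors="replace") as fp:
        # islice consumes the file iterator lazily: earlier lines are read
        # and discarded one at a time, never accumulated in a list.
        return next(islice(fp, line_number - 1, line_number), None)


def read_context(path, line_number, before=4, after=5, encoding="utf-8"):
    """Return the target line with `before` lines above and `after` below."""
    start = max(line_number - before, 1)  # clamp at the first line
    stop = line_number + after
    with open(path, "r", encoding=encoding, errors="replace") as fp:
        # Only the requested window (at most before + after + 1 lines) is kept.
        return list(islice(fp, start - 1, stop))
```

For the file in the question that would be something like read_line(r"s:\source\transactions.dat", 219506324). It still has to scan everything before that line, so it won't be fast, but memory use stays flat.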