I have a huge text file, size > 4GB and I want to replace some text in it programmatically. I know the line number at which I have to replace the text but the problem is tha
I'm guessing you'll want to use the FileStream class, seek to your position, and write your updated data there.
Hello. I tested the following and it works well. It handles variable-length lines separated by Environment.NewLine. If you have fixed-length lines, you can seek straight to the target line. To convert bytes to a string and vice versa, you can use Encoding.
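// Requires: using System; using System.Collections.Generic; using System.IO; using System.Linq;
// Reads one line including its terminator; returns null at end of file.
// Assumes a two-byte newline (CR LF), i.e. Environment.NewLine on Windows.
static byte[] ReadNextLine(FileStream fs)
{
    byte[] nl = new byte[] { (byte)Environment.NewLine[0], (byte)Environment.NewLine[1] };
    List<byte> ll = new List<byte>();
    int prev = -1;
    while (true)
    {
        int b = fs.ReadByte(); // keep as int: -1 signals end of file
        if (b == -1) break;
        ll.Add((byte)b);
        if (prev == nl[0] && b == nl[1]) break; // full terminator seen
        prev = b;
    }
    return ll.Count == 0 ? null : ll.ToArray();
}
static void Main(string[] args)
{
    using (FileStream fs = new FileStream(@"c:\70-528\junk.txt", FileMode.Open, FileAccess.ReadWrite))
    {
        int replaceLine = 1231;
        byte[] b = null;
        int lineCount = 1;
        while (lineCount < replaceLine && (b = ReadNextLine(fs)) != null) lineCount++; // skip lines
        long seekPos = fs.Position; // start of the target line
        b = ReadNextLine(fs);
        if (b == null) return; // file has fewer lines than replaceLine
        fs.Seek(seekPos, SeekOrigin.Begin);
        // Byte-to-char cast: only single-byte text (ASCII/Latin-1) round-trips.
        string line = new string(b.Select(x => (char)x).ToArray());
        // The new text must be the same length as the old, or the next line is overwritten.
        line = line.Replace("Text1", "Text2");
        b = line.Select(x => (byte)x).ToArray();
        fs.Write(b, 0, b.Length);
    }
}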
Unless the new text is exactly the same size as the old text, you will have to re-write the file. There is no way around it. You can at least do this without keeping the entire file in memory.
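As a rough illustration of rewriting the file without holding it in memory, here is a minimal sketch that streams the file line by line into a temporary file and then swaps it into place. The names LineRewriter, ReplaceInLine and the ".tmp" suffix are made up for this example. It assumes the default StreamReader/StreamWriter encoding (UTF-8) is acceptable, and note that ReadLine/WriteLine normalize line endings to Environment.NewLine and leave a terminator on the last line.

```csharp
using System;
using System.IO;

static class LineRewriter
{
    // Copies inputPath to a temporary file, replacing oldText with newText on
    // the 1-based line targetLine only, then swaps the temp file into place.
    public static void ReplaceInLine(string inputPath, long targetLine, string oldText, string newText)
    {
        string tempPath = inputPath + ".tmp"; // hypothetical temp-file name
        using (var reader = new StreamReader(inputPath))
        using (var writer = new StreamWriter(tempPath))
        {
            string line;
            long lineNumber = 0;
            // Only one line is held in memory at a time.
            while ((line = reader.ReadLine()) != null)
            {
                lineNumber++;
                if (lineNumber == targetLine)
                    line = line.Replace(oldText, newText);
                writer.WriteLine(line);
            }
        }
        // Replace the original only after the copy has finished.
        File.Delete(inputPath);
        File.Move(tempPath, inputPath);
    }
}
```

Something like LineRewriter.ReplaceInLine(@"c:\70-528\junk.txt", 1231, "Text1", "A longer replacement") would then work even when the lengths differ, at the cost of one full read and one full write of the file, plus temporary disk space equal to the file size.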
Since the file is so large, you may want to take a look at the .NET 4.0 support for memory-mapped files. Basically, you'll need to move the file/stream pointer to the location in the file, overwrite that location, then flush the file to disk. You won't need to load the entire file into memory.
For example, without using memory-mapped files, the following will overwrite part of an ASCII file. The arguments are the input file, the zero-based start index (a byte offset), and the new text.
// Requires: using System.IO; using System.Text;
static void Main(string[] args)
{
    string inputFilename = args[0];
    long startIndex = long.Parse(args[1]); // long, so offsets beyond 2 GB are reachable
    string newText = args[2];
    using (FileStream fs = new FileStream(inputFilename, FileMode.Open, FileAccess.Write))
    {
        fs.Position = startIndex;
        byte[] newTextBytes = Encoding.ASCII.GetBytes(newText);
        // Overwrites in place; bytes after the written range are left untouched.
        fs.Write(newTextBytes, 0, newTextBytes.Length);
    }
}
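For comparison, the memory-mapped variant mentioned above might look roughly like the sketch below. It uses MemoryMappedFile.CreateFromFile and a view accessor from System.IO.MemoryMappedFiles (.NET 4.0 and later). The names MappedOverwrite and Overwrite are made up for this example, and it assumes ASCII text, a non-empty file, and that the written range lies entirely inside the existing file.

```csharp
using System;
using System.IO;
using System.IO.MemoryMappedFiles;
using System.Text;

static class MappedOverwrite
{
    // Overwrites the bytes starting at byteOffset with the ASCII bytes of newText.
    // The file length never changes; only the mapped window is touched.
    public static void Overwrite(string path, long byteOffset, string newText)
    {
        byte[] bytes = Encoding.ASCII.GetBytes(newText);
        using (var mmf = MemoryMappedFile.CreateFromFile(path, FileMode.Open))
        using (var view = mmf.CreateViewAccessor(byteOffset, bytes.Length))
        {
            // Position 0 is relative to the start of the view, not the file.
            view.WriteArray(0, bytes, 0, bytes.Length);
            view.Flush(); // push the change out to the underlying file
        }
    }
}
```

Only a small window of the file is mapped here, so memory use stays low regardless of file size. As with the FileStream version, this is an in-place overwrite, so it only helps when the new text has the same byte length as the old.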