Question
What is the best option for writing (appending) records to a file in a highly parallel web environment on .NET 4 / IIS 7? I use an ashx HTTP handler to receive small portions of data that should be written to a file quickly. First I used:
using (var stream = new FileStream(fileName, FileMode.Append, FileAccess.Write, FileShare.ReadWrite, 8192))
{
    stream.Write(buffer, 0, buffer.Length);
}
But I noticed that some records were broken or incomplete, probably because of FileShare.ReadWrite. Next I tried to change it to FileShare.Read. There were no broken records then, but from time to time I got this exception: System.IO.IOException: The process cannot access the file ... because it is being used by another process.
Ideally I would like the operating system to queue concurrent write requests so that all the records are eventually written. What file access API should I use?
Answer 1:
There are two options, depending on the size. If the records are small, the best option is probably to synchronize access to the file with a shared lock. If possible, it would also be a good idea to keep the file open (flushing occasionally) rather than constantly opening and closing it. For example:
class MeaningfulName : IDisposable {
    FileStream file;
    readonly object syncLock = new object();
    public MeaningfulName(string path) {
        file = new FileStream(path, FileMode.Append, FileAccess.Write,
            FileShare.ReadWrite, 8192);
    }
    public void Dispose() {
        if (file != null) {
            file.Dispose();
            file = null;
        }
    }
    public void Append(byte[] buffer) {
        if (file == null) throw new ObjectDisposedException(GetType().Name);
        lock (syncLock) { // only one thread can be appending at a time
            file.Write(buffer, 0, buffer.Length);
            file.Flush();
        }
    }
}
That is thread-safe, and could be shared by all the ashx handlers without issue.
However, for larger data you might want to look at a synchronized reader-writer queue: all the writers (ashx hits) throw data onto the queue, and a single dedicated writer thread dequeues and appends it. That removes the IO time from the ashx request, though you may want to cap the queue size in case the writer can't keep up. There's a sample here of a capped synchronized reader/writer queue.
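As a rough sketch of that producer/consumer approach (not the linked sample), .NET 4's BlockingCollection gives you the capped synchronized queue for free; the class name QueuedFileAppender and the capacity of 1000 below are illustrative assumptions:

```csharp
using System;
using System.Collections.Concurrent;
using System.IO;
using System.Threading.Tasks;

class QueuedFileAppender : IDisposable
{
    // Bounded queue: Add blocks once 1000 buffers are pending, so
    // producers slow down instead of exhausting memory.
    readonly BlockingCollection<byte[]> queue =
        new BlockingCollection<byte[]>(1000);
    readonly Task writerTask;

    public QueuedFileAppender(string path)
    {
        // Single dedicated writer thread; only it ever touches the file,
        // so no FileShare.ReadWrite is needed.
        writerTask = Task.Factory.StartNew(() =>
        {
            using (var file = new FileStream(path, FileMode.Append,
                FileAccess.Write, FileShare.Read, 8192))
            {
                // Blocks until items arrive; completes after CompleteAdding.
                foreach (var buffer in queue.GetConsumingEnumerable())
                {
                    file.Write(buffer, 0, buffer.Length);
                }
            }
        }, TaskCreationOptions.LongRunning);
    }

    // Called from the ashx handlers; cheap, no IO on the request thread.
    public void Append(byte[] buffer)
    {
        queue.Add(buffer);
    }

    public void Dispose()
    {
        queue.CompleteAdding(); // let the writer drain the remaining items
        writerTask.Wait();
    }
}
```

Disposing the appender drains whatever is queued before the file is closed, so shutdown doesn't drop records.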
Answer 2:
Unless you're using a web garden or web farm, I'd suggest using process-local locking (lock(){}), and perform as much processing as possible outside of the lock.
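For instance (a hedged sketch; the record-as-string shape and helper name are assumptions, not from the answer), keep serialization and encoding outside the lock and hold it only for the write itself:

```csharp
using System;
using System.IO;
using System.Text;

static class RecordAppender
{
    static readonly object writeLock = new object();

    public static void AppendRecord(string path, string record)
    {
        // Expensive work (formatting, encoding) happens outside the lock,
        // so concurrent requests only serialize on the actual file IO.
        byte[] buffer = Encoding.UTF8.GetBytes(record + Environment.NewLine);

        lock (writeLock)
        {
            using (var stream = new FileStream(path, FileMode.Append,
                FileAccess.Write, FileShare.Read, 8192))
            {
                stream.Write(buffer, 0, buffer.Length);
            }
        }
    }
}
```

Note this only works within one worker process; in a web garden or farm the lock object is not shared, which is why the caveat above matters.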
If you have multiple files you're writing to, see Better solution to multithreading riddle? for a good solution.
Source: https://stackoverflow.com/questions/8442119/what-is-the-fastest-and-safest-way-to-append-records-to-disk-file-in-highly-load