Question
I have a bunch of text files in a folder, and all of them should have identical headers. In other words, the first 100 lines of every file should be identical. So I wrote a function to check this condition:
private static bool CheckHeaders(string folderPath, int headersCount)
{
    var enumerators = Directory.EnumerateFiles(folderPath)
        .Select(f => File.ReadLines(f).GetEnumerator())
        .ToArray();
    //using (enumerators)
    //{
        for (int i = 0; i < headersCount; i++)
        {
            foreach (var e in enumerators)
            {
                if (!e.MoveNext()) return false;
            }
            var values = enumerators.Select(e => e.Current);
            if (values.Distinct().Count() > 1) return false;
        }
        return true;
    //}
}
The reason I am using enumerators is memory efficiency. Instead of loading all file contents in memory I enumerate the files concurrently line-by-line until a mismatch is found, or all headers have been examined.
My problem is evident from the commented-out lines of code. I would like to use a using block to safely dispose all the enumerators, but unfortunately using (enumerators) doesn't compile. Apparently using can handle only a single disposable object. I know that I can dispose the enumerators manually, by wrapping the whole thing in a try-finally block and running the disposing logic in a loop inside finally, but it seems awkward. Is there any mechanism I could employ to make the using statement a viable option in this case?
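For reference, the manual try-finally variant described above might look roughly like the sketch below. It is only an illustration under assumptions: the class name HeaderCheckSketch is hypothetical, and the files are opened inside the try block so that a failure part-way through still reaches the cleanup loop.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

// Hypothetical container class, introduced only for this sketch.
public static class HeaderCheckSketch
{
    public static bool CheckHeaders(string folderPath, int headersCount)
    {
        var enumerators = new List<IEnumerator<string>>();
        try
        {
            // Open the files one at a time inside the try block, so the
            // finally block still runs if one of them cannot be opened.
            foreach (var file in Directory.EnumerateFiles(folderPath))
            {
                enumerators.Add(File.ReadLines(file).GetEnumerator());
            }

            for (int i = 0; i < headersCount; i++)
            {
                foreach (var e in enumerators)
                {
                    // A file with fewer lines than the header fails the check.
                    if (!e.MoveNext()) return false;
                }
                // More than one distinct value means line i differs somewhere.
                if (enumerators.Select(e => e.Current).Distinct().Count() > 1) return false;
            }
            return true;
        }
        finally
        {
            // Simplification: if one Dispose call throws, the remaining
            // enumerators are not disposed.
            foreach (var e in enumerators)
            {
                e.Dispose();
            }
        }
    }
}
```

This works, but the cleanup loop has to be repeated wherever a group of disposables is used, which is the awkwardness the question is about.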
Update
I just realized that my function has a serious flaw. The construction of the enumerators is not robust. A locked file can cause an exception, while some enumerators have already been created. These enumerators will not be disposed. This is something I want to fix. I am thinking about something like this:
var enumerators = Directory.EnumerateFiles(folderPath)
    .ToDisposables(f => File.ReadLines(f).GetEnumerator());
The extension method ToDisposables should ensure that in case of an exception no disposables are left undisposed.
Answer 1:
You can create a disposable wrapper over your enumerators:
class DisposableEnumerable : IDisposable
{
    private IEnumerable<IDisposable> items;
    public event UnhandledExceptionEventHandler DisposalFailed;
    public DisposableEnumerable(IEnumerable<IDisposable> items) => this.items = items;
    public void Dispose()
    {
        foreach (var item in items)
        {
            try
            {
                item.Dispose();
            }
            catch (Exception e)
            {
                var tmp = DisposalFailed;
                tmp?.Invoke(this, new UnhandledExceptionEventArgs(e, false));
            }
        }
    }
}
and use it with the lowest impact to your code:
private static bool CheckHeaders(string folderPath, int headersCount)
{
    var enumerators = Directory.EnumerateFiles(folderPath)
        .Select(f => File.ReadLines(f).GetEnumerator())
        .ToArray();
    using (var disposable = new DisposableEnumerable(enumerators))
    {
        for (int i = 0; i < headersCount; i++)
        {
            foreach (var e in enumerators)
            {
                if (!e.MoveNext()) return false;
            }
            var values = enumerators.Select(e => e.Current);
            if (values.Distinct().Count() > 1) return false;
        }
        return true;
    }
}
The thing is, you have to dispose those objects one by one anyway; it's up to you where to encapsulate that logic. And the code I've suggested has no manual try-finally ;)
Answer 2:
I'm going to suggest an approach that uses recursive calls to Zip to allow parallel enumeration of normal IEnumerable<string> sequences, without the need to resort to IEnumerator<string>.
bool Zipper(IEnumerable<IEnumerable<string>> sources, int take)
{
    IEnumerable<string> ZipperImpl(IEnumerable<IEnumerable<string>> ss)
        => (!ss.Skip(1).Any())
            ? ss.First().Take(take)
            : ss.First().Take(take).Zip(
                ZipperImpl(ss.Skip(1)),
                (x, y) => (x == null || y == null || x != y) ? null : x);
    var matching_lines = ZipperImpl(sources).TakeWhile(x => x != null).ToArray();
    return matching_lines.Length == take;
}
Now build up your enumerables:
IEnumerable<string>[] enumerables =
    Directory
        .EnumerateFiles(folderPath)
        .Select(f => File.ReadLines(f))
        .ToArray();
Now it's simple to call:
bool headers_match = Zipper(enumerables, 100);
Here's a trace of running this code against three files with more than 4 lines:
Ben Petering at 5:28 PM ACST
Ben Petering at 5:28 PM ACST
Ben Petering at 5:28 PM ACST
From a call 2019-05-23, James mentioned he’d like the ability to edit the current shipping price rules (eg in shipping_rules.xml) via the admin.
From a call 2019-05-23, James mentioned he’d like the ability to edit the current shipping price rules (eg in shipping_rules.xml) via the admin.
From a call 2019-05-23, James mentioned he’d like the ability to edit the current shipping price rules (eg in shipping_rules.xml) via the admin.
He also mentioned he’d like to be able to set different shipping price rules for a given time window, e.g. Jan 1 to Jan 30.
He also mentioned he’d like to be able to set different shipping price rules for a given time window, e.g. Jan 1 to Jan 30.
He also mentioned he’d like to be able to set different shipping price rules for a given time window, e.g. Jan 1 to Jan 30.
These storyishes should be considered when choosing the appropriate module to use.
These storyishes should be considered when choosing the appropriate module to use.X
These storyishes should be considered when choosing the appropriate module to use.
Note that all the enumerations stopped as soon as a mismatched header was encountered on the 4th line of the second file.
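A trace of this kind can be obtained by wrapping each source in a pass-through iterator that reports every line as it is pulled. The sketch below is one assumed way to do that; the Trace extension method and the TraceExtensions class are hypothetical names introduced here, not part of LINQ.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, introduced only for this sketch.
public static class TraceExtensions
{
    // Yields the items of source unchanged, calling log for each item at the
    // moment a consumer pulls it. Nothing is read until enumeration starts.
    public static IEnumerable<string> Trace(this IEnumerable<string> source, Action<string> log)
    {
        foreach (var line in source)
        {
            log(line);
            yield return line;
        }
    }
}
```

With this in place, the sources could be built with .Select(f => File.ReadLines(f).Trace(Console.WriteLine)), so that every line actually read from any file is written to the console in the order it is pulled.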
Answer 3:
As for the second part of the question: if I understand you correctly, this should be sufficient:
static class DisposableHelper
{
    public static IEnumerable<TResult> ToDisposable<TSource, TResult>(this IEnumerable<TSource> source,
        Func<TSource, TResult> selector) where TResult : IDisposable
    {
        var exceptions = new List<Exception>();
        var result = new List<TResult>();
        foreach (var i in source)
        {
            try { result.Add(selector(i)); }
            catch (Exception e) { exceptions.Add(e); }
        }
        if (exceptions.Count == 0)
            return result;
        foreach (var i in result)
        {
            try { i.Dispose(); }
            catch (Exception e) { exceptions.Add(e); }
        }
        throw new AggregateException(exceptions);
    }
}
Usage:
private static bool CheckHeaders(string folderPath, int headersCount)
{
    var enumerators = Directory.EnumerateFiles(folderPath)
        .ToDisposable(f => File.ReadLines(f).GetEnumerator())
        .ToArray();
    using (new DisposableEnumerable(enumerators))
    {
        for (int i = 0; i < headersCount; i++)
        {
            foreach (var e in enumerators)
            {
                if (!e.MoveNext()) return false;
            }
            var values = enumerators.Select(e => e.Current);
            if (values.Distinct().Count() > 1) return false;
        }
        return true;
    }
}
and
try
{
    CheckHeaders(folderPath, headersCount);
}
catch (AggregateException e)
{
    // Prompt to fix errors and try again
}
Answer 4:
Creating an IDisposable wrapper as @Alex suggested is correct. It just needs logic to dispose the already opened files if one of them is locked, and probably some logic for error states. Maybe something like this (the error-state logic is very simple):
public class HeaderChecker : IDisposable
{
    private readonly string _folderPath;
    private readonly int _headersCount;
    private string _lockedFile;
    private readonly List<IEnumerator<string>> _files = new List<IEnumerator<string>>();
    public HeaderChecker(string folderPath, int headersCount)
    {
        _folderPath = folderPath;
        _headersCount = headersCount;
    }
    public string LockedFile => _lockedFile;
    public bool CheckFiles()
    {
        _lockedFile = null;
        if (!TryOpenFiles())
        {
            return false;
        }
        if (_files.Count == 0)
        {
            return true; // Not sure what to return here.
        }
        for (int i = 0; i < _headersCount; i++)
        {
            if (!_files[0].MoveNext()) return false;
            string currentLine = _files[0].Current;
            for (int fileIndex = 1; fileIndex < _files.Count; fileIndex++)
            {
                if (!_files[fileIndex].MoveNext()) return false;
                if (_files[fileIndex].Current != currentLine) return false;
            }
        }
        return true;
    }
    private bool TryOpenFiles()
    {
        bool result = true;
        foreach (string file in Directory.EnumerateFiles(_folderPath))
        {
            try
            {
                _files.Add(File.ReadLines(file).GetEnumerator());
            }
            catch
            {
                _lockedFile = file;
                result = false;
                break;
            }
        }
        if (!result)
        {
            DisposeCore(); // Close already opened files.
        }
        return result;
    }
    private void DisposeCore()
    {
        foreach (var item in _files)
        {
            try
            {
                item.Dispose();
            }
            catch
            {
            }
        }
        _files.Clear();
    }
    public void Dispose()
    {
        DisposeCore();
    }
}
// Usage
using (var checker = new HeaderChecker(folderPath, headersCount))
{
    if (!checker.CheckFiles())
    {
        if (checker.LockedFile is null)
        {
            // No file failed to open, so the headers do not match.
        }
        else
        {
            // Error while opening files; checker.LockedFile names the file that failed.
        }
    }
}
I also removed .Select() and .Distinct() when checking the lines. The first just iterates over the enumerators array, the same as the foreach above it, so you are enumerating this array twice. It then produces a new sequence of lines, which .Distinct() enumerates in turn.
Source: https://stackoverflow.com/questions/56309952/make-using-statement-usable-for-multiple-disposable-objects