I have written the LINQ statement below, but it takes a huge amount of time to process because there are so many lines. My CPU has 8 cores, but only 1 core is used because the query runs on a single thread.
There is room for performance improvement before resorting to AsParallel:
HashSet<string> lstAllLines = new HashSet<string>(
    File.ReadAllLines("AllLines.txt")
        .SelectMany(ls => ls.ToLowerInvariant().Split(' ')));
List<string> lstBannedWords = File.ReadAllLines("allBaddWords.txt")
    .Select(s => s.ToLowerInvariant())
    .Distinct().ToList();
List<string> lstFoundBannedWords = lstBannedWords.Where(s => lstAllLines.Contains(s))
    .Distinct().ToList();
Since a HashSet lookup is O(1) on average and lstBannedWords is the shorter list, you may not even need any parallelism (TotalSearchTime = lstBannedWords.Count * O(1)). Lastly, you always have the option of AsParallel.
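If the lookup still turns out to be the bottleneck, the AsParallel option could look roughly like the sketch below. It uses small in-memory sample data instead of the two files, and the names BannedWordSearch and FindBanned are made up for illustration. It relies on the HashSet not being modified while the query runs, since only concurrent reads are safe.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class BannedWordSearch
{
    // Hypothetical helper: returns the banned words that occur in allWords.
    // AsParallel() lets PLINQ spread the lookups across the available cores.
    public static List<string> FindBanned(HashSet<string> allWords,
                                          IEnumerable<string> bannedWords)
    {
        return bannedWords
            .AsParallel()
            .Where(w => allWords.Contains(w)) // read-only access to the set
            .ToList();                        // result order is not guaranteed
    }

    public static void Main()
    {
        // Sample data standing in for AllLines.txt and allBaddWords.txt.
        string[] lines = { "The quick brown Fox", "jumps over the lazy dog" };
        string[] banned = { "fox", "cat", "DOG" };

        var allWords = new HashSet<string>(
            lines.SelectMany(l => l.ToLowerInvariant().Split(' ')));
        var bannedWords = banned.Select(s => s.ToLowerInvariant()).Distinct();

        List<string> found = FindBanned(allWords, bannedWords);

        // Sort before printing because PLINQ does not preserve input order.
        Console.WriteLine(string.Join(",", found.OrderBy(x => x, StringComparer.Ordinal)));
        // prints "dog,fox"
    }
}
```

With only a few thousand banned words the parallel version may well be slower than the plain one because of the partitioning overhead, so it is worth measuring both.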