I have a string of words separated by spaces. How can I split the string into lists of words based on word length?
Example
inpu
Edit: I'm glad my original answer helped the OP solve their problem. However, after pondering the problem a bit, I've adapted it (and I strongly advise against my former solution, which I have left at the end of the post).
string input = " aa aaa aaaa bb bbb bbbb cc ccc cccc cccc bbb bb aa ";
var words = input.Trim().Split().Distinct();
var lookup = words.ToLookup(word => word.Length);
First, we trim the input to avoid empty elements from the outer spaces. Then, we split the string into an array. If multiple spaces can occur between the words, you'd need to use StringSplitOptions, as in Mark's answer.
After calling Distinct to only include each word once, we now convert words from IEnumerable<string> to Lookup<int, string>, where the words' length is represented by the key (int) and the words themselves are stored in the value (string).
Hang on, how is that even possible? Don't we have multiple words for each key? Sure, but that's exactly what the Lookup class is there for:
Lookup<TKey, TElement> represents a collection of keys each mapped to one or more values. A Lookup<TKey, TElement> resembles a Dictionary<TKey, TValue>. The difference is that a Dictionary maps keys to single values, whereas a Lookup maps keys to collections of values.
You can create an instance of a Lookup by calling ToLookup on an object that implements IEnumerable<T>.
Note
There is no public constructor to create a new instance of a Lookup. Additionally, Lookup objects are immutable, that is, you cannot add or remove elements or keys from a Lookup after it has been created.
word => word.Length is the KeySelector lambda: it defines that we want to index (or group, if you will) the Lookup by the Length of the words.
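The lookup behaviour described above can be sketched as follows. This is only an illustration, assuming a shortened input string; LookupSketch and ByLength are hypothetical names that do not appear in the original answer. It also shows one practical difference from a Dictionary: indexing a missing key yields an empty sequence instead of throwing.

```csharp
using System;
using System.Linq;

public static class LookupSketch
{
    // Hypothetical helper: builds the lookup the same way as the answer above.
    public static ILookup<int, string> ByLength(string input)
    {
        return input.Trim().Split().Distinct().ToLookup(word => word.Length);
    }

    public static void Main()
    {
        var lookup = ByLength(" aa aaa aaaa bb bbb aa ");

        // Indexing by key yields every distinct word of that length.
        Console.WriteLine(string.Join(", ", lookup[2]));   // aa, bb

        // A missing key does not throw; it yields an empty sequence.
        Console.WriteLine(lookup[7].Any());                // False
        Console.WriteLine(lookup.Contains(7));             // False

        // Count is the number of keys, not the number of words.
        Console.WriteLine(lookup.Count);                   // 3
    }
}
```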
(similar to the question's originally requested output)
foreach (var grouping in lookup)
{
    Console.WriteLine("{0}: {1}", grouping.Key, string.Join(", ", grouping));
}
Output
2: aa, bb, cc
3: aaa, bbb, ccc
4: aaaa, bbbb, cccc
List
List<String> list3 = lookup[3].ToList();
(note that these will return IOrderedEnumerable<T>, so access by key is no longer possible)
var orderedAscending = lookup.OrderBy(grouping => grouping.Key);
var orderedDescending = lookup.OrderByDescending(grouping => grouping.Key);
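If key access is still needed after ordering, one possible approach is to order only the keys and keep indexing the original lookup. The sketch below is an assumption about how that could look, not part of the original answer; OrderedLookupSketch and OrderedKeys are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class OrderedLookupSketch
{
    // Hypothetical helper: returns the word lengths present in the lookup, ascending.
    public static List<int> OrderedKeys(ILookup<int, string> lookup)
    {
        return lookup.Select(g => g.Key).OrderBy(k => k).ToList();
    }

    public static void Main()
    {
        var lookup = "cccc aa bbb bb".Split().ToLookup(w => w.Length);

        // Iterate in key order, but still fetch the words through the lookup's indexer.
        foreach (int length in OrderedKeys(lookup))
        {
            Console.WriteLine("{0}: {1}", length, string.Join(", ", lookup[length]));
        }
        // 2: aa, bb
        // 3: bbb
        // 4: cccc
    }
}
```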
Original answer - please don't do this (bad performance, code clutter):
string input = " aa aaa aaaa bb bbb bbbb cc ccc cccc cccc bbb bb aa ";
Dictionary<int, string[]> results = new Dictionary<int, string[]>();
var grouped = input.Trim().Split().Distinct().GroupBy(s => s.Length)
    .OrderBy(g => g.Key); // or: OrderByDescending(g => g.Key);
foreach (var grouping in grouped)
{
    results.Add(grouping.Key, grouping.ToArray());
}
You can use LINQ's GroupBy.
Edit: I have now applied LINQ to generate the string list you wanted as output.
Edit 2: Duplicated input words now produce a single output entry, as in the edited question. It is just a Distinct call in LINQ.
string input = " aa aaa aaaa bb bbb bbbb cc ccc cccc ";
var list = input.Split(' ');
var grouped = list.GroupBy(s => s.Length);
foreach (var elem in grouped)
{
    string header = "List " + elem.Key + ": ";
    // var line = elem.Aggregate((workingSentence, next) => next + ", " + workingSentence);
    // if you want single items, use this
    var line = elem.Distinct().Aggregate((workingSentence, next) => next + ", " + workingSentence);
    string full = header + " " + line;
    Console.WriteLine(full);
}
// Output: note the leading and trailing blanks in the input string; they produce the empty strings that make up the 0 list
List 0: ,
List 2: cc, bb, aa
List 3: ccc, bbb, aaa
List 4: cccc, bbbb, aaaa
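As a possible variation on the loop above (a sketch, not the answerer's code), string.Join can replace Aggregate, which keeps the words in their input order, and StringSplitOptions.RemoveEmptyEntries avoids the length-0 list. JoinSketch and FormatGroups are hypothetical names introduced only for this illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class JoinSketch
{
    // Hypothetical helper: one formatted line per word length.
    public static List<string> FormatGroups(string input)
    {
        return input
            .Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries) // no empty strings
            .GroupBy(s => s.Length)
            .Select(g => "List " + g.Key + ": " + string.Join(", ", g.Distinct()))
            .ToList();
    }

    public static void Main()
    {
        foreach (string line in FormatGroups(" aa aaa aaaa bb bbb bbbb cc ccc cccc "))
        {
            Console.WriteLine(line);
        }
        // List 2: aa, bb, cc
        // List 3: aaa, bbb, ccc
        // List 4: aaaa, bbbb, cccc
    }
}
```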
You can use Where to find elements that match a predicate (in this case, having the correct length):
string[] words = input.Split();
List<string> twos = words.Where(s => s.Length == 2).ToList();
List<string> threes = words.Where(s => s.Length == 3).ToList();
List<string> fours = words.Where(s => s.Length == 4).ToList();
Alternatively, you could use GroupBy to find all the groups at once:
var groups = words.GroupBy(s => s.Length);
You can also use ToLookup so that you can easily index to find all the words of a specific length:
var lookup = words.ToLookup(s => s.Length);
foreach (var word in lookup[3])
{
    Console.WriteLine(word);
}
Result:
aaa
bbb
ccc
See it working online: ideone
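One difference between the GroupBy and ToLookup options above that may matter in practice: GroupBy is evaluated lazily when the result is enumerated, while ToLookup builds its result immediately. The following sketch illustrates this under the assumption that the source is a mutable List<string>; DeferredSketch and CountTwoLetterWords are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DeferredSketch
{
    // Hypothetical helper: returns { count seen via GroupBy, count seen via ToLookup }.
    public static int[] CountTwoLetterWords()
    {
        var words = new List<string> { "aa", "bbb" };

        var groups = words.GroupBy(s => s.Length);   // deferred: nothing is grouped yet
        var lookup = words.ToLookup(s => s.Length);  // immediate: snapshot of the list

        words.Add("cc"); // added after both calls

        int viaGroupBy = groups.Single(g => g.Key == 2).Count(); // enumerates now, sees "cc"
        int viaLookup = lookup[2].Count();                       // snapshot, only "aa"
        return new[] { viaGroupBy, viaLookup };
    }

    public static void Main()
    {
        int[] counts = CountTwoLetterWords();
        Console.WriteLine("GroupBy: {0}, ToLookup: {1}", counts[0], counts[1]);
        // GroupBy: 2, ToLookup: 1
    }
}
```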
In your update it looks like you want to remove the empty strings and duplicated words. You can do the former by using StringSplitOptions.RemoveEmptyEntries and the latter by using Distinct.
var words = input.Split((char[])null, StringSplitOptions.RemoveEmptyEntries)
    .Distinct();
var lookup = words.ToLookup(s => s.Length);
Output:
aa, bb, cc
aaa, bbb, ccc
aaaa, bbbb, cccc
See it working online: ideone
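To make the effect of StringSplitOptions.RemoveEmptyEntries concrete, the sketch below compares three ways of splitting a small sample string with leading, trailing and doubled spaces. The sample input and the names SplitSketch and SplitWords are assumptions made for this illustration.

```csharp
using System;

public static class SplitSketch
{
    // Hypothetical helper: a null separator array means "split on any whitespace".
    public static string[] SplitWords(string input)
    {
        return input.Split((char[])null, StringSplitOptions.RemoveEmptyEntries);
    }

    public static void Main()
    {
        string input = " aa  bbb "; // leading space, doubled space, trailing space

        Console.WriteLine(input.Split(' ').Length);      // 5: "", "aa", "", "bbb", ""
        Console.WriteLine(input.Trim().Split().Length);  // 3: "aa", "", "bbb"
        Console.WriteLine(SplitWords(input).Length);     // 2: "aa", "bbb"
    }
}
```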
A somewhat lengthy solution, but it gets the result into a SortedDictionary:
using System;
using System.Collections.Generic;
using System.Linq;

class Program
{
    public static void Main()
    {
        Print();
        Console.ReadKey();
    }

    // Prints one comma-separated line per word length.
    private static void Print()
    {
        GetListOfWordsByLength();
        foreach (var list in WordSortedDictionary)
        {
            list.Value.ForEach(i => { Console.Write(i + ","); });
            Console.WriteLine();
        }
    }

    private static void GetListOfWordsByLength()
    {
        string input = " aa aaa aaaa bb bbb bbbb cc ccc cccc ";
        string[] inputSplitted = input.Split(' ');
        inputSplitted.ToList().ForEach(AddToList);
    }

    // Maps each word length to the words of that length, sorted by length.
    static readonly SortedDictionary<int, List<string>> WordSortedDictionary = new SortedDictionary<int, List<string>>();

    private static void AddToList(string s)
    {
        // Skip the empty strings produced by the leading and trailing spaces.
        if (s.Length > 0)
        {
            if (WordSortedDictionary.ContainsKey(s.Length))
            {
                List<string> list = WordSortedDictionary[s.Length];
                list.Add(s);
            }
            else
            {
                WordSortedDictionary.Add(s.Length, new List<string> {s});
            }
        }
    }
}
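The ContainsKey check followed by an indexer access looks the key up twice. A more compact variant of the same idea, sketched here as an assumption rather than as the answerer's code, uses TryGetValue and returns the dictionary instead of keeping it in a static field. WordBuckets and Build are hypothetical names.

```csharp
using System;
using System.Collections.Generic;

public static class WordBuckets
{
    // Hypothetical helper: groups the words of the input by length, sorted by length.
    public static SortedDictionary<int, List<string>> Build(string input)
    {
        var buckets = new SortedDictionary<int, List<string>>();
        foreach (string word in input.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries))
        {
            List<string> bucket;
            // A single lookup per word; create the list on first use.
            if (!buckets.TryGetValue(word.Length, out bucket))
            {
                bucket = new List<string>();
                buckets.Add(word.Length, bucket);
            }
            bucket.Add(word);
        }
        return buckets;
    }

    public static void Main()
    {
        foreach (var pair in Build(" aa aaa aaaa bb bbb bbbb cc ccc cccc "))
        {
            Console.WriteLine(string.Join(",", pair.Value));
        }
        // aa,bb,cc
        // aaa,bbb,ccc
        // aaaa,bbbb,cccc
    }
}
```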
First, let's declare a class that can hold a length as well as a list of words
public class WordList
{
    public int WordLength { get; set; }
    public List<string> Words { get; set; }
}
Now, we can build a list of word lists with
string input = " aa aaa aaaa bb bbb bbbb cc ccc cccc ";
string[] words = input.Trim().Split();
List<WordList> list = words
    .GroupBy(w => w.Length)
    .OrderBy(group => group.Key)
    .Select(group => new WordList {
        WordLength = group.Key,
        Words = group.Distinct().OrderBy(s => s).ToList()
    })
    .ToList();
The word lists are sorted by length, and the words within each list are sorted alphabetically.
Result
e.g.
list[2].WordLength ==> 4
list[2].Words[1] ==> "bbbb"
If you want, you can process the result immediately, instead of putting it into a data structure
string input = " aa aaa aaaa bb bbb bbbb cc ccc cccc ";
var query = input
    .Trim()
    .Split()
    .GroupBy(w => w.Length)
    .OrderBy(group => group.Key);
// Process the result here
foreach (var group in query) {
    // group.Key ==> length of words
    foreach (string word in group.Distinct().OrderBy(w => w)) {
        ...
    }
}