Maybe a basic question but let us say I have a string that is 2000 characters long, I need to split this string into max 512 character chunks each.
Is there a nice way,
Something like this:
private IList<string> SplitIntoChunks(string text, int chunkSize)
{
List<string> chunks = new List<string>();
int offset = 0;
while (offset < text.Length)
{
int size = Math.Min(chunkSize, text.Length - offset);
chunks.Add(text.Substring(offset, size));
offset += size;
}
return chunks;
}
Or just to iterate over:
private IEnumerable<string> SplitIntoChunks(string text, int chunkSize)
{
int offset = 0;
while (offset < text.Length)
{
int size = Math.Min(chunkSize, text.Length - offset);
yield return text.Substring(offset, size);
offset += size;
}
}
Note that this splits into chunks of UTF-16 code units, which isn't quite the same as splitting into chunks of Unicode code points, which in turn may not be the same as splitting into chunks of glyphs.
using Jon's implementation and the yield keyword.
IEnumerable<string> Chunks(string text, int chunkSize)
{
for (int offset = 0; offset < text.Length; offset += chunkSize)
{
int size = Math.Min(chunkSize, text.Length - offset);
yield return text.Substring(offset, size);
}
}
Most of the answer may have the same flaw. Given an empty text they will yield nothing. We (I) expect at least to get back that empty string (same behaviour as a split on a char not in the string, which will give back one item : that given string)
so we should loop at least once all times (based on Jon's code) :
IEnumerable<string> SplitIntoChunks (string text, int chunkSize)
{
int offset = 0;
do
{
int size = Math.Min (chunkSize, text.Length - offset);
yield return text.Substring (offset, size);
offset += size;
} while (offset < text.Length);
}
or using a for (Edited : after toying a little more with this, I found a better way to handle the case chunkSize greater than text) :
IEnumerable<string> SplitIntoChunks (string text, int chunkSize)
{
if (text.Length <= chunkSize)
yield return text;
else
{
var chunkCount = text.Length / chunkSize;
var remainingSize = text.Length % chunkSize;
for (var offset = 0; offset < chunkCount; ++offset)
yield return text.Substring (offset * chunkSize, chunkSize);
// yield remaining text if any
if (remainingSize != 0)
yield return text.Substring (chunkCount * chunkSize, remainingSize);
}
}
That could also be used with the do/while loop ;)
Something like?
Calculate eachLength = StringLength / WantedCharLength
Then for (int i = 0; i < StringLength; i += eachLength)
SubString (i, eachLength);
Generic extension method:
using System;
using System.Collections.Generic;
using System.Linq;
public static class IEnumerableExtensions
{
public static IEnumerable<IEnumerable<T>> SplitToChunks<T> (this IEnumerable<T> coll, int chunkSize)
{
int skipCount = 0;
while (coll.Skip (skipCount).Take (chunkSize) is IEnumerable<T> part && part.Any ())
{
skipCount += chunkSize;
yield return part;
}
}
}
class Program
{
static void Main (string[] args)
{
var col = Enumerable.Range(1,1<<10);
var chunks = col.SplitToChunks(8);
foreach (var c in chunks.Take (200))
{
Console.WriteLine (string.Join (" ", c.Select (n => n.ToString ("X4"))));
}
Console.WriteLine ();
Console.WriteLine ();
"Split this text into parts that are fifteen characters in length, surrounding each part with single quotes and output each into the console on seperate lines."
.SplitToChunks (15)
.Select(p => $"'{string.Concat(p)}'")
.ToList ()
.ForEach (p => Console.WriteLine (p));
Console.ReadLine ();
}
}
Though this question meanwhile has an accepted answer, here's a short version with the help of regular expressions. Purists may not like it (understandably) but when you need a quick solution and you are handy with regexes, this can be it. Performance is rather good, surprisingly:
string [] split = Regex.Split(yourString, @"(?<=\G.{512})");
What it does? Negative look-backward and remembering the last position with \G
. It will also catch the last bit, even if it isn't dividable by 512.