Is there a means to get the index of the first non-whitespace character in a string (or more generally, the index of the first character matching a condition) in C# without
I like to define my own extension method for returning the index of the first element that satisfies a custom predicate in a sequence.
    /// <summary>
    /// Returns the index of the first element in the sequence
    /// that satisfies a condition.
    /// </summary>
    /// <typeparam name="TSource">
    /// The type of the elements of <paramref name="source"/>.
    /// </typeparam>
    /// <param name="source">
    /// An <see cref="IEnumerable{T}"/> that contains
    /// the elements to apply the predicate to.
    /// </param>
    /// <param name="predicate">
    /// A function to test each element for a condition.
    /// </param>
    /// <returns>
    /// The zero-based index position of the first element of <paramref name="source"/>
    /// for which <paramref name="predicate"/> returns <see langword="true"/>;
    /// or -1 if <paramref name="source"/> is empty
    /// or no element satisfies the condition.
    /// </returns>
    public static int IndexOf<TSource>(this IEnumerable<TSource> source,
        Func<TSource, bool> predicate)
    {
        int i = 0;
        foreach (TSource element in source)
        {
            if (predicate(element))
                return i;
            i++;
        }
        return -1;
    }
You could then use this extension method to address your original problem:

    string str = " Hello World";
    int i = str.IndexOf<char>(c => !char.IsWhiteSpace(c));
A string is, of course, an IEnumerable<char>, so you can use LINQ:

    int offset = someString.TakeWhile(c => char.IsWhiteSpace(c)).Count();
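One caveat with the TakeWhile approach: when the string is empty or contains only whitespace, the count equals the string's length rather than -1. A minimal sketch of a wrapper that maps that case to -1 follows; the class name WhitespaceOffset and the method name FirstNonWhitespace are hypothetical, chosen only for illustration.

```csharp
using System;
using System.Linq;

public static class WhitespaceOffset
{
    // Hypothetical helper: returns the index of the first non-whitespace
    // character in text, or -1 if there is none.
    public static int FirstNonWhitespace(string text)
    {
        // Count the leading whitespace characters.
        int offset = text.TakeWhile(c => char.IsWhiteSpace(c)).Count();

        // If every character was whitespace (or the string is empty),
        // offset equals text.Length, so report "not found" instead.
        return offset < text.Length ? offset : -1;
    }
}
```

This keeps the one-line LINQ expression but gives it the same "not found" convention as string.IndexOf.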
    string s = " \t Test";
    Array.FindIndex(s.ToCharArray(), x => !char.IsWhiteSpace(x));

returns 3, the index of 'T' (or -1 if the string has no non-whitespace character).

To add a condition, just do:

    Array.FindIndex(s.ToCharArray(), x => !char.IsWhiteSpace(x) && your condition);
Something is going to be looping somewhere. For full control over what is and isn't whitespace, you could use LINQ to Objects to do your loop:

    int index = Array.FindIndex(
        s.ToCharArray(),
        x => !(new[] { '\t', '\r', '\n', ' ' }.Any(c => c == x)));
There are a lot of solutions here that convert the string to an array. That is not necessary: individual characters in a string can be accessed by index, just like items in an array.
This is my solution, which avoids the extra copy and should be very efficient:
    private static int FirstNonMatch(string s, Func<char, bool> predicate, int startPosition = 0)
    {
        for (var i = startPosition; i < s.Length; i++)
            if (!predicate(s[i])) return i;
        return -1;
    }

    private static int LastNonMatch(string s, Func<char, bool> predicate, int startPosition)
    {
        for (var i = startPosition; i >= 0; i--)
            if (!predicate(s[i])) return i;
        return -1;
    }
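And to use these, do the following (note that the backward search must start at Length - 1, the last valid index; starting at Length would throw an IndexOutOfRangeException):

    var x = FirstNonMatch(" asdf ", char.IsWhiteSpace);                     // 1
    var y = LastNonMatch(" asdf ", char.IsWhiteSpace, " asdf ".Length - 1); // 4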
    var match = Regex.Match(" \t test ", @"\S"); // \S matches any single non-whitespace character
    if (match.Success)
    {
        int index = match.Index;
        // do something with index
    }
    else
    {
        // there were no non-whitespace characters, handle appropriately
    }
If you'll be doing this often, for performance reasons you should cache the compiled Regex for this pattern, e.g.:

    static readonly Regex nonWhitespace = new Regex(@"\S");

Then use it like:

    nonWhitespace.Match(" \t test ");
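Putting the cached Regex and the Success check together, a small helper might look like the sketch below. The class name NonWhitespaceFinder and the method name IndexOfFirstNonWhitespace are hypothetical, and returning -1 for "no match" is an assumption made to mirror string.IndexOf.

```csharp
using System;
using System.Text.RegularExpressions;

public static class NonWhitespaceFinder
{
    // Constructed once and reused for every call.
    // Passing RegexOptions.Compiled as a second argument is optional and
    // trades a higher start-up cost for faster matching.
    private static readonly Regex NonWhitespace = new Regex(@"\S");

    // Hypothetical helper: index of the first non-whitespace character,
    // or -1 when the input contains none.
    public static int IndexOfFirstNonWhitespace(string text)
    {
        Match match = NonWhitespace.Match(text);
        return match.Success ? match.Index : -1;
    }
}
```

Usage would then be a single call such as NonWhitespaceFinder.IndexOfFirstNonWhitespace(" \t test ").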