I have a string like:
"super exemple of string key : text I want to keep - end of my string"
I want to just keep the string which is betw
If you want to handle multiple occurrences of substring pairs, it won't be easy without RegEx:
Regex.Matches(input ?? String.Empty, "(?=key : )(.*)(?<= - )", RegexOptions.Singleline);
input ?? String.Empty avoids an ArgumentNullException
?= keeps the 1st substring and ?<= keeps the 2nd substring
RegexOptions.Singleline allows newlines between the substring pair
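To make the multiple-occurrence case concrete, here is a minimal sketch of enumerating the matches. It uses a variation of the pattern, not the one above: a lookbehind and a lookahead keep both delimiters out of the match, and a lazy quantifier stops at the first " - " after each "key : ". The names BetweenAllDemo and FindAll are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class BetweenAllDemo
{
    // Hypothetical helper: returns every run of text found between "key : " and " - ".
    public static List<string> FindAll(string input)
    {
        var results = new List<string>();

        // The lookbehind/lookahead exclude the delimiters from the match;
        // the lazy .*? stops at the first " - " that follows each "key : ".
        var matches = Regex.Matches(
            input ?? string.Empty,
            "(?<=key : ).*?(?= - )",
            RegexOptions.Singleline);

        foreach (Match m in matches)
        {
            results.Add(m.Value);
        }
        return results;
    }

    public static void Main()
    {
        var found = FindAll("a key : one - b key : two - c");
        Console.WriteLine(string.Join("|", found)); // prints "one|two"
    }
}
```

A null input produces an empty list rather than an exception, because of the input ?? string.Empty guard.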
If order & occurrence count of substrings doesn't matter, this quick & dirty one may be an option:
var parts = input?.Split(new string[] { "key : ", " - " }, StringSplitOptions.None);
string result = parts?.Length >= 3 ? parts[1] : input;
At least it avoids most exceptions, by returning the original string if none/single substring match.
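A small sketch of how this behaves on a few inputs, assuming the middle element of the split is the one wanted. The wrapper name MiddleOrOriginal is hypothetical.

```csharp
using System;

public static class SplitDemo
{
    // Hypothetical wrapper around the split-based approach.
    public static string MiddleOrOriginal(string input)
    {
        var parts = input?.Split(new string[] { "key : ", " - " }, StringSplitOptions.None);

        // Fewer than 3 parts means at most one delimiter was found: fall back to the input.
        return parts?.Length >= 3 ? parts[1] : input;
    }

    public static void Main()
    {
        Console.WriteLine(MiddleOrOriginal(
            "super exemple of string key : text I want to keep - end of my string"));
        // prints "text I want to keep"

        Console.WriteLine(MiddleOrOriginal("no markers here"));
        // prints "no markers here"
    }
}
```

Because the order of the delimiters is not checked, an input such as "a - b key : c" returns "b ", which is the "quick & dirty" trade-off mentioned above.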
or, with a regex.
using System.Text.RegularExpressions;
...
var value =
    Regex.Match(
        "super exemple of string key : text I want to keep - end of my string",
        "key : (.*) - ")
    .Groups[1].Value;
with a running example.
You can decide if it's overkill.
As an under-validated extension method:
using System;
using System.Text.RegularExpressions;

public class Test
{
    public static void Main()
    {
        var value =
            "super exemple of string key : text I want to keep - end of my string"
                .Between(
                    "key : ",
                    " - ");

        Console.WriteLine(value);
    }
}

public static class Ext
{
    // Must be public (or internal) to be callable from Test.
    public static string Between(this string source, string left, string right)
    {
        return Regex.Match(
                source,
                string.Format("{0}(.*){1}", left, right))
            .Groups[1].Value;
    }
}
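One gap in the "under validated" version is that left and right are placed into the pattern as-is, so delimiters containing regex metacharacters such as "(" or "*" would change the pattern's meaning. A minimal sketch of one way to handle that, assuming the delimiters are meant as literal text, is to pass them through Regex.Escape. The names EscapedExt and BetweenLiteral are hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

public static class EscapedExt
{
    // Hypothetical variant of Between: both delimiters are treated as literal text.
    public static string BetweenLiteral(this string source, string left, string right)
    {
        var pattern = Regex.Escape(left) + "(.*)" + Regex.Escape(right);

        // Groups[1].Value is an empty string when there is no match.
        return Regex.Match(source, pattern).Groups[1].Value;
    }
}

public static class EscapedDemo
{
    public static void Main()
    {
        Console.WriteLine("total (12.50) due".BetweenLiteral("(", ")")); // prints "12.50"
    }
}
```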
When questions are stated in terms of a single example, ambiguities are inevitably present. This question is no exception.
For the example given in the question the desired string is clear:
super example of string key : text I want to keep - end of my string
                              ^^^^^^^^^^^^^^^^^^^
However, this string is but an example of strings and boundary strings for which certain substrings are to be identified. I will consider a generic string with generic boundary strings, represented as follows.
abc FF def PP ghi,PP jkl,FF mno PP pqr FF,stu FF vwx,PP yza
             ^^^^^^^^^^^^         ^^^^^
PP is the preceding string, FF is the following string and the party hats indicate which substrings are to be matched. (In the example given in the question "key : " is the preceding string and " - " is the following string.) I have assumed that PP and FF are preceded and followed by word boundaries (so that PPA and FF8 are not matched).
My assumptions, as reflected by the party hats, are as follows:
PP may be preceded by one (or more) FF substrings, which, if present, are disregarded;
if PP is followed by one or more PPs before FF is encountered, the following PPs are part of the substring between the preceding and following strings;
if PP is followed by one or more FFs before a PP is encountered, the first FF following PP is considered to be the following string.
Note that many of the answers here deal with only strings of the form
abc PP def FF ghi
      ^^^^^
or
abc PP def FF ghi PP jkl FF mno
      ^^^^^         ^^^^^
One may use a regular expression, code constructs, or a combination of the two to identify the substrings of interest. I make no judgement as to which approach is best. I will only present the following regular expression that will match the substrings of interest.
(?<=\bPP\b)(?:(?!\bFF\b).)*(?=\bFF\b)
Start your engine!1
I tested this with the PCRE (PHP) regex engine, but as the regex is not at all exotic, I am sure it will work with the .NET regex engine (which is very robust).
The regex engine performs the following operations:
(?<=           : begin a positive lookbehind
  \bPP\b       : match 'PP'
)              : end positive lookbehind
(?:            : begin a non-capture group
  (?!          : begin a negative lookahead
    \bFF\b     : match 'FF'
  )            : end negative lookahead
  .            : match any character
)              : end non-capture group
*              : execute non-capture group 0+ times
(?=            : begin positive lookahead
  \bFF\b       : match 'FF'
)              : end positive lookahead
This technique, of matching one character at a time, following the preceding string, until the character is F and is followed by F (or more generally, the character begins the string that constitutes the following string), is called the Tempered Greedy Token Solution.
Naturally, the regex would have to be modified (if possible) if the assumptions I set out above are changed.
1. Move the cursor around for detailed explanations.
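As a way to check the regular expression above against the .NET engine, here is a minimal sketch that runs it on the generic example string. The names TemperedDemo and Extract are hypothetical; the pattern is the one given above, unchanged.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class TemperedDemo
{
    // Hypothetical helper: collects every substring between PP and the next FF.
    public static List<string> Extract(string input)
    {
        var results = new List<string>();

        // Tempered greedy token: consume one character at a time,
        // but only while that character does not begin the word FF.
        const string pattern = @"(?<=\bPP\b)(?:(?!\bFF\b).)*(?=\bFF\b)";

        foreach (Match m in Regex.Matches(input, pattern))
        {
            results.Add(m.Value);
        }
        return results;
    }

    public static void Main()
    {
        var found = Extract("abc FF def PP ghi,PP jkl,FF mno PP pqr FF,stu FF vwx,PP yza");
        foreach (var s in found)
        {
            Console.WriteLine("[" + s + "]");
        }
        // prints "[ ghi,PP jkl,]" and then "[ pqr ]"
    }
}
```

The matches include the spaces next to the boundary strings, so a caller would likely want to Trim() each result.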
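// Assumes both markers are present in fullStr.
static string getStringBetween(string startStr, string endStr, string fullStr)
{
    // Index just past the start marker
    int startIndex = fullStr.IndexOf(startStr) + startStr.Length;
    // First end marker that follows the start marker
    int endIndex = fullStr.IndexOf(endStr, startIndex);
    return fullStr.Substring(startIndex, endIndex - startIndex);
}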
private string gettxtbettwen(string txt, string first, string last)
{
    // Position just past the first marker
    int pos1 = txt.IndexOf(first) + first.Length;
    // Everything after the first marker
    string remainder = txt.Substring(pos1);
    // Index of the last marker in the remainder equals the length of the text between
    int pos2 = remainder.IndexOf(last);
    return remainder.Substring(0, pos2);
}
Depending on how robust/flexible you want your implementation to be, this can actually be a bit tricky. Here's the implementation I use:
public static class StringExtensions {
    /// <summary>
    /// takes a substring between two anchor strings (or the end of the string if that anchor is null)
    /// </summary>
    /// <param name="this">a string</param>
    /// <param name="from">an optional string to search after</param>
    /// <param name="until">an optional string to search before</param>
    /// <param name="comparison">an optional comparison for the search</param>
    /// <returns>a substring based on the search</returns>
    public static string Substring(this string @this, string from = null, string until = null, StringComparison comparison = StringComparison.InvariantCulture)
    {
        var fromLength = (from ?? string.Empty).Length;
        var startIndex = !string.IsNullOrEmpty(from)
            ? @this.IndexOf(from, comparison) + fromLength
            : 0;

        if (startIndex < fromLength) { throw new ArgumentException("from: Failed to find an instance of the first anchor"); }

        var endIndex = !string.IsNullOrEmpty(until)
            ? @this.IndexOf(until, startIndex, comparison)
            : @this.Length;

        if (endIndex < 0) { throw new ArgumentException("until: Failed to find an instance of the last anchor"); }

        var subString = @this.Substring(startIndex, endIndex - startIndex);
        return subString;
    }
}
// usage:
var between = "a - to keep x more stuff".Substring(from: "-", until: "x");
// returns " to keep "