I'm having serious trouble with regex. I need to get all the text between two strings; in this case the strings are <
You can also do it with XML:
Dim s As String = "<span class=""user user-role-registered-member"">Keyboard</span>"
Dim doc As New System.Xml.XmlDocument
doc.LoadXml(s)
Console.WriteLine(doc.FirstChild.InnerText) ' Outputs: "Keyboard"
The reasons for not trying to parse HTML with regexes are given at "RegEx match open tags except XHTML self-contained tags".
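One caveat with the XML route is that `LoadXml` only accepts well-formed XML, and real HTML often is not. The following is a minimal sketch of guarding against that; the helper name `TryGetInnerText` and the sample inputs are assumptions for illustration, not part of the original answer.

```vb
Imports System
Imports System.Xml

Module XmlFallbackExample
    ' Hypothetical helper: returns the root element's text,
    ' or Nothing if the markup is not well-formed XML.
    Function TryGetInnerText(markup As String) As String
        Dim doc As New XmlDocument()
        Try
            doc.LoadXml(markup)
            Return doc.DocumentElement.InnerText
        Catch ex As XmlException
            Return Nothing
        End Try
    End Function

    Sub Main()
        ' Well-formed fragment: prints Keyboard
        Console.WriteLine(TryGetInnerText("<span class=""a"">Keyboard</span>"))
        ' An unclosed <br> is fine in HTML but is not well-formed XML: prints True
        Console.WriteLine(TryGetInnerText("<span>Keyboard<br></span>") Is Nothing)
    End Sub
End Module
```

If the pages you download are not guaranteed to be well-formed, a fallback like this at least fails cleanly instead of throwing.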
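Use a named capture group together with RegexOptions.ExplicitCapture. The following should do the job:
' Non-greedy .*? stops at the first </span> instead of the last one on the line.
Dim exp = "<span class=""user user-role-registered-member"">(?<GRP>.*?)</span>"
Dim M = System.Text.RegularExpressions.Regex.Match(YourInputString, exp, System.Text.RegularExpressions.RegexOptions.ExplicitCapture)
' Groups("GRP").Value is an empty string when the pattern did not match.
If M.Groups("GRP").Value <> "" Then
    Return M.Groups("GRP").Value
End If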
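Since `Return` only makes sense inside a function, here is a rough sketch of how the snippet might be wrapped up. The function name `ExtractMemberName` is hypothetical, and the lazy `.*?` quantifier and the `Match.Success` check are my own choices rather than something the answer prescribes.

```vb
Imports System
Imports System.Text.RegularExpressions

Module NamedGroupExample
    ' Hypothetical helper: returns the span's text,
    ' or Nothing when the pattern does not match.
    Function ExtractMemberName(input As String) As String
        Dim pattern = "<span class=""user user-role-registered-member"">(?<GRP>.*?)</span>"
        Dim m = Regex.Match(input, pattern, RegexOptions.ExplicitCapture)
        Return If(m.Success, m.Groups("GRP").Value, Nothing)
    End Function

    Sub Main()
        Dim html = "<span class=""user user-role-registered-member"">Keyboard</span>"
        Console.WriteLine(ExtractMemberName(html)) ' prints Keyboard
    End Sub
End Module
```

Checking `m.Success` lets the caller tell "no match" (Nothing) apart from an empty span (an empty string).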
Your text is XML, so why hack at strings with regex when you can do it in a readable and clear way?
With LINQ to XML:
Dim htmlPage As XDocument = XDocument.Parse(downloadedHtmlPage)
Dim className As String = "user user-role-registered-member"
' ?. guards against spans without a class attribute and against no matching span.
Dim value As String =
    htmlPage.Descendants("span").
        Where(Function(span) span.Attribute("class")?.Value = className).
        FirstOrDefault()?.Value
And with XML axis properties (see "Accessing XML in Visual Basic"):
Dim htmlPage As XDocument = XDocument.Parse(downloadedHtmlPage)
Dim className As String = "user user-role-registered-member"
' .@class already yields the attribute value as a String (Nothing if the attribute is absent).
Dim value As String =
    htmlPage...<span>.
        Where(Function(span) span.@class = className).
        FirstOrDefault()?.Value
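To try the LINQ to XML approach end to end, something like the sketch below may help. The sample markup assigned to `downloadedHtmlPage` is an assumption; a real downloaded page has to be well-formed XML for `XDocument.Parse` to succeed.

```vb
Imports System
Imports System.Linq
Imports System.Xml.Linq

Module LinqToXmlExample
    Sub Main()
        ' Assumed sample input: one span without a class, one with the class we want.
        Dim downloadedHtmlPage = "<div><span>no class</span>" &
            "<span class=""user user-role-registered-member"">Keyboard</span></div>"
        Dim htmlPage As XDocument = XDocument.Parse(downloadedHtmlPage)
        Dim className As String = "user user-role-registered-member"

        ' ?. skips spans that have no class attribute instead of throwing.
        Dim match As XElement =
            htmlPage.Descendants("span").
                FirstOrDefault(Function(span) span.Attribute("class")?.Value = className)

        ' Prints Keyboard for the sample input above.
        Console.WriteLine(If(match Is Nothing, "not found", match.Value))
    End Sub
End Module
```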
Thank you very much for the answers. I found the answer myself (Evil Tak gave me the idea).
Dim findtext As String = "(?<=<span class=""user user-role-registered-member"">)(.*?)(?=</span>)"
Dim myregex As String = "<span class=""user user-role-registered-member"">Keyboard</span>"
Dim doregex As MatchCollection = Regex.Matches(myregex, findtext)
MsgBox(doregex(0).ToString)
StackOverFlow is so powerful...♥
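For anyone adapting the lookaround pattern above, here is a small sketch that walks over every match instead of indexing `(0)` directly, which would throw if nothing matched. The two-span sample input is made up for illustration.

```vb
Imports System
Imports System.Text.RegularExpressions

Module LookaroundExample
    Sub Main()
        Dim pattern = "(?<=<span class=""user user-role-registered-member"">)(.*?)(?=</span>)"
        ' Assumed sample input: two spans instead of one.
        Dim html = "<span class=""user user-role-registered-member"">Keyboard</span>" &
                   "<span class=""user user-role-registered-member"">Mouse</span>"

        Dim matches = Regex.Matches(html, pattern)
        If matches.Count = 0 Then
            Console.WriteLine("no match")
        End If
        ' Prints Keyboard, then Mouse.
        For Each m As Match In matches
            Console.WriteLine(m.Value)
        Next
    End Sub
End Module
```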
This does the job easily and beautifully. It won't return a match when there is no text inside the span, so you do not need to worry about empty matches. It will however return groups with only whitespace in them.
<span class=""user user-role-registered-member"">(.+)</span>
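The behaviour described above can be checked with a short sketch like the one below. The sample inputs are assumptions. The last case shows one thing to keep in mind: because `.+` is greedy, two spans on the same line are captured as one long group.

```vb
Imports System
Imports System.Text.RegularExpressions

Module PlusQuantifierExample
    Sub Main()
        Dim openTag = "<span class=""user user-role-registered-member"">"
        Dim pattern = openTag & "(.+)</span>"

        ' Empty span: .+ needs at least one character, so this prints False.
        Console.WriteLine(Regex.IsMatch(openTag & "</span>", pattern))
        ' Whitespace-only span: a single space satisfies .+, so this prints True.
        Console.WriteLine(Regex.IsMatch(openTag & " </span>", pattern))
        ' Two spans on one line: greedy .+ runs to the last </span>,
        ' so the group contains everything from A through B.
        Dim two = openTag & "A</span>" & openTag & "B</span>"
        Console.WriteLine(Regex.Match(two, pattern).Groups(1).Value)
    End Sub
End Module
```

If several spans can appear on one line, switching to the lazy form `(.+?)` would stop each match at the nearest `</span>`.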