Introduction/Question:
I have been studying the use of Regular Expressions (using VBA/Excel), and so far I cannot understand how I would isolate a <
You can explicitly include a white space in your RegEx pattern. The following pattern works just fine
strPattern = "^[0-9](\S)+ "
Since you are trying to find a correspondence with the \p{Zs}
Unicode category class, you might want to also handle all hard spaces. This code will be helpful:
strPattern = "^[0-9](\S)+[ " & ChrW(160) & "]"
Or,
strPattern = "^[0-9](\S+)[ \x0A]"
The [ \x0A]
character class will match either a regular space or a hard, non-breaking space.
If you need to match all kinds of spaces, you can use this regex pattern taken based on the information on https://www.cs.tut.fi/~jkorpela/chars/spaces.html:
strPattern = "^[0-9](\S)+[ \xA0\u1680\u180E\u2000-\u200B\u202F\u205F\u3000\uFEFF]"
This is the table with code point explanations:
U+0020 32 SPACE foo bar Depends on font, typically 1/4 em, often adjusted
U+00A0 160 NO-BREAK SPACE foo bar As a space, but often not adjusted
U+1680 5760 OGHAM SPACE MARK foo bar Unspecified; usually not really a space but a dash
U+180E 6158 MONGOLIAN VOWEL SEPARATOR foobar No width
U+2000 8192 EN QUAD foo bar 1 en (= 1/2 em)
U+2001 8193 EM QUAD foo bar 1 em (nominally, the height of the font)
U+2002 8194 EN SPACE foo bar 1 en (= 1/2 em)
U+2003 8195 EM SPACE foo bar 1 em
U+2004 8196 THREE-PER-EM SPACE foo bar 1/3 em
U+2005 8197 FOUR-PER-EM SPACE foo bar 1/4 em
U+2006 8198 SIX-PER-EM SPACE foo bar 1/6 em
U+2007 8199 FIGURE SPACE foo bar “Tabular width”, the width of digits
U+2008 8200 PUNCTUATION SPACE foo bar The width of a period “.”
U+2009 8201 THIN SPACE foo bar 1/5 em (or sometimes 1/6 em)
U+200A 8202 HAIR SPACE foo bar Narrower than THIN SPACE
U+200B 8203 ZERO WIDTH SPACE foobar Nominally no width, but may expand
U+202F 8239 NARROW NO-BREAK SPACE foo bar Narrower than NO-BREAK SPACE (or SPACE)
U+205F 8287 MEDIUM MATHEMATICAL SPACE foo bar 4/18 em
U+3000 12288 IDEOGRAPHIC SPACE foo bar The width of ideographic (CJK) characters.
U+FEFF 65279 ZERO WIDTH NO-BREAK SPACE
Best regards.
Just use a literal space character: strPattern = "^[0-9](\S)+ "
.