问题
I'm writing a script that looks through my outgoing emails and searches for frequent stylistic errors I make. It locates them using regex and then highlights them yellow. Code:
Public Sub highlightBadForm()
Dim oWordDoc As Object
Dim oMatches As Object
Dim oRange As Range
Dim strText As String
Dim lngFindFrom As Long
Dim varMtch As Variant
Set oWordDoc = Application.ActiveInspector.WordEditor
strText = LCase(oWordDoc.Range.Text)
lngFindFrom = InStr(strText, "from: ")
If lngFindFrom > 0 Then
strText = Left(strText, lngFindFrom - 1)
End If
Set oMatches = extractMatches(strText, getBadStrs)
If Not oMatches Is Nothing Then
For Each varMtch In oMatches
Set oRange = oWordDoc.Range(varMtch.firstindex, varMtch.firstindex + varMtch.Length)
oRange.HighlightColorIndex = wdYellow
Next varMtch
End If
Set oRange = Nothing
Set oWordDoc = Nothing
Set oMatches = Nothing
End Sub
extractMatches
is a private function implementing VBA's RegEx engine. getBadStrs
returns the regular expression containing the errors.
It all works unless I've embedded hyperlinks in my email. If so, oWordDoc.Range.Text
returns only the anchor text of the links, not the links (and any other characters Word pads the hyperlinks with - I don't know what they might be). As a result, varMtch.firstindex
is correct for strText
but not oRange
so the text it highlights is offset by several characters.
I tried to assemble the full oRange
text by looping through the hyperlinks in oRange
and adding the link text to the string assuming it would be included in oRange
. Something like:
Dim lngEndLnk as Long
Set oRange = oWordDoc.Range
For Each varMtch In oRange.Hyperlinks
strText = strText & oWordDoc.Range(lngEndLnk, varMtch.Range.Start)
strText = strText & varMtch.TextToDisplay & varMtch.Name
lngEndLnk = varMtch.Range.End
Next varMtch
If lngEndLnk = 0 Then
strText = oRange.text
Else
strText = strText & oWordDoc.Range(lngEndLnk, oWordDoc.Range.End)
End If
That reduced the offset, but there still is one. Also, if I were to include a linked image in the email, the .Anchor
property of varMtch
fails so I'd have to come up with another workaround.
Is there a more straightforward way to get a String
containing all the characters of the Range
object so the regex indices line up?
回答1:
You can access the hyperlink address using the hyperlinks
collection of a document:
Private Sub CommandButton1_Click()
strtext = ActiveDocument.Range.Text
MsgBox (strtext)
For Each hLink In Documents(1).Hyperlinks
MsgBox (hLink.Address)
Next hLink
End Sub
This first displays all the text in a document, and then loops through each hyperlink displaying its URL.
This can then be used through your RegEx.
For more information and examples, see hyperlinks.
回答2:
I ended up with a similar solution to @slightly snarky. I don't know that it's better so I won't mark it as the solution. Happy for comments on pros and cons, in case there's a clear winner I'm just not seeing.
Personally, I like looping the character collection and probably should use it in my code, this works. I find using the position array to highlight matches much less intuitive than constructing a string from the range. For my purposes padding the string with #
in place of the zero-length characters in oWordDoc.Range
works, but I also know it won't work for everybody.
Public Sub highlightBadForm()
Dim oWordDoc As Object
Dim oMatches As Object
Dim oRange As Range
Dim strText As String
Dim lngFindFrom As Long, lngC As Long, lngPrevLen As Long
Dim varMtch As Variant
Set oWordDoc = Application.ActiveInspector.WordEditor
For lngC = 0 To oWordDoc.Range.End - 1
strText = strText & oWordDoc.Range(lngC, lngC + 1)
If Len(strText) = lngPrevLen Then
strText = strText & "#"
End If
lngPrevLen = lngPrevLen + 1
Next lngC
strText = LCase(strText)
lngFindFrom = InStr(strText, "from: ")
If lngFindFrom > 0 Then
strText = Left(strText, lngFindFrom - 1)
End If
Set oMatches = extractMatches(strText, getBadStrs)
If Not oMatches Is Nothing Then
For Each varMtch In oMatches
Set oRange = oWordDoc.Range(varMtch.FirstIndex, varMtch.FirstIndex + varMtch.Length)
oRange.HighlightColorIndex = wdYellow
Next varMtch
End If
Set oRange = Nothing
Set oWordDoc = Nothing
Set oMatches = Nothing
End Sub
回答3:
The key to this seems to be that when you iterate through a Range looking at each "position" in the range, e.g. via something like
With ActiveDocument.Range
For i = 0 to .End - 1
Debug.Print i,Ascw(.Range(i,i+1).Text)
Next
End With
The Range does contain all the characters in the code of a field such as HYPERLINK field, and all the characters in its result (which might be displayed or it might be hidden text). But in some cases a Range may contain additional characters which are never displayed. For example, if you have a field code such as { SET x 123 } then the Range contains what are in effect the field braces and the code " SET X 123 ", but before the field end brace it also contains a marker followed by the value "123". But the SET field does not display its result.
That makes it difficult to construct a "find" string that's the same length as the Range.
But Range.Text is the same text as the concatenation of all the characters in Range.Characters, and each Character in that Collection is a Range that contains the .Start position
So that lets us get the .Start and .End as the following example shows.
This assumes you are working with the ActiveDocument in Word, and have some text, a HYPERLINK field (say), and possibly other fields, with the text "test1" in various places.
I haven't done much testing so it may still need tweaking.
Sub HighlightFinds()
Dim match As VBScript_RegExp_55.match
Dim matches As VBScript_RegExp_55.MatchCollection
Dim rng1 As Word.Range
Dim rng2 As Word.Range
Set rng1 = ActiveDocument.Content
Set rng2 = ActiveDocument.Content ' or rng1.Duplicate
' When you do this, rng1.Text returns the text of the field *codes* but
' not the field *results*, and so does rng1.Characters
'rng1.TextRetrievalMode.IncludeFieldCodes = True
' when you do this, it returns the *results* but not the *codes*
rng1.TextRetrievalMode.IncludeFieldCodes = False
' You could do both, one after the other, to try to get all the matches
' You might also need to set .TextRetrievalMode.IncludeHiddenText
With New VBScript_RegExp_55.RegExp
.Pattern = "test1"
.Global = True
Set matches = .Execute(rng1.Text)
End With
For Each match In matches
rng2.SetRange rng1.Characters(match.FirstIndex + 1).Start, rng1.Characters(match.FirstIndex + 1 + match.Length).End
rng2.HighlightColorIndex = wdYellow
Next
Set matches = Nothing
Set rng2 = Nothing
Set rng1 = Nothing
End Sub
来源:https://stackoverflow.com/questions/62040233/get-full-range-text-to-string