I have an Office 2007 .XLSX
file containing over 5000 records like the below (a single cell with multiple rows of text). The issue: on an adjacent cell, put count
You can also include newlines in the Pattern expression by using \n. this way, you don't have to split the text in an array:
Private Function String_CountRegExp_Debug()
'Input of the test text
Dim TestText As String
TestText = "1/15/2013 1:30:11 AM Userx" & vbNewLine & _
"Had to reboot system" & vbNewLine & _
"1/15/2013 1:32:11 AM Userx" & vbNewLine & _
"System running finished rebooting and appears to be working" & vbNewLine & _
"11/15/2013 12:30:11 AM Userx" & vbNewLine & _
"System hung again"
'Input of the Pattern
Dim RE_Pattern As String
RE_Pattern = "(\d{1,2})\/(\d{1,2})\/(\d{4})\s(\d{1,2}):(\d{1,2}):(\d{1,2})\s([A,P]M).*\n"
Debug.Print String_CountRegExp(TestText, RE_Pattern)
End Function
Public Function String_CountRegExp(Text As String, Pattern As String) As Long
'Count the number of Pattern matches in a string.
'Set up regular expression object
Dim RE As New RegExp
RE.Pattern = Pattern
RE.Global = True
RE.IgnoreCase = True
RE.MultiLine = True
'Retrieve all matches
Dim Matches As MatchCollection
Set Matches = RE.Execute(Text)
'Return the corrected count of matches
String_CountRegExp = Matches.Count
End Function
Below is the VBA code of a function that takes in a string value and returns count of matches to the regular expression. I hope it is of use for someone.
Function CountOfDateValues(thetext)
Dim data() As String
Dim yourInput As String
yourInput = thetext
Dim TheSplitter As String
TheSplitter = Chr(10) 'the character that represents a line break
data = Split(yourInput, TheSplitter ) ' creates an array of strings for each line in the cell
Dim re
Set re = CreateObject("VBscript.regexp")
'regular expression that matches ##/##/#### ##:##:## ##
re.Pattern = "(?=\d)^(?:(?!(?:10\D(?:0?[5-9]|1[0-4])\D(?:1582))|(?:0?9\D(?:0?[3-9]|1[0-3])\D(?:1752)))((?:0?[13578]|1[02])|(?:0?[469]|11)(?!\/31)(?!-31)(?!\.31)|(?:0?2(?=.?(?:(?:29.(?!000[04]|(?:(?:1[^0-6]|[2468][^048]|[3579][^26])00))(?:(?:(?:\d\d)(?:[02468][048]|[13579][26])(?!\x20BC))|(?:00(?:42|3[0369]|2[147]|1[258]|09)\x20BC))))))|(?:0?2(?=.(?:(?:\d\D)|(?:[01]\d)|(?:2[0-8])))))([-.\/])(0?[1-9]|[12]\d|3[01])\2(?!0000)((?=(?:00(?:4[0-5]|[0-3]?\d)\x20BC)|(?:\d{4}(?!\x20BC)))\d{4}(?:\x20BC)?)(?:$|(?=\x20\d)\x20))?((?:(?:0?[1-9]|1[012])(?::[0-5]\d){0,2}(?:\x20[aApP][mM]))|(?:[01]\d|2[0-3])(?::[0-5]\d){1,2})?$"
re.Global = True
Dim t As String
Dim theCount As Integer
theCount = 0
For i = LBound(data) To UBound(data) 'from first item in array to last item in array
For Each Match In re.Execute(Left(data(i), InStrRev(data(i), ":") + 5))
'from start of string to 5 characters past the last ':' of string
theCount = theCount + 1
Next
Next
CountOfDateValues = theCount
End Function
Referencing urls:
MS Access 2003 VBA String Split on Line Break
http://sourceforge.net/projects/regexbuilder/files/regexbuilder/1.4.0/
This tool made testing my regular expression against various date formats remarkably easy.
http://regexlib.com/DisplayPatterns.aspx?cattabindex=4&categoryid=5&p=2
I was able to save a lot of time crafting a regular expression by using a precrafted one from here. Sadly did not learn much by do1ing so, but I believe I saved a lot of time on this 'we need it done now' request.
*Note: There is a window for a false positives if someone starts their worklog note with a timetamp, I noted this to the customer and they were fine with it.