I am reading data files in text format using readLines. The first 'column' is complicated text that I do not need. The next columns contain data that I do need.
The following starts at the beginning of each string, matches everything up to and including the first colon plus any whitespace that follows it, and replaces that with nothing (essentially just removing it):
gsub("^[^:]+:\\s*", "", my.data2)
If you don't want to remove the spaces after the colon, you could do:
gsub("^[^:]+:", "", my.data2)
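To see the difference between the two calls, here is a small sketch. The contents of my.data2 are made up for illustration (the real lines would come from readLines), and the names trimmed and kept are hypothetical:

```r
# Hypothetical sample lines standing in for the output of readLines()
my.data2 <- c("sample A (run 1):   1.2 3.4 5.6",
              "sample B (run 2): 7.8 9.0 1.2")

# Remove the prefix, the colon, and any whitespace after the colon
trimmed <- gsub("^[^:]+:\\s*", "", my.data2)

# Remove the prefix and the colon only, keeping the whitespace
kept <- gsub("^[^:]+:", "", my.data2)

print(trimmed)
print(kept)
```

With these sample lines, trimmed starts directly at the first number, while kept still carries the leading spaces that followed the colon.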
Here is some clarification on what the original regular expression is doing, starting at the beginning:
^
this says to only find matches at the start of the string
[^:]
this represents any character that is not a colon
+
this says to match the preceding item (here the [^:] class) one or more times (so match as many non-colon characters as possible)
:
this is what actually matches the colon
\\s
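this matches a single whitespace character, such as a space or a tab (the backslash is doubled because it is written inside an R string)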
*
this says to match the preceding character zero or more times (so we remove any additional space after the colon)
So putting it all together: we start at the beginning of the string, match as many non-colon characters as possible, then grab the first colon and any whitespace after it, and replace all of that with nothing (essentially removing all of the junk we don't want).
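A few edge cases may help confirm that reading of the pattern. This is only a sketch on invented strings; edge.cases and result are hypothetical names, not part of the original data:

```r
# Made-up strings chosen to exercise the pattern's boundaries
edge.cases <- c("id 7: 10:30 42",   # a second colon inside the data
                "no colon here",    # no colon at all
                ": 5 6")            # colon with nothing in front of it

result <- gsub("^[^:]+:\\s*", "", edge.cases)
print(result)
```

Only the text up to the first colon is removed, because [^:] cannot step over a colon. A string with no colon is returned unchanged, and so is a string that starts with a colon, since + requires at least one non-colon character before it.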