Question
I'm trying to remove the content of all cells that start with a character that is not a number using KNIME (v3.2.1). I have different ideas but nothing works.
1) String Manipulation Node: regexReplace($column$,"^[^0-9].*","")
The cells contain multiple lines, however only the first line is removed by this approach.
2) String Manipulation Node: regexMatcher($casrn_new$,"^[^0-9].*")
followed by Rule Engine Node to remove all columns that are "TRUE".
The regexMatcher gives me "False" even for columns that should be "True" though.
3) String Replacer Node: I inserted the expression ^[^0-9].* into the Pattern column and selected "Replace whole String", but the regex is not recognised by that node, so nothing gets replaced.
Does anyone have a solution for any of those approaches or knows another Node that might do the job? Help is much appreciated!
Answer 1:
I would go with your first solution, since it already partly works; you just have to expand your regex to include newlines. I would try something like this:
regexReplace($column$,"^[^0-9].(.|\n)*","")
This should match any text starting with a character that is not a number, followed by any number of occurrences of any character or a newline. Depending on the line endings, you might need (.|\n|\r) instead of (.|\n).
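To see why the original pattern stops at the first line, here is a minimal sketch in plain Java, on the assumption that KNIME's regexReplace behaves like Java's String.replaceAll. The class name, the helper method, and the sample cell values are made up for illustration.

```java
class NewlineRegexDemo {
    // Pattern from the question: '.' does not match line terminators by default.
    static final String FIRST_LINE_ONLY = "^[^0-9].*";
    // Pattern from this answer, extended to cover \r as well.
    static final String WITH_NEWLINES = "^[^0-9].(.|\\n|\\r)*";

    // Hypothetical helper standing in for regexReplace(column, pattern, "").
    static String clear(String cell, String pattern) {
        return cell.replaceAll(pattern, "");
    }

    public static void main(String[] args) {
        String cell = "n/a\n50-00-0"; // made-up multi-line cell
        // Only "n/a" is removed; the newline and second line remain.
        System.out.println(clear(cell, FIRST_LINE_ONLY).equals("\n50-00-0"));
        // The whole cell is emptied.
        System.out.println(clear(cell, WITH_NEWLINES).isEmpty());
        // A cell starting with a digit is left untouched.
        System.out.println(clear("50-00-0\nn/a", WITH_NEWLINES));
    }
}
```

One caveat: in Java's regex engine an alternation repeated with * can become slow or hit stack limits on very long inputs, so this form is best suited to short cells.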
Answer 2:
You should use the following expression:
"(?s)^\D.*$"
So the dot will match even new lines. (Based on this: https://docs.oracle.com/javase/8/docs/api/java/util/regex/Pattern.html#DOTALL)
In case you need to only change the content of the cells that do not start with a number, I do not think you need to filter any columns or rows. (BTW in case you want to remove rows, there are the Rule-based Row Filter/Splitter nodes which also support regular expressions with the MATCHES predicate.)
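The effect of the (?s) flag can be checked outside KNIME with a small plain-Java sketch. It assumes the String Manipulation node hands the pattern to java.util.regex unchanged; the class name, method name, and sample values are illustrative only.

```java
import java.util.regex.Pattern;

class DotAllDemo {
    // Inline (?s) turns on DOTALL, so '.' also matches \n and \r.
    static final Pattern NON_NUMERIC_START = Pattern.compile("(?s)^\\D.*$");

    // Hypothetical helper: empties a cell that starts with a non-digit.
    static String clearIfNonNumeric(String cell) {
        return NON_NUMERIC_START.matcher(cell).replaceAll("");
    }

    public static void main(String[] args) {
        // Multi-line cell starting with a letter: emptied completely.
        System.out.println(clearIfNonNumeric("n/a\r\n50-00-0").isEmpty());
        // Cell starting with a digit: returned unchanged.
        System.out.println(clearIfNonNumeric("50-00-0\nn/a"));
    }
}
```

Passing Pattern.DOTALL as a second argument to Pattern.compile is equivalent to the inline (?s) prefix; the inline form is the one that fits into a single pattern string.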
Source: https://stackoverflow.com/questions/40003509/regexreplace-in-string-manipulation-knime