Question
I'm trying to use regexMatcher from the String Manipulation node in KNIME, but it doesn't work. I'm writing regexMatcher($Document$,"/\w") to extract all sentences that contain /s, /p, w/p, or /200. However, even though my table contains such cases, nothing is retrieved. I would appreciate your help.
Answer 1:
I got the following:
|Document |isOK |other|strict|
|--------------|-----|-----|------|
|Some /p with q|True |False|False |
|/200 |True |True |False |
|/p |True |True |True |
|/s |True |True |True |
|w/p |True |False|False |
|no slash |False|False|False |
For the expressions:
- isOK:
regexMatcher($Document$, ".*?/\\w.*")
(I guess this is what you are after.)
- other:
regexMatcher($Document$, "/\\w.*")
- strict:
regexMatcher($Document$, "/\\w")
(Document contains no content after the last visible character.)
The problems you might run into are the escaping in the String Manipulation node and the semantics of regexMatcher.
The string literal there is just a Java String, so you have to escape the \ (and some other characters), so it becomes \\.
The semantics of regexMatcher is to match the whole String, so you have to add .*? (non-greedy match anything) before the value you are looking for and .* (greedy match anything) after the expression you are looking for.
(Obviously, if I misunderstood your question, the current semantics are probably already what you want.)
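To see both effects outside KNIME, here is a minimal plain-Java sketch. It assumes regexMatcher behaves like Java's whole-string Pattern.matches, which is consistent with the table above but is not taken from the KNIME source. The class and method names (RegexMatcherSketch, matchesWhole, containsMatch) are hypothetical and only serve the illustration.

```java
import java.util.regex.Pattern;

class RegexMatcherSketch {
    // In Java source, "\\w" is the two-character regex token \w.
    static final String IS_OK = ".*?/\\w.*";
    static final String OTHER = "/\\w.*";
    static final String STRICT = "/\\w";

    // Whole-string match: the pattern must cover the entire input.
    // Assumption: this mirrors what regexMatcher does.
    static boolean matchesWhole(String text, String regex) {
        return Pattern.matches(regex, text);
    }

    // Substring search, shown only for contrast with the whole-string match.
    static boolean containsMatch(String text, String regex) {
        return Pattern.compile(regex).matcher(text).find();
    }

    public static void main(String[] args) {
        String[] documents = {"Some /p with q", "/200", "/p", "/s", "w/p", "no slash"};
        for (String doc : documents) {
            System.out.printf("%-15s isOK=%-5b other=%-5b strict=%-5b find=%b%n",
                    doc,
                    matchesWhole(doc, IS_OK),
                    matchesWhole(doc, OTHER),
                    matchesWhole(doc, STRICT),
                    containsMatch(doc, STRICT));
        }
    }
}
```

The find column shows why the original pattern looked like it should work: as a substring search, /\w is found in every row that contains a slash followed by a word character, but as a whole-string match it only succeeds for /p and /s.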
BTW: in case you want to filter, you should check the Rule-based Row Filter node, as it offers an option to filter directly by regex. It uses different escaping rules (shown here for the isOK option):
$Document$ MATCHES ".*?/\w.*" => TRUE
(escaping is not allowed within quotes)
$Document$ MATCHES /.*?\/\\w.*/ => TRUE
(escaping is allowed within slashes: / and \ need to be escaped, but " does not)
Source: https://stackoverflow.com/questions/39857739/regexmatcher-in-string-manipulation-knime