regular expression - match word only once in line

别说谁变了你拦得住时间么 submitted on 2019-11-28 07:37:17

Question


Case:

  1. ehello goodbye hellot hello goodbye
  2. ehello goodbye hello hello goodbye

I want to match line 1 (it contains the word 'hello' only once) and I do NOT want to match line 2 (it contains 'hello' more than once).

I tried negative lookahead, lookbehind and so on, without any real success.


Answer 1:


A simple option is this (using the multiline flag and not dot-all):

^(?!.*\bhello\b.*\bhello\b).*\bhello\b.*$

First, check you don't have 'hello' twice, and then check you have it at least once.
There are other ways to check for the same thing, but I think this one is pretty simple.
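
As a rough illustration of the pattern above, here is a minimal Python sketch. The choice of Python, the variable names and the use of `re.MULTILINE` are assumptions for the example; the question itself names no language.

```python
import re

# Assumption: "hello" is the target word. re.MULTILINE makes ^ and $ anchor
# at every line; DOTALL is left off so "." never crosses a newline.
ONCE = re.compile(r"^(?!.*\bhello\b.*\bhello\b).*\bhello\b.*$", re.MULTILINE)

text = (
    "ehello goodbye hellot hello goodbye\n"
    "ehello goodbye hello hello goodbye"
)

# Collect every line that contains the whole word "hello" exactly once.
matched_lines = [m.group(0) for m in ONCE.finditer(text)]
print(matched_lines)
```

Only the first sample line should be collected: 'ehello' and 'hellot' do not count because `\b` requires a word boundary on both sides of 'hello'.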

Of course, you can simply match \bhello\b and count the number of matches...
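
The counting approach could look roughly like this in Python. This is only a sketch: `count_word` is a hypothetical helper name, not something from the original answer.

```python
import re

def count_word(line, word="hello"):
    """Return how many times `word` occurs in `line` as a whole word."""
    # re.escape guards against regex metacharacters in the word.
    pattern = r"\b" + re.escape(word) + r"\b"
    return len(re.findall(pattern, line))

line1 = "ehello goodbye hellot hello goodbye"
line2 = "ehello goodbye hello hello goodbye"

# A line qualifies when the count is exactly 1.
print(count_word(line1) == 1, count_word(line2) == 1)
```

This keeps the regex trivial and moves the "exactly once" logic into ordinary code, which is often easier to read and maintain.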




Answer 2:


A generic regex would be:

^(?:\b(\w+)\b\W*(?!.*?\b\1\b))*\z

Although it could be cleaner to invert the result of this match:

\b(\w+)\b(?=.*?\b\1\b)

This works by matching a word and capturing it, then using a lookahead with a backreference to check whether the same word does (second pattern) or does not (first pattern) appear again later in the string.
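
A small Python sketch of the inverted form follows, assuming a hypothetical helper named `all_words_unique`. Note that this generic pattern tests whether any word repeats, not just 'hello'; both sample lines in the question repeat 'goodbye', so both would be flagged as containing a duplicate.

```python
import re

# Matches a whole word that appears again later in the same string.
DUPLICATE = re.compile(r"\b(\w+)\b(?=.*?\b\1\b)")

def all_words_unique(line):
    """Return True when no whole word occurs more than once in `line`."""
    # Inverting the match: finding no duplicate means every word is unique.
    return DUPLICATE.search(line) is None

print(all_words_unique("one two three"))
print(all_words_unique("one two one"))

# Which word is reported first for the question's first sample line?
first_dup = DUPLICATE.search("ehello goodbye hellot hello goodbye")
print(first_dup.group(1))
```

To restrict the check to a single word, replace `(\w+)` with that literal word, which brings you back to the lookahead approach from the first answer.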




Answer 3:


Since you're only worried about words (i.e. tokens separated by whitespace), you can just split on whitespace and count how often "hello" appears. Since you didn't mention a language, here's an implementation in Perl:

use strict;
use warnings;

# The two sample lines from the question
my $a1 = "ehello goodbye hellot hello goodbye";
my $a2 = "ehello goodbye hello hello goodbye";

# Split each line into whitespace-separated tokens
my @arr1 = split(/\s+/, $a1);
my @arr2 = split(/\s+/, $a2);

# Count the tokens that are exactly "hello"
my $num_hello1 = scalar(grep { $_ eq "hello" } @arr1);
my $num_hello2 = scalar(grep { $_ eq "hello" } @arr2);

print "$num_hello1, $num_hello2\n";

The output is

1, 2


Source: https://stackoverflow.com/questions/8764861/regular-expression-match-word-only-once-in-line
