/url?q=http://it.wikipedia.org/wiki/Spider-Man_(film)&sa=U&ei=iavVUKuFGsrNswbz74GQBA&ved=0CBYQFjAA&usg=AFQjCNEth5YspFPWp6CInyAfknlEvVgIfA
You need to use reluctant (lazy) matching to match only up to the first &. With greedy matching (i.e. using * instead of *?), your pattern will match as long a string as possible while still satisfying the complete pattern, so it runs up to the last & instead.
So use this:
\?q=(.*?)&
Or you can use a negated character class, [^&], which matches every character except &:
\?q=([^&]*)
Note that if you don't want (.*?) to match an empty string, you should use the + quantifier instead of *. It matches 1 or more occurrences.
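As a rough illustration of the difference, here is a minimal Python sketch that runs the greedy, lazy, and negated-class patterns against the URL from the question. The variable names and the second test string with an empty q value are made up for the example.

```python
import re

# The URL from the question, split across lines only for readability.
url = ("/url?q=http://it.wikipedia.org/wiki/Spider-Man_(film)"
       "&sa=U&ei=iavVUKuFGsrNswbz74GQBA&ved=0CBYQFjAA"
       "&usg=AFQjCNEth5YspFPWp6CInyAfknlEvVgIfA")

# Greedy: .* runs as far as it can, so the capture ends at the LAST &.
greedy = re.search(r"\?q=(.*)&", url).group(1)

# Lazy: .*? stops as soon as the rest of the pattern (&) can match.
lazy = re.search(r"\?q=(.*?)&", url).group(1)

# Negated class: [^&]* can never cross an &, so no laziness is needed.
negated = re.search(r"\?q=([^&]*)", url).group(1)

print(greedy)   # still contains &sa=U&ei=... up to the last &
print(lazy)     # http://it.wikipedia.org/wiki/Spider-Man_(film)
print(negated)  # http://it.wikipedia.org/wiki/Spider-Man_(film)

# With an empty q value (hypothetical input), * still matches, + does not.
empty_url = "/url?q=&sa=U"
star_match = re.search(r"\?q=([^&]*)", empty_url)
plus_match = re.search(r"\?q=([^&]+)", empty_url)
print(repr(star_match.group(1)))  # ''
print(plus_match)                 # None
```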
If you're in Python, then re.sub(r'(\/url\?q\=)|[&][\S]*','',url) should do the job. It strips the leading /url?q= and everything from the first & onward.
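A small sketch of that substitution approach, assuming the string holds nothing but the URL (the \S* part would stop at the first whitespace otherwise). The pattern here is a simplified equivalent of the one above, without the redundant escapes and group; the variable names are just for illustration.

```python
import re

url = ("/url?q=http://it.wikipedia.org/wiki/Spider-Man_(film)"
       "&sa=U&ei=iavVUKuFGsrNswbz74GQBA&ved=0CBYQFjAA"
       "&usg=AFQjCNEth5YspFPWp6CInyAfknlEvVgIfA")

# Remove the "/url?q=" prefix, and remove the first "&" together with
# every non-whitespace character that follows it.
cleaned = re.sub(r"/url\?q=|&\S*", "", url)

print(cleaned)  # http://it.wikipedia.org/wiki/Spider-Man_(film)
```

Unlike the capturing patterns, this deletes the unwanted parts rather than extracting the wanted one, so it returns the input unchanged when neither alternative matches.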
You just need to make the * operator lazy, which you do by adding a ? after it, so it becomes .*?
Lazy (or non-greedy) means the match stops at the first occurrence of whatever follows it in the pattern, instead of the last one.
Try:
\?q=([^&]+)
and capture the first group.
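In Python, capturing the first group could look roughly like the sketch below. The helper name extract_q and the constant Q_PARAM are hypothetical, chosen only for this example.

```python
import re
from typing import Optional

# Precompiled pattern: "?q=" followed by one or more non-& characters.
Q_PARAM = re.compile(r"\?q=([^&]+)")

def extract_q(url: str) -> Optional[str]:
    """Return the value of the q parameter, or None if it is missing or empty."""
    match = Q_PARAM.search(url)
    return match.group(1) if match else None

url = ("/url?q=http://it.wikipedia.org/wiki/Spider-Man_(film)"
       "&sa=U&ei=iavVUKuFGsrNswbz74GQBA&ved=0CBYQFjAA"
       "&usg=AFQjCNEth5YspFPWp6CInyAfknlEvVgIfA")

print(extract_q(url))             # http://it.wikipedia.org/wiki/Spider-Man_(film)
print(extract_q("/url?q=abc"))    # abc
print(extract_q("/url?q=&sa=U"))  # None
```

Because the negated class does not require a trailing &, this also works when q is the last parameter in the string, which a pattern ending in & would miss.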