Question
I am not very familiar with advanced matching patterns in regex.
I have some Google Search URLs that I need to clean up without holding the Backspace key for 5 seconds to remove unnecessary parameters from each URL.
Let's say I have this URL (there could be many different URLs following the same pattern):
https://www.google.com/search?source=hp&ei=Ne4pXpSIHIW_9QOD-rmADw&q=laravel+crud+generator&oq=laravel+crud+generator&gs_l=psy-ab.3..0l8.1294.6845..7289...1.0..0.307.3888.0j20j2j1......0....1..gws-wiz.....6..0i131j0i362i308i154i357.PwlZ_932pXo&ved=0ahUKEwjU9pz4tJrnAhWFX30KHQN9DvAQ4dUDCAU&uact=5
And I want to turn that into a nice, clean search URL like this:
https://www.google.com/search?q=laravel+crud+generator
How can I achieve that using Find/Replace with regex in any of the text editors mentioned in the question (Sublime Text, Geany, Notepad++, etc.)?
Answer 1:
I'm posting this so that others can use the solution.
In Notepad++, press CTRL+H, then select Regular expression near the bottom of the dialog.
In Find what, enter this pattern: .+&(q=[^&]+).+
In Replace with, enter: https://www.google.com/search?$1
Now press the Replace button for a single replacement, or press ALT+A (the Replace All button) to replace every match.
You can check the pattern on Regex101.
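For readers who would rather script the cleanup than use an editor dialog, the same find/replace can be sketched in Python. This is only an illustration under a few assumptions: Python's `re` module writes the group reference as `\1` where Notepad++ uses `$1`, and the second URL in the list is a made-up example, not one from the question.

```python
import re

# First URL is the sample from the question; the second is hypothetical.
urls = "\n".join([
    "https://www.google.com/search?source=hp&ei=Ne4pXpSIHIW_9QOD-rmADw"
    "&q=laravel+crud+generator&oq=laravel+crud+generator"
    "&gs_l=psy-ab.3..0l8.1294.6845..7289...1.0..0.307.3888.0j20j2j1"
    "......0....1..gws-wiz.....6..0i131j0i362i308i154i357.PwlZ_932pXo"
    "&ved=0ahUKEwjU9pz4tJrnAhWFX30KHQN9DvAQ4dUDCAU&uact=5",
    "https://www.google.com/search?source=hp&q=regex+tutorial&uact=5",
])

FIND = r".+&(q=[^&]+).+"
# \1 is Python's spelling of the $1 used in the Notepad++ dialog.
REPLACE = r"https://www.google.com/search?\1"

# "." does not match a newline by default, so each line is handled
# separately, much like Replace All over a file with one URL per line.
cleaned = re.sub(FIND, REPLACE, urls)
print(cleaned)
```

Each input line is matched in full and replaced by the fixed prefix plus the captured `q=...` parameter.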
Explanation:
1- .+& matches every character up to the & that is immediately followed by q=. In the sample URL this part covers https://www.google.com/search?source=hp&ei=Ne4pXpSIHIW_9QOD-rmADw&
2- (q=[^&]+) is our target: we want q= and everything after it up to the next &. So we look for a string that starts with q= followed by characters that are not &. [^&] means one character that is not &, and + means one or more of them. This part covers q=laravel+crud+generator. Note the parentheses.
3- .+ matches any remaining characters, here &oq=laravel+crud+generator&gs_l=psy-ab.3..0l8.1294.6845..7289...1.0..0.307.3888.0j20j2j1......0....1..gws-wiz.....6..0i131j0i362i308i154i357.PwlZ_932pXo&ved=0ahUKEwjU9pz4tJrnAhWFX30KHQN9DvAQ4dUDCAU&uact=5
Remember the () in step 2? That is a capture group. You can refer to groups in the replacement with $groupNumber, where groupNumber is the position of the pair of parentheses, counted from the left. Here we have just one pair of parentheses, so just one group, and we refer to it as $1.
Finally, the replacement is https://www.google.com/search?$1, so the whole match is replaced by the clean URL prefix followed by the contents of group one.
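To see the three parts of the pattern at work, here is a small Python sketch. Wrapping parts 1 and 3 in their own groups is an addition made purely for illustration (the answer's pattern only captures part 2), and the URL with q as the first parameter is a hypothetical input.

```python
import re

url = (
    "https://www.google.com/search?source=hp&ei=Ne4pXpSIHIW_9QOD-rmADw"
    "&q=laravel+crud+generator&oq=laravel+crud+generator"
    "&gs_l=psy-ab.3..0l8.1294.6845..7289...1.0..0.307.3888.0j20j2j1"
    "......0....1..gws-wiz.....6..0i131j0i362i308i154i357.PwlZ_932pXo"
    "&ved=0ahUKEwjU9pz4tJrnAhWFX30KHQN9DvAQ4dUDCAU&uact=5"
)

# Parts 1, 2 and 3 of the explanation, each in its own group here.
match = re.fullmatch(r"(.+&)(q=[^&]+)(.+)", url)
before, target, after = match.groups()
print("1:", before)
print("2:", target)
print("3:", after)

# The pattern needs an & directly before q= and at least one character
# after the value, so a URL that starts its parameters with q= is not matched.
q_first = "https://www.google.com/search?q=foo&source=hp"  # hypothetical
no_match = re.fullmatch(r".+&(q=[^&]+).+", q_first)
print(no_match)
```

The last lines show a limitation worth knowing: when q is the first parameter, this pattern finds nothing and the URL is left as it is.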
Answer 2:
Try replacing this pattern: (https://www.google.com/search\?).*(q=[^&]+).*
with $1$2
Explanation:
(https://www.google.com/search\?) = matches the beginning of your specified string. Notice the escaped ?, since it's a special character. Wrapped in parentheses, this becomes capture group #1 (accessible by $1).
.* = matches any characters, and may also match nothing. It clears out anything between the start of the string and your q parameter.
(q=[^&]+) = matches your q parameter up until the & symbol (indicating the next parameter). Wrapped in parentheses, this becomes capture group #2 (accessible by $2).
.* = matches any characters, and may also match nothing. This part clears out anything after your q parameter's value.
Replacement:
$1$2 = simply replaces your string with capture group 1 followed by capture group 2.
Tested in Notepad++ with the sample string in the question.
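The second pattern can be tried in Python in the same way. The sketch below makes some assumptions: `\1\2` stands in for `$1$2`, the URL whose oq differs from q is hypothetical, and `STRICT` is a suggested variant that is not part of the answer. It shows that the greedy .* lets group 2 pick up the q= inside oq=, which is harmless for the sample (both values are equal) but matters when they differ.

```python
import re

PATTERN = r"(https://www.google.com/search\?).*(q=[^&]+).*"
# Hypothetical variant: q= must come first or directly after an &.
STRICT = r"(https://www.google.com/search\?)(?:.*&)?(q=[^&]+).*"

sample = (
    "https://www.google.com/search?source=hp&ei=Ne4pXpSIHIW_9QOD-rmADw"
    "&q=laravel+crud+generator&oq=laravel+crud+generator"
    "&gs_l=psy-ab.3..0l8.1294.6845..7289...1.0..0.307.3888.0j20j2j1"
    "......0....1..gws-wiz.....6..0i131j0i362i308i154i357.PwlZ_932pXo"
    "&ved=0ahUKEwjU9pz4tJrnAhWFX30KHQN9DvAQ4dUDCAU&uact=5"
)
# Hypothetical URL in which the oq value differs from the q value.
differs = "https://www.google.com/search?source=hp&q=laravel+crud&oq=laravel&uact=5"

sample_clean = re.sub(PATTERN, r"\1\2", sample)
greedy_clean = re.sub(PATTERN, r"\1\2", differs)  # takes q= out of oq=
strict_clean = re.sub(STRICT, r"\1\2", differs)   # takes the real q=

print(sample_clean)
print(greedy_clean)
print(strict_clean)
```

If the q and oq values always agree in your URLs, the original pattern is enough; the stricter variant is only a precaution.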
Source: https://stackoverflow.com/questions/59885566/text-editorsublime-text-geany-notepad-etc-regex-to-remove-all-parameters