I'm trying to validate a query string with regex. Note that I'm not trying to extract the values, only to validate the syntax. I'm doing this to practice regex, so I'd app
I made this.
function isValidURL(url) {
    // based off https://mathiasbynens.be/demo/url-regex. testing https://regex101.com/r/pyrDTK/2
    // JavaScript has no \x{...} escape and no S flag (both are PCRE syntax), so the non-ASCII range is written as \u00a1-\uffff and only the i flag is kept
    var pattern = /^(?:(?:https?|ftp):\/\/)(?:\S+(?::\S*)?@)?(?:(?!10(?:\.\d{1,3}){3})(?!127(?:\.\d{1,3}){3})(?!169\.254(?:\.\d{1,3}){2})(?!192\.168(?:\.\d{1,3}){2})(?!172\.(?:1[6-9]|2\d|3[0-1])(?:\.\d{1,3}){2})(?:[1-9]\d?|1\d\d|2[01]\d|22[0-3])(?:\.(?:1?\d{1,2}|2[0-4]\d|25[0-5])){2}(?:\.(?:[1-9]\d?|1\d\d|2[0-4]\d|25[0-4]))|(?:(?:[a-z\u00a1-\uffff0-9]+-?)*[a-z\u00a1-\uffff0-9]+)(?:\.(?:[a-z\u00a1-\uffff0-9]+-?)*[a-z\u00a1-\uffff0-9]+)*(?:\.(?:[a-z\u00a1-\uffff]{2,})))(?::\d{2,5})?(?:\/?)(?:(?:\?(?:(?!&|\?)(?:\S))+=(?:(?!&|\?)(?:\S))+)(?:&(?:(?!&|\?)(?:\S))+=(?:(?!&|\?)(?:\S))+)*)?$/i;
    return pattern.test(url);
}
Base: https://mathiasbynens.be/demo/url-regex
Testing: https://regex101.com/r/pyrDTK/4/
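To experiment with only the query-string part, the tail of the pattern above can be lifted out and anchored on its own. This is a rough sketch: the names `pairPart` and `queryTail` and the sample strings are mine, and unlike the full pattern it requires at least one pair.

```javascript
// One key or value as written in the pattern above:
// one or more non-space characters that are neither & nor ?
const pairPart = "(?:(?!&|\\?)(?:\\S))+";

// ?key=value followed by zero or more &key=value groups, anchored at both ends
const queryTail = new RegExp(
  "^\\?" + pairPart + "=" + pairPart + "(?:&" + pairPart + "=" + pairPart + ")*$"
);

for (const s of ["?a=1", "?a=1&b=2", "?a", "?a=1&", "?&abc=def", "?a==1"]) {
  console.log(s, queryTail.test(s));
}
// ?a=1 true, ?a=1&b=2 true, ?a false, ?a=1& false, ?&abc=def false, ?a==1 true
```

Note that `?a==1` passes, because `=` is itself a non-space character and so is allowed inside keys and values.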
When you need to validate a very complex URL, you may use this regex:
`^(https|ftp|http|ftps):\/\/([a-z\d_]+\.)?(([a-zA-Z\d_]+)(\.[a-zA-Z]{2,6}))(\/[a-zA-Z\d_\%\-=\+]+)*(\?)?([a-zA-Z\d=_\+\%\-&\{\}\:]+)?`
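One thing worth checking before relying on it: the pattern has a `^` but no `$`, so as far as I can tell it only validates a prefix of the input. A small sketch in JavaScript, where `urlPattern`, `anchoredPattern` and the sample strings are my own names and inputs:

```javascript
// The pattern exactly as given above, as a JavaScript regex literal
const urlPattern = /^(https|ftp|http|ftps):\/\/([a-z\d_]+\.)?(([a-zA-Z\d_]+)(\.[a-zA-Z]{2,6}))(\/[a-zA-Z\d_\%\-=\+]+)*(\?)?([a-zA-Z\d=_\+\%\-&\{\}\:]+)?/;

// Same pattern with an end anchor appended (my addition, not part of the answer)
const anchoredPattern = new RegExp(urlPattern.source + "$");

const good = "http://example.com/path?a=1";
const junk = "http://example.com/path?a=1 and some junk";

console.log(urlPattern.test(good), anchoredPattern.test(good)); // true true
console.log(urlPattern.test(junk), anchoredPattern.test(junk)); // true false
```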
This might not be a job for regexes, but for existing tools in your language of choice. Regexes are not a magic wand you wave at every problem that happens to involve strings. You probably want to use existing code that has already been written, tested, and debugged.
PHP: parse_url function.
Perl: URI module.
Ruby: URI module.
.NET: Uri class.
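Since the code in the question is JavaScript, the equivalent there would be the built-in `URL` class and its `searchParams`. A minimal sketch, assuming a browser or Node environment where `URL` is a global; `parseQuery` is a hypothetical helper name of mine:

```javascript
// Parse a URL with the built-in WHATWG URL class instead of a regex.
// Returns the query pairs as [key, value] arrays, or null if the input
// is not a parsable absolute URL.
function parseQuery(input) {
  let parsed;
  try {
    parsed = new URL(input); // throws TypeError on unparsable input
  } catch (err) {
    return null;
  }
  return Array.from(parsed.searchParams.entries());
}

console.log(parseQuery("https://example.com/path?a=1&b=two")); // [["a","1"],["b","two"]]
console.log(parseQuery("not a url")); // null
```

Keep in mind that this parses rather than strictly validates: the parser is lenient and will, for example, skip the empty pair in `?&abc=def` instead of rejecting it.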
This seems to be what you want:
^\?([\w-]+(=[\w-]*)?(&[\w-]+(=[\w-]*)?)*)?$
This treats each "pair" as a key followed by an optional value (which may be blank). It has a first pair, followed by zero or more repetitions of & and another pair, and the whole expression (except for the leading ?) is optional. Doing it this way prevents matching ?&abc=def
Also note that a hyphen doesn't need escaping when it is last in a character class, allowing a slight simplification.
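A quick way to see the behaviour described above is to run the pattern over a few inputs. This is only a sketch in JavaScript; the name `queryPattern` and the sample strings are mine.

```javascript
// Key with optional =value, further pairs separated by &, everything after ? optional
const queryPattern = /^\?([\w-]+(=[\w-]*)?(&[\w-]+(=[\w-]*)?)*)?$/;

const shouldMatch = ["?", "?a=1", "?a=1&b=2", "?a", "?a=&b"];
const shouldFail = ["?&abc=def", "a=1", "?a=1&", "?a==1"];

for (const s of shouldMatch) console.log(s, queryPattern.test(s)); // all true
for (const s of shouldFail) console.log(s, queryPattern.test(s));  // all false
```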
You seem to want to allow hyphens anywhere in keys or values. If keys need to be hyphen-free:
^\?(\w+(=[\w-]*)?(&\w+(=[\w-]*)?)*)?$
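The difference between the two variants only shows up on keys that contain a hyphen. A small comparison in JavaScript, with `looseKeys` and `strictKeys` as names I made up for the two patterns:

```javascript
const looseKeys = /^\?([\w-]+(=[\w-]*)?(&[\w-]+(=[\w-]*)?)*)?$/; // hyphens allowed in keys
const strictKeys = /^\?(\w+(=[\w-]*)?(&\w+(=[\w-]*)?)*)?$/;      // hyphens only in values

console.log(looseKeys.test("?my-key=1"), strictKeys.test("?my-key=1"));         // true false
console.log(looseKeys.test("?key=my-value"), strictKeys.test("?key=my-value")); // true true
```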
You can use this regex:
^\?([^=]+=[^=]+&)+[^=]+(=[^=]+)?$
What it does is:
NODE                     EXPLANATION
--------------------------------------------------------------------------------
  ^                        the beginning of the string
--------------------------------------------------------------------------------
  \?                       '?'
--------------------------------------------------------------------------------
  (                        group and capture to \1 (1 or more times
                           (matching the most amount possible)):
--------------------------------------------------------------------------------
    [^=]+                    any character except: '=' (1 or more
                             times (matching the most amount
                             possible))
--------------------------------------------------------------------------------
    =                        '='
--------------------------------------------------------------------------------
    [^=]+                    any character except: '=' (1 or more
                             times (matching the most amount
                             possible))
--------------------------------------------------------------------------------
    &                        '&'
--------------------------------------------------------------------------------
  )+                       end of \1 (NOTE: because you are using a
                           quantifier on this capture, only the LAST
                           repetition of the captured pattern will be
                           stored in \1)
--------------------------------------------------------------------------------
  [^=]+                    any character except: '=' (1 or more times
                           (matching the most amount possible))
--------------------------------------------------------------------------------
  (                        group and capture to \2 (optional
                           (matching the most amount possible)):
--------------------------------------------------------------------------------
    =                        '='
--------------------------------------------------------------------------------
    [^=]+                    any character except: '=' (1 or more
                             times (matching the most amount
                             possible))
--------------------------------------------------------------------------------
  )?                       end of \2 (NOTE: because you are using a
                           quantifier on this capture, only the LAST
                           repetition of the captured pattern will be
                           stored in \2)
--------------------------------------------------------------------------------
  $                        before an optional \n, and the end of the
                           string
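To see how the breakdown plays out in practice, here is the same pattern run against a few inputs in JavaScript. The name `eqPattern` and the samples are mine; the results follow from the `(...&)+` group needing at least one complete `key=value&` before the final pair, and from `[^=]` also matching `&`.

```javascript
const eqPattern = /^\?([^=]+=[^=]+&)+[^=]+(=[^=]+)?$/;

console.log(eqPattern.test("?a=1&b=2"));  // true
console.log(eqPattern.test("?a=1&b"));    // true  (last value is optional)
console.log(eqPattern.test("?a=1"));      // false (a single pair has no & for the first group)
console.log(eqPattern.test("?a=1&&b=2")); // true  ([^=] also matches &, so the doubled & slips through)
```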
I agree with Andy Lester, but a possible regex solution is
#^\?([\w-]+=[\w-]*(&[\w-]+=[\w-]*)*)?$#
which is very much like what you posted.
I haven't tested it and you didn't say what language you're using so it may need a little tweaking.
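For what it's worth, here is a quick test in JavaScript. It drops the `#` delimiters (JavaScript uses `/.../` instead) and assumes the inner group carries a trailing `*` so that any number of further pairs can follow. The name `pairsPattern` and the samples are mine.

```javascript
// Every key must be followed by =, the value may be empty
const pairsPattern = /^\?([\w-]+=[\w-]*(&[\w-]+=[\w-]*)*)?$/;

console.log(pairsPattern.test("?"));             // true
console.log(pairsPattern.test("?a=1"));          // true
console.log(pairsPattern.test("?a="));           // true
console.log(pairsPattern.test("?a=1&b=2&c=3"));  // true
console.log(pairsPattern.test("?a"));            // false (no = after the key)
console.log(pairsPattern.test("?&abc=def"));     // false
```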