What happens if there is a illegal character? Does the URL fix it self by encoding the illegal characters into something else?
As explained here
ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-._~:/?#[]@!$&'()*+,;=. Any other character needs to be encoded with the percent-encoding (%hh). Each part of the URI has further restrictions about what characters need to be represented by an percent-encoded word.
RFC 3986 defines which characters are allowed in which URI components.
RFCs for specific URI schemes might further restrict this.
If you are interested in HTTP/HTTPS URIs: they are defined in RFC 7230. AFAIK they don’t have further restrictions regarding allowed characters, so you could stick to the definitions in RFC 3986.
Depends on many factors … could be anything from "nothing happens" to "doesn’t work anymore".
Does the URL fix it self by encoding the illegal characters into something else?
A URI can’t fix itself, it’s just a string.
Clients working with this URI (browser, server, email client, etc.) may try to fix a URI (or work with invalid URIs) according to their own rules.
Also note that there’s a difference between a URI and linking to (or storing etc.) this URI in a document.
The host language (e.g., HTML) might have rules what to encode. This does not change the URI, only the way the URI is stored/specified in this document.
For example, the valid URI http://example.com/a&b
would have to be linked like this in HTML documents:
<a href="http://example.com/a&b">Link</a>
But the URI is still http://example.com/a&b
, not http://example.com/a&b
.