I have an application which sends a POST request to the VB forum software and logs someone in (without setting cookies or anything).
Once the user is logged in I cre
You should encode only the user name or other part of the URL that could be invalid. URL encoding a URL can lead to problems since something like this:
string url = HttpUtility.UrlEncode("http://www.google.com/search?q=Example");
Will yield
http%3a%2f%2fwww.google.com%2fsearch%3fq%3dExample
This is obviously not going to work well. Instead, you should encode ONLY the value of the key/value pair in the query string, like this:
string url = "http://www.google.com/search?q=" + HttpUtility.UrlEncode("Example");
Hopefully that helps. Also, as teedyay mentioned, you'll still need to make sure illegal file-name characters are removed or else the file system won't like the path.
The .NET implementation of UrlEncode
does not comply with RFC 3986.
Some characters are not encoded but should be. The !()*
characters are listed in the RFC's section 2.2 as a reserved characters that must be encoded yet .NET fails to encode these characters.
Some characters are encoded but should not be. The .-_
characters are not listed in the RFC's section 2.2 as a reserved character that should not be encoded yet .NET erroneously encodes these characters.
The RFC specifies that to be consistent, implementations should use upper-case HEXDIG, where .NET produces lower-case HEXDIG.
In addition to @Dan Herbert's answer , You we should encode just the values generally.
Split has params parameter Split('&','='); expression firstly split by & then '=' so odd elements are all values to be encoded shown below.
public static void EncodeQueryString(ref string queryString)
{
var array=queryString.Split('&','=');
for (int i = 0; i < array.Length; i++) {
string part=array[i];
if(i%2==1)
{
part=System.Web.HttpUtility.UrlEncode(array[i]);
queryString=queryString.Replace(array[i],part);
}
}
}
Edit: Note that this answer is now out of date. See Siarhei Kuchuk's answer below for a better fix
UrlEncoding will do what you are suggesting here. With C#, you simply use HttpUtility
, as mentioned.
You can also Regex the illegal characters and then replace, but this gets far more complex, as you will have to have some form of state machine (switch ... case, for example) to replace with the correct characters. Since UrlEncode
does this up front, it is rather easy.
As for Linux versus windows, there are some characters that are acceptable in Linux that are not in Windows, but I would not worry about that, as the folder name can be returned by decoding the Url string, using UrlDecode
, so you can round trip the changes.
I think people here got sidetracked by the UrlEncode message. URLEncoding is not what you want -- you want to encode stuff that won't work as a filename on the target system.
Assuming that you want some generality -- feel free to find the illegal characters on several systems (MacOS, Windows, Linux and Unix), union them to form a set of characters to escape.
As for the escape, a HexEscape should be fine (Replacing the characters with %XX). Convert each character to UTF-8 bytes and encode everything >128 if you want to support systems that don't do unicode. But there are other ways, such as using back slashes "\" or HTML encoding """. You can create your own. All any system has to do is 'encode' the uncompatible character away. The above systems allow you to recreate the original name -- but something like replacing the bad chars with spaces works also.
On the same tangent as above, the only one to use is
Uri.EscapeDataString
-- It encodes everything that is needed for OAuth, it doesn't encode the things that OAuth forbids encoding, and encodes the space as %20 and not + (Also in the OATH Spec) See: RFC 3986. AFAIK, this is the latest URI spec.
If you can't see System.Web, change your project settings. The target framework should be ".NET Framework 4" instead of ".NET Framework 4 Client Profile"