HttpClient problem with URLs which include curly braces


I am using HttpClient in my Android application. At some point, I have to fetch data from remote locations. Below is the snippet showing how I made use of HttpClient to get the res


2 Answers

失恋的感觉

2020-12-19 10:50
The strict answer is that you should never have curly braces in your URL.

A full description of valid URLs can be found in RFC 1738.

The pertinent part for this answer is as follows:

Unsafe:

Characters can be unsafe for a number of reasons. The space
character is unsafe because significant spaces may disappear and
insignificant spaces may be introduced when URLs are transcribed or
typeset or subjected to the treatment of word-processing programs.
The characters "<" and ">" are unsafe because they are used as the
delimiters around URLs in free text; the quote mark (""") is used to
delimit URLs in some systems. The character "#" is unsafe and should
always be encoded because it is used in World Wide Web and in other
systems to delimit a URL from a fragment/anchor identifier that might
follow it. The character "%" is unsafe because it is used for
encodings of other characters. Other characters are unsafe because
gateways and other transport agents are known to sometimes modify
such characters. These characters are "{", "}", "|", "\", "^", "~",
"[", "]", and "`".

All unsafe characters must always be encoded within a URL. For
example, the character "#" must be encoded within URLs even in
systems that do not normally deal with fragment or anchor
identifiers, so that if the URL is copied into another system that
does use them, it will not be necessary to change the URL encoding.

To get around the problem you have been experiencing, you must encode your URL.

The "host may not be null" error occurs when the entire URL is encoded, including the https://mydomain.com/ part, so the client can no longer identify the host. You only want to encode the last part of the URL, the path.
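
To make that failure mode concrete, here is a minimal plain-Java sketch (not Android-specific, and not the OP's actual code). It percent-encodes the whole URL with `URLEncoder` and then parses the result with `java.net.URI`. The class and method names are hypothetical, and it is only an assumption that an over-encoded string like this is what reached HttpClient.

```java
import java.io.UnsupportedEncodingException;
import java.net.URI;
import java.net.URISyntaxException;
import java.net.URLEncoder;

// Hypothetical demo class; the name is not from the original question.
class WholeUrlEncodingDemo {

    // Percent-encodes the ENTIRE URL, which is the mistake being illustrated.
    static String encodeEverything(String rawUrl) {
        try {
            return URLEncoder.encode(rawUrl, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);
        }
    }

    // Parses the over-encoded string and returns the host the parser sees.
    static String hostSeenByParser(String rawUrl) {
        try {
            return new URI(encodeEverything(rawUrl)).getHost();
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        String raw = "https://mydomain.com/abc/{5D/{B0blhahblah-blah}I1.jpg";
        System.out.println(encodeEverything(raw));
        System.out.println(hostSeenByParser(raw));
    }
}
```

Because "://" is turned into "%3A%2F%2F", the parser sees a relative path with no scheme and no host, so `getHost()` returns null.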

The solution is to use the Uri.Builder class to build your URI from its individual parts, which should encode the path in the process.

You will find a detailed description in the Android SDK Uri.Builder reference documentation.

Some trivial examples using your values are:
```
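// Start from the scheme and host only, so that part is never encoded.
Uri.Builder b = Uri.parse("https://mydomain.com").buildUpon();
// path() keeps "/" intact and encodes other characters as necessary.
b.path("/abc/{5D/{B0blhahblah-blah}I1.jpg");
Uri u = b.build();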
```
Or you can use chaining:
```
Uri u = Uri.parse("https://mydomain.com").buildUpon().path("/abc/{5D/{B0blhahblah-blah}I1.jpg").build();
```
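
Uri.Builder is Android-only. Outside Android, a similar result can be sketched with the multi-argument `java.net.URI` constructor, which quotes characters that are illegal in the path while leaving the scheme and host alone. This is an alternative shown for comparison, not part of the approach above, and the class and method names are hypothetical.

```java
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical demo class; the name is not from the original question.
class PathOnlyEncodingDemo {

    // Builds a URI from its parts; only illegal path characters get percent-encoded.
    static URI encodePathOnly(String scheme, String host, String path) {
        try {
            return new URI(scheme, host, path, null);
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }

    // Reports whether java.net.URI accepts the string without any encoding.
    static boolean parsesAsIs(String rawUrl) {
        try {
            new URI(rawUrl);
            return true;
        } catch (URISyntaxException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String path = "/abc/{5D/{B0blhahblah-blah}I1.jpg";
        System.out.println(parsesAsIs("https://mydomain.com" + path));
        System.out.println(encodePathOnly("https", "mydomain.com", path));
    }
}
```

The single-argument constructor rejects the raw string because of the braces, while the multi-argument form keeps the host intact and encodes "{" and "}" as %7B and %7D.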
刺人心

2020-12-19 11:01

Except that RFC 1738 has been obsolete for over a decade. It has been superseded by RFC 3986, and there is no indication in:

https://tools.ietf.org/html/rfc3986

that curly braces are unsafe (in fact, the RFC does not contain a single curly brace character anywhere). Furthermore, I've tried URIs that contain curly braces in browsers, and they work fine.

Also note that the OP is using a class called URI, which should definitely be following RFC 3986, at the very least, if not RFC 3987.

However, oddly, the IRI specification at:

https://tools.ietf.org/html/rfc3987

has this note:

Systems accepting IRIs MAY also deal with the printable characters in
US-ASCII that are not allowed in URIs, namely "<", ">", '"', space,
"{", "}", "|", "\", "^", and "`", in step 2 above. If these
characters are found but are not converted, then the conversion
SHOULD fail. Please note that the number sign ("#"), the percent
sign ("%"), and the square bracket characters ("[", "]") are not part
of the above list and MUST NOT be converted.

In other words, it looks like the RFCs themselves have some issues.
