I am sending a file to a server as an octet-stream, and I need to specify the filename in the header:
String filename = "«úü¡»¿.doc"
URL url = new URL("ht
Actually, you can use non-ASCII characters in a header (see RFC 2616):
message-header = field-name ":" [ field-value ]
field-name = token
field-value = *( field-content | LWS )
field-content = <the OCTETs making up the field-value
and consisting of either *TEXT or combinations
of token, separators, and quoted-string>
TEXT = <any OCTET except CTLs,
but including LWS>
CTL = <any US-ASCII control character
(octets 0 - 31) and DEL (127)>
LWS = [CRLF] 1*( SP | HT )
CRLF = CR LF
CR = <US-ASCII CR, carriage return (13)>
LF = <US-ASCII LF, linefeed (10)>
SP = <US-ASCII SP, space (32)>
HT = <US-ASCII HT, horizontal-tab (9)>
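To make the TEXT rule above concrete, here is a minimal sketch that checks whether a sequence of octets contains any CTL. It assumes the value is sent as ISO-8859-1 octets; the class and method names are hypothetical, and line folding (CRLF followed by SP or HT) is ignored for simplicity.

```java
import java.nio.charset.StandardCharsets;

class Rfc2616Text {
    // True if no octet is a CTL (0-31 or 127), other than horizontal tab.
    // Folded lines (CRLF followed by SP/HT) are not handled in this sketch.
    static boolean isText(byte[] octets) {
        for (byte b : octets) {
            int o = b & 0xFF; // treat the byte as an unsigned octet
            boolean ctl = o <= 31 || o == 127;
            if (ctl && o != 9) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // The filename from the question, written with Unicode escapes.
        String filename = "\u00AB\u00FA\u00FC\u00A1\u00BB\u00BF.doc";
        byte[] latin1 = filename.getBytes(StandardCharsets.ISO_8859_1);
        System.out.println(isText(latin1)); // prints "true"
    }
}
```

Passing this check only means the octets fit the quoted grammar; it says nothing about how a given server or client library will decode them.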
You cannot use non-ASCII characters in HTTP headers; see RFC 2616. URIs are themselves standardized by RFC 2396 and don't permit non-ASCII characters either. The RFC says:
The URI syntax was designed with global transcribability as one of its main concerns. A URI is a sequence of characters from a very limited set, i.e. the letters of the basic Latin alphabet, digits, and a few special characters.
To use non-ASCII characters in a URI, you need to escape them using the %hexcode syntax (see section 2 of RFC 2396).
In Java you can do this using the java.net.URLEncoder class.
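As a rough sketch of that approach, the snippet below percent-encodes the filename from the question as UTF-8 so that the result contains only US-ASCII characters. The class and method names are hypothetical, and choosing UTF-8 is an assumption: the receiving server has to decode with the same charset. Note that URLEncoder produces application/x-www-form-urlencoded output, so it writes a space as "+"; the sketch rewrites that to "%20".

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

class FilenameEncoding {
    // Percent-encodes a filename as UTF-8; the result is pure US-ASCII.
    static String encode(String filename) {
        try {
            // URLEncoder emits "+" for a space; a literal "+" becomes "%2B",
            // so replacing "+" afterwards only affects encoded spaces.
            return URLEncoder.encode(filename, "UTF-8").replace("+", "%20");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always supported", e);
        }
    }

    // Reverses encode(); the receiver would do the equivalent on its side.
    static String decode(String encoded) {
        try {
            return URLDecoder.decode(encoded, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always supported", e);
        }
    }

    public static void main(String[] args) {
        // The filename from the question, written with Unicode escapes.
        String filename = "\u00AB\u00FA\u00FC\u00A1\u00BB\u00BF.doc";
        String encoded = encode(filename);
        System.out.println(encoded); // %C2%AB%C3%BA%C3%BC%C2%A1%C2%BB%C2%BF.doc
        System.out.println(decode(encoded).equals(filename)); // true
    }
}
```

The encoded string can then be placed in the header or URL as plain ASCII, provided the server is written to decode it.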
2020 edit: RFC 2616 has since been superseded by RFC 7230, and the relevant section on header syntax is now at https://tools.ietf.org/html/rfc7230#section-3.2
header-field = field-name ":" OWS field-value OWS
field-name = token
field-value = *( field-content / obs-fold )
field-content = field-vchar [ 1*( SP / HTAB ) field-vchar ]
field-vchar = VCHAR / obs-text
obs-fold = CRLF 1*( SP / HTAB )
; obsolete line folding
; see Section 3.2.4
VCHAR is defined in https://tools.ietf.org/html/rfc7230#section-1.2 as "any visible [USASCII] character", where the [USASCII] reference is:
[USASCII] American National Standards Institute, "Coded Character
Set -- 7-bit American Standard Code for Information
Interchange", ANSI X3.4, 1986.
The standards are still very clear: HTTP headers are still US-ASCII only.
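One way to apply this before sending a request is to verify that a header value contains only visible US-ASCII characters, spaces, and tabs. The sketch below does that; the class and method names are hypothetical, and it deliberately rejects obs-text, which the grammar above still lists as part of field-vchar.

```java
class HeaderValueCheck {
    // True if every character is a VCHAR (0x21-0x7E), a space, or a
    // horizontal tab. Anything above 0x7E (obs-text) is rejected here.
    static boolean isAsciiHeaderValue(String value) {
        for (int i = 0; i < value.length(); i++) {
            char c = value.charAt(i);
            boolean vchar = c >= 0x21 && c <= 0x7E;
            if (!vchar && c != ' ' && c != '\t') {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Raw filename from the question: contains non-ASCII characters.
        System.out.println(isAsciiHeaderValue("\u00AB\u00FA\u00FC\u00A1\u00BB\u00BF.doc")); // false
        // Percent-encoded form of the same filename: plain ASCII.
        System.out.println(isAsciiHeaderValue("%C2%AB%C3%BA%C3%BC%C2%A1%C2%BB%C2%BF.doc")); // true
    }
}
```

A value that fails this check would need to be escaped before it goes into the header.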
This might help: HTTP headers encoding/decoding in Java