Question
Commons IO code:
String resultURL = String.format(GOOGLE_RECOGNIZER_URL, URLEncoder.encode("hello", "UTF-8"), "en-US");
URI uri = new URI(resultURL);
byte[] resultIO = IOUtils.toByteArray(uri);
I got this exception:
Exception in thread "main" java.io.IOException: Server returned HTTP response code: 403 for URL: http://translate.google.cn/translate_tts?ie=UTF-8&q=hello&tl=en-US&total=1&idx=0&textlen=3
at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1436)
at org.apache.commons.io.IOUtils.toByteArray(IOUtils.java:654)
at org.apache.commons.io.IOUtils.toByteArray(IOUtils.java:635)
at org.apache.commons.io.IOUtils.toByteArray(IOUtils.java:617)
at com.renren.intl.soundsns.simsimi.speech.ttsclient.impl.GoogleTTSClient.main(GoogleTTSClient.java:70)
But when I use HttpClient, the request succeeds:
String resultURL = String.format(GOOGLE_RECOGNIZER_URL, URLEncoder.encode(text, "UTF-8"), "en-US");
HttpClient client = new HttpClient();
GetMethod g = new GetMethod(resultURL);
client.executeMethod(g);
byte[] resultByte = g.getResponseBody();
Why does this happen?
Thanks in advance :)
Maven dependencies:
<dependency>
<groupId>commons-io</groupId>
<artifactId>commons-io</artifactId>
<version>2.4</version>
</dependency>
<dependency>
<groupId>commons-httpclient</groupId>
<artifactId>commons-httpclient</artifactId>
<version>3.1</version>
</dependency>
Answer 1:
Jon Skeet is right!
In my case, with java.net.URL the JVM sends the following headers:
User-Agent: Java/1.7.0_10
Host: translate.google.cn
Accept: text/html, image/gif, image/jpeg, *; q=.2, */*; q=.2
Connection: keep-alive
With Apache HttpClient:
User-Agent: Jakarta Commons-HttpClient/3.1
Host: translate.google.cn
If you change the user agent for java.net.URL:
System.setProperty("http.agent", "Jakarta Commons-HttpClient/3.1");
the request succeeds, with no HTTP 403.
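The http.agent system property applies to the whole JVM. If you would rather change the header for a single request, a minimal sketch using only the JDK could look like the following. The class name UserAgentFetch, its method names, and the "MyApp/1.0" value are made up for illustration, and I have not checked which user agents the server accepts today.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.URI;
import java.net.URLConnection;

// Hypothetical helper, not part of Commons IO or HttpClient.
class UserAgentFetch {

    // Creates the connection object (no network traffic yet) and sets an
    // explicit User-Agent, which is sent instead of the default "Java/<version>".
    static URLConnection openWithUserAgent(String url, String userAgent) throws IOException {
        URLConnection conn = URI.create(url).toURL().openConnection();
        conn.setRequestProperty("User-Agent", userAgent);
        return conn;
    }

    // Performs the request and reads the whole response body into a byte array.
    static byte[] fetch(String url, String userAgent) throws IOException {
        URLConnection conn = openWithUserAgent(url, userAgent);
        try (InputStream in = conn.getInputStream()) {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[8192];
            int n;
            while ((n = in.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
            return out.toByteArray();
        }
    }
}
```

Usage would be something like byte[] audio = UserAgentFetch.fetch(resultURL, "MyApp/1.0"). As far as I know, Commons IO 2.4 also has an IOUtils.toByteArray(URLConnection) overload, so you could pass it the connection returned by openWithUserAgent instead of reading the stream by hand.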
It looks like you get a 403 error if your user agent starts with Java. Any user agent matching the pattern Java.* gets a 403 error, but a user agent matching the pattern .+Java.* works fine.
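To make the two patterns concrete, here is a small sketch of the rule as I observed it. This is only a guess at the server's behavior based on the requests above, not its actual logic; the class UserAgentRule and the method looksBlocked are hypothetical names.

```java
// Hypothetical model of the observed behavior; the real server-side rule is unknown.
class UserAgentRule {

    // String.matches tests the whole string, so "Java.*" is true only for
    // user agents that begin with "Java".
    static boolean looksBlocked(String userAgent) {
        return userAgent.matches("Java.*");
    }

    public static void main(String[] args) {
        String[] agents = {
            "Java/1.7.0_10",                   // JVM default from the headers above
            "Jakarta Commons-HttpClient/3.1",  // HttpClient default
            "MyApp Java/1.7.0_10"              // made-up value matching .+Java.*
        };
        for (String agent : agents) {
            System.out.println(agent + " -> blocked? " + looksBlocked(agent));
        }
    }
}
```

Under this model only the first agent is flagged, because the other two do not start with Java.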
Source: https://stackoverflow.com/questions/14996140/commons-io-403-for-url-but-httpclient-is-ok