I\'m trying to write out to URLConnection#getOutputStream, however, no data is actually sent until I call URLConnection#getInputStream. Even if I set URLConnnection#doInput
Although the getInputStream() method can certainly cause a URLConnection object to initiate an HTTP request, it is not a requirement to do so.
Consider the actual workflow:
Step 1 includes the possibility of including data in the request, by way of an HTTP entity. It just so happens that the URLConnection class provides an OutputStream object as the mechanism for providing this data (and rightfully so for many reasons that aren't particularly relevant here). Suffice to say that the streaming nature of this mechanism provides the programmer an amount of flexibility when supplying the data, including the ability to close the output stream (and any input streams feeding it), before finishing the request.
In other words, step 1 allows for supplying a data entity for the request, then continuing to build it (such as by adding headers).
Step 2 is really a virtual step, and can be automated (like it is in the URLConnection class), since submitting a request is meaningless without a response (at least within the confines of the HTTP protocol).
Which brings us to Step 3. When processing an HTTP response, the response entity -- retrieved by calling getInputSteam() -- is just one of the things we might be interested in. A response consists of a status, headers, and optionally an entity. The first time any one of these is requested, the URLConnection will perform virtual step 2 and submit the request.
No matter if an entity is being sent via the connection's output stream or not, and no matter whether a response entity is expected back, a program will ALWAYS want to know the result (as provided by the HTTP status code). Calling getResponseCode() on the URLConnection provides this status, and switching on the result may end the HTTP conversation without ever calling getInputStream().
So, if data is being submitted, and a response entity is not expected, don't do this:
// request is now built, so...
InputStream ignored = urlConnection.getInputStream();
... do this:
// request is now built, so...
int result = urlConnection.getResponseCode();
// act based on this result
(Repost from your first question. Shameless self-plug) Don't fiddle around with URLConnection yourself, let Resty handle it.
Here's the code you would need to write (I assume you are getting text back):
import static us.monoid.web.Resty.*;
import us.monoid.web.Resty;
...
new Resty().text(TEST_URL, content("HELLO WORLD")).toString();
The API for URLConnection and HttpURLConnection are (for better or worse) designed for the user to follow a very specific sequence of events:
If your request is a POST or PUT, you need the optional step #2.
To the best of my knowledge, the OutputStream is not like a socket, it is not directly connected to an InputStream on the server. Instead, after you close or flush the stream, AND call getInputStream(), your output is built into a Request and sent. The semantics are based on the assumption that you will want to read the response. Every example that I've seen shows this order of events. I would certainly agree with you and others that this API is counterintuitive when compared to the normal stream I/O API.
The tutorial you link to states that "URLConnection is an HTTP-centric class". I interpret that to mean that the methods are designed around a Request-Response model, and make the assumption that is how they will be used.
For what it's worth, I found this bug report that explains the intended operation of the class better than the javadoc documentation. The evaluation of the report states "The only way to send out the request is by calling getInputStream."
As my experiments have shown (java 1.7.0_01) the code:
osw = new OutputStreamWriter(urlCon.getOutputStream());
osw.write("HELLO WORLD");
osw.flush();
Doesn't send anything to the server. It just saves what's written there to the memory buffer. Thus in case you're going to upload a large file via POST - you need to be sure that you have enough memory. On desktop/server it may not be such a big problem, but on android that may result in out of memory error. Here's the example of how the stack trace looks when trying to write to output stream, and memory runs out.
Exception in thread "Thread-488" java.lang.OutOfMemoryError: GC overhead limit exceeded
at java.util.Arrays.copyOf(Arrays.java:2271)
at java.io.ByteArrayOutputStream.grow(ByteArrayOutputStream.java:113)
at java.io.ByteArrayOutputStream.ensureCapacity(ByteArrayOutputStream.java:93)
at java.io.ByteArrayOutputStream.write(ByteArrayOutputStream.java:140)
at sun.net.www.http.PosterOutputStream.write(PosterOutputStream.java:78)
at sun.nio.cs.StreamEncoder.writeBytes(StreamEncoder.java:221)
at sun.nio.cs.StreamEncoder.implWrite(StreamEncoder.java:282)
at sun.nio.cs.StreamEncoder.write(StreamEncoder.java:125)
at sun.nio.cs.StreamEncoder.write(StreamEncoder.java:135)
at java.io.OutputStreamWriter.write(OutputStreamWriter.java:220)
at java.io.Writer.write(Writer.java:157)
at maxela.tables.weboperations.POSTRequest.makePOST(POSTRequest.java:138)
On the bottom of the trace you can see the makePOST()
method which does the following:
writer = new OutputStreamWriter(conn.getOutputStream());
for (int j = 0 ; j < 3000 * 100 ; j++)
{
writer.write("&var" + j + "=garbagegarbagegarbage_"+ j);
}
writer.flush();
And writer.write()
throws the exception.
Also my experiments have shown that any exception related to the actual connection/IO with the server is thrown only after urlCon.getOutputStream()
is called. Even urlCon.connect()
seems to be "dummy" method which doesn't do any physical connection.
However if you call urlCon.getContentLengthLong()
which returns Content-Length: header field from the server response-headers - then URLConnection.getOutputStream() will be called automatically and in case there's exception - it will be thrown.
The exceptions thrown by urlCon.getOutputStream()
are all IOException, and I have met the follwing ones:
try
{
urlCon.getOutputStream();
}
catch (UnknownServiceException ex)
{
System.out.println("UnkownServiceException():" + ex.getMessage());
}
catch (ConnectException ex)
{
System.out.println("ConnectException()");
Logger.getLogger(POSTRequest.class.getName()).log(Level.SEVERE, null, ex);
}
catch (IOException ex) {
System.out.println("IOException():" + ex.getMessage());
Logger.getLogger(POSTRequest.class.getName()).log(Level.SEVERE, null, ex);
}
Hopefully my little research helps to people, as URLConnection class is a bit counter-intuitive in some cases thus, when implementing it - one needs to know what's it deals with.
Second reason is: when working with servers - the work with server may fail because of many reasons (connection, dns, firewall, httpresponses, server not being able to accept connection, server not being able to process request timely). Thus it is important to understand how exceptions raised can explain about what's actually happening with the connection.
Calling getInputStream() signals that the client is finished sending it's request, and is ready to receive the response (per HTTP spec). It seems that the URLConnection class has this notion built into it, and must be flush()ing the output stream when the input stream is asked for.
As the other responder noted, you should be able to call flush() yourself to trigger the write.
The fundamental reason is that it has to compute a Content-length header automatically (unless you are using chunked or streaming mode). It can't do that until it has seen all the output, and it has to send it before the output, so it has to buffer the output. And it needs a decisive event to know when the last output has actually been written. So it uses getInputStream() for that. At that time it writes the headers including the content-length, then the output, then it starts reading the input.