There seems to be a lot of confusion about the correct http status code to return if the user tries to access a page which requires the user to login.
So basically what
Your "TL;DR" doesn't match the "TL" version.
The proper response for requesting a resource that you need authorization to request, is 401.
302 is not the proper response, because, in fact, the resource is not available some place else. The original URL was correct, the client simply didn't have the rights. If you follow the redirect, you do not actually get what you're looking for. You get dropped in to some ad hoc workflow that has nothing to do with the resource.
403 is incorrect. 403 is the "can't get there from here" error. You simply can't see this, I don't care who you are. Some would argue 403 and 404 are similar. The difference is simply with 403, the server is saying "yea, I have it, but you can't", whereas 404 says "I know nothing about what you're talking about." Security wonks would argue that 404 is "safer". Why tell them something they don't need to know.
The problem you are encountering has nothing to do with REST or HTTP. Your problem is trying to set up some stateful relationship between the client and server, manifested in the end via some cookie. The whole resource -> 302 -> Login page is all about user experience using the hack that's known as the Web Browser, which happens to be both, in stock form, a lousy HTTP client and a lousy REST participant.
HTTP has an authorization mechanism. The Authorization header. The user experience around it, in a generic browser, is awful. So no one uses it.
So there is not proper HTTP response (well there is, 401, but don't/can't use that). There is not proper REST response, as REST typically relies on the underlying protocol (HTTP in this case, but we've tackled that already).
So. 302 -> 200 for the login page is all she wrote. That's what you get. If you weren't using the browser, or did everything via XHR or some other custom client, this wouldn't be an issue. You'd just use Authorization header, follow the HTTP protocol, and leverage a scheme like either DIGEST or what AWS uses, and be done. Then you can use the appropriate standards to answer questions like these.
Agreed. REST is just a style, not a strict protocol. Many public web services deviate from this style. You can build your service to return whatever you want. Just make sure your clients know how what return codes to expect.
Personally, I have always used 401 (unauthorized) to indicate an unauthenticated user has requested a resource that requires a login. I then require the client application to guide the user to the login.
I use 400 (bad request) in response to a logon attempt with invalid credentials.
HTTP 302 (moved) seems more appropriate for web applications where the client is a browser. Browsers typically follow the re-direct address in the response. This can be useful for guiding the user to a logon page.
I'm not talking about HTTP authentication here, so that's at least 1 status code we aren't going to use (401 Unauthorized).
Wrong. 401 is part of Hypertext Transfer Protocol (RFC 2616 Fielding, et al.), but not limited to HTTP authentication. Furthermore, it's the only status code indicating that the request requires user authentication.
302 & 200 codes could be used and is easier to implement in some scenarios, but not all. And if you want to obey the specs, 401 is the only correct answer there is.
And 403 is indeed the most wrong code to return. As you correctly stated...
Authorization will not help and the request SHOULD NOT be repeated.
So this is clearly not suitable to indicate that authorization is an option.
I would stick to the standard: 401 Unauthorized
-
UPDATE
To add a little more info, lifting the confusion related to...
The response MUST include a WWW-Authenticate header field (section 14.47) containing a challenge applicable to the requested resource.
If you think that's going to stop you from using a 401, you have to remember there's more:
"The field value consists of at least one challenge that indicates the authentication scheme(s) and parameters applicable to the Request-URI."
This "indicating the authentication scheme(s)" means you can opt-in for other auth-schemes!
The HTTP protocol (RFC 2616) defines a simple framework for access authentication schemes, but you don't HAVE to use THAT framework.
In other words: you're not bound to the usual WWW-Auth. You only just MUST indicate HOW your webapp does it's authorization and offer the according data in the header, that's all. According to the specs, using a 401, you can choose your own poison of authorization! And that's where your "webapp" can do what YOU want it to do when it comes to the 401 header and your authorization implementation.
Don't let the specs confuse you, thinking you HAVE to use the usual HTTP authentication scheme. You don't! The only thing the specs really enforce: you just HAVE/MUST identify your webapp's authentication scheme and pass on related parameters to enable the requesting party to start potential authorization attempts.
And if you're still unsure, I can put all this into a simple but understandable perspective: let's say you're going to invent a new authorization scheme tomorrow, then the specs allow you to use that too. If the specs would have restricted implementation of such newer authorization technology implementations, those specs would've been modified ages ago. The specs define standards, but they do not really limit the multitude of potential implementations.
If the user has not provided any credentials and your API requires them, return a 401 - Unauthorized
. That will challenge the client to do so. There's usually little debate about this particular scenario.
If the user has provided valid credentials but they are insufficient to access the requested resource (perhaps the credentials were for a freemium account but the requested resource is only for your paid users), you have a couple of options given the looseness of some of the HTTP code definitions:
403 - Forbidden
. This is more descriptive and is typically understood as, "the supplied credentials were valid but still were not enough to grant access"401 - Unauthorized
. If you're paranoid about security, you might not want to give the extra information back to the client as was returned in (1) above401
or 403
but with helpful information in the response body describing the reasons why access is being denied. Again, that information might be more than you would want to provide in case it helps attackers somewhat.Personally, I've always used #1 for the scenario where valid credentials have been passed but the account they're associated with doesn't have access to the requested resource.
Your Answer:
401 Unauthorized
especially if you do not care or will not be redirecting people to a login page
-or-
302 Found
to imply there was the resource but they need to provide credentials to be returned to it. Do this only if you will be using a redirect and make sure to provide appropriate information in the body of the response.
Other Suggestions:
401 Unauthorized
is generally used for resources the user does not have access to after handling authentication.
403 Forbidden
is a little obscure to me in honesty. I use it when I lock down resources from the file system level, and like your post said, "authorization does not help".
400 Bad Request
is inappropriate as needing to login does not represent malformed syntax.
As you point out, 403 Forbidden
is explicitly defined with the phrase "Authorization will not help", but it is worth noting that the authors were almost certainly referring here to HTTP authorization (which will indeed not help as your site uses a different authorization scheme). Indeed, given that the status code is a signal to the user agent rather than the user, such a code would be correct insofar as any authorization the agent attempts to provide will not assist any further with the required authorization process (c.f. 401 Unauthorized
).
However, if you take that definition of 403 Forbidden
literally and feel it is still inappropriate, perhaps 409 Conflict
might apply? As defined in RFC 2616 §10.4.10:
The request could not be completed due to a conflict with the current state of the resource. This code is only allowed in situations where it is expected that the user might be able to resolve the conflict and resubmit the request. The response body SHOULD include enough information for the user to recognize the source of the conflict. Ideally, the response entity would include enough information for the user or user agent to fix the problem; however, that might not be possible and is not required.
There is indeed a conflict with the current state of the resource: the resource is in a "locked" state and such conflict can only be "resolved" through the user providing their credentials and resubmitting the request. The body will include "enough information for the user to recognize the source of the conflict" (it will state that they are not logged-in) and indeed will also include "enough information for the user or user agent to fix the problem" (i.e. a login form).