Understanding REST: Verbs, error codes, and authentication

后端 未结 10 1809
你的背包
你的背包 2020-11-22 07:39

I am looking for a way to wrap APIs around default functions in my PHP-based web applications, databases and CMSs.

I have looked around and found several \"skeleton

相关标签:
10条回答
  • 2020-11-22 07:55

    REST Basics

    REST have an uniform interface constraint, which states that the REST client must rely on standards instead of application specific details of the actual REST service, so the REST client won't break by minor changes, and it will probably be reusable.

    So there is a contract between the REST client and the REST service. If you use HTTP as the underlying protocol, then the following standards are part of the contract:

    • HTTP 1.1
      • method definitions
      • status code definitions
      • cache control headers
      • accept and content-type headers
      • auth headers
    • IRI (utf8 URI)
    • body (pick one)
      • registered application specific MIME type, e.g. maze+xml
      • vendor specific MIME type, e.g. vnd.github+json
      • generic MIME type with
        • application specific RDF vocab, e.g. ld+json & hydra, schema.org
        • application specific profile, e.g. hal+json & profile link param (I guess)
    • hyperlinks
      • what should contain them (pick one)
        • sending in link headers
        • sending in a hypermedia response, e.g. html, atom+xml, hal+json, ld+json&hydra, etc...
      • semantics
        • use IANA link relations and probably custom link relations
        • use an application specific RDF vocab

    REST has a stateless constraint, which declares that the communication between the REST service and client must be stateless. This means that the REST service cannot maintain the client states, so you cannot have a server side session storage. You have to authenticate every single request. So for example HTTP basic auth (part of the HTTP standard) is okay, because it sends the username and password with every request.

    To answer you questions

    1. Yes, it can be.

      Just to mention, the clients do not care about the IRI structure, they care about the semantics, because they follow links having link relations or linked data (RDF) attributes.

      The only thing important about the IRIs, that a single IRI must identify only a single resource. It is allowed to a single resource, like an user, to have many different IRIs.

      It is pretty simple why we use nice IRIs like /users/123/password; it is much easier to write the routing logic on the server when you understand the IRI simply by reading it.

    2. You have more verbs, like PUT, PATCH, OPTIONS, and even more, but you don't need more of them... Instead of adding new verbs you have to learn how to add new resources.

      activate_login -> PUT /login/active true deactivate_login -> PUT /login/active false change_password -> PUT /user/xy/password "newpass" add_credit -> POST /credit/raise {details: {}}

      (The login does not make sense from REST perspective, because of the stateless constraint.)

    3. Your users do not care about why the problem exist. They want to know only if there is success or error, and probably an error message which they can understand, for example: "Sorry, but we weren't able to save your post.", etc...

      The HTTP status headers are your standard headers. Everything else should be in the body I think. A single header is not enough to describe for example detailed multilingual error messages.

    4. The stateless constraint (along with the cache and layered system constraints) ensures that the service scales well. You surely don't wan't to maintain millions of sessions on the server, when you can do the same on the clients...

      The 3rd party client gets an access token if the user grants access to it using the main client. After that the 3rd party client sends the access token with every request. There are more complicated solutions, for example you can sign every single request, etc. For further details check the OAuth manual.

    Related literature

    • Architectural Styles and the Design of Network-based Software Architectures
      Dissertation of Roy Thomas Fielding (author of REST)
      2000, University of California, Irvine
    • Third Generation Web APIs - Bridging the Gap between REST and Linked Data
      Dissertation of Markus Lanthaler (co-author of JSON-LD and author of Hydra)
      2014, Graz University of Technology, Austria
    0 讨论(0)
  • 2020-11-22 07:56

    I noticed this question a couple of days late, but I feel that I can add some insight. I hope this can be helpful towards your RESTful venture.


    Point 1: Am I understanding it right?

    You understood right. That is a correct representation of a RESTful architecture. You may find the following matrix from Wikipedia very helpful in defining your nouns and verbs:


    When dealing with a Collection URI like: http://example.com/resources/

    • GET: List the members of the collection, complete with their member URIs for further navigation. For example, list all the cars for sale.

    • PUT: Meaning defined as "replace the entire collection with another collection".

    • POST: Create a new entry in the collection where the ID is assigned automatically by the collection. The ID created is usually included as part of the data returned by this operation.

    • DELETE: Meaning defined as "delete the entire collection".


    When dealing with a Member URI like: http://example.com/resources/7HOU57Y

    • GET: Retrieve a representation of the addressed member of the collection expressed in an appropriate MIME type.

    • PUT: Update the addressed member of the collection or create it with the specified ID.

    • POST: Treats the addressed member as a collection in its own right and creates a new subordinate of it.

    • DELETE: Delete the addressed member of the collection.


    Point 2: I need more verbs

    In general, when you think you need more verbs, it may actually mean that your resources need to be re-identified. Remember that in REST you are always acting on a resource, or on a collection of resources. What you choose as the resource is quite important for your API definition.

    Activate/Deactivate Login: If you are creating a new session, then you may want to consider "the session" as the resource. To create a new session, use POST to http://example.com/sessions/ with the credentials in the body. To expire it use PUT or a DELETE (maybe depending on whether you intend to keep a session history) to http://example.com/sessions/SESSION_ID.

    Change Password: This time the resource is "the user". You would need a PUT to http://example.com/users/USER_ID with the old and new passwords in the body. You are acting on "the user" resource, and a change password is simply an update request. It's quite similar to the UPDATE statement in a relational database.

    My instinct would be to do a GET call to a URL like /api/users/1/activate_login

    This goes against a very core REST principle: The correct usage of HTTP verbs. Any GET request should never leave any side effect.

    For example, a GET request should never create a session on the database, return a cookie with a new Session ID, or leave any residue on the server. The GET verb is like the SELECT statement in a database engine. Remember that the response to any request with the GET verb should be cache-able when requested with the same parameters, just like when you request a static web page.


    Point 3: How to return error messages and codes

    Consider the 4xx or 5xx HTTP status codes as error categories. You can elaborate the error in the body.

    Failed to Connect to Database: / Incorrect Database Login: In general you should use a 500 error for these types of errors. This is a server-side error. The client did nothing wrong. 500 errors are normally considered "retryable". i.e. the client can retry the same exact request, and expect it to succeed once the server's troubles are resolved. Specify the details in the body, so that the client will be able to provide some context to us humans.

    The other category of errors would be the 4xx family, which in general indicate that the client did something wrong. In particular, this category of errors normally indicate to the client that there is no need to retry the request as it is, because it will continue to fail permanently. i.e. the client needs to change something before retrying this request. For example, "Resource not found" (HTTP 404) or "Malformed Request" (HTTP 400) errors would fall in this category.


    Point 4: How to do authentication

    As pointed out in point 1, instead of authenticating a user, you may want to think about creating a session. You will be returned a new "Session ID", along with the appropriate HTTP status code (200: Access Granted or 403: Access Denied).

    You will then be asking your RESTful server: "Can you GET me the resource for this Session ID?".

    There is no authenticated mode - REST is stateless: You create a session, you ask the server to give you resources using this Session ID as a parameter, and on logout you drop or expire the session.

    0 讨论(0)
  • 2020-11-22 08:04

    Verbose, but copied from the HTTP 1.1 method specification at http://www.w3.org/Protocols/rfc2616/rfc2616-sec9.html

    9.3 GET

    The GET method means retrieve whatever information (in the form of an entity) is identified by the Request-URI. If the Request-URI refers to a data-producing process, it is the produced data which shall be returned as the entity in the response and not the source text of the process, unless that text happens to be the output of the process.

    The semantics of the GET method change to a "conditional GET" if the request message includes an If-Modified-Since, If-Unmodified-Since, If-Match, If-None-Match, or If-Range header field. A conditional GET method requests that the entity be transferred only under the circumstances described by the conditional header field(s). The conditional GET method is intended to reduce unnecessary network usage by allowing cached entities to be refreshed without requiring multiple requests or transferring data already held by the client.

    The semantics of the GET method change to a "partial GET" if the request message includes a Range header field. A partial GET requests that only part of the entity be transferred, as described in section 14.35. The partial GET method is intended to reduce unnecessary network usage by allowing partially-retrieved entities to be completed without transferring data already held by the client.

    The response to a GET request is cacheable if and only if it meets the requirements for HTTP caching described in section 13.

    See section 15.1.3 for security considerations when used for forms.

    9.5 POST

    The POST method is used to request that the origin server accept the entity enclosed in the request as a new subordinate of the resource identified by the Request-URI in the Request-Line. POST is designed to allow a uniform method to cover the following functions:

      - Annotation of existing resources;
      - Posting a message to a bulletin board, newsgroup, mailing list,
        or similar group of articles;
      - Providing a block of data, such as the result of submitting a
        form, to a data-handling process;
      - Extending a database through an append operation.
    

    The actual function performed by the POST method is determined by the server and is usually dependent on the Request-URI. The posted entity is subordinate to that URI in the same way that a file is subordinate to a directory containing it, a news article is subordinate to a newsgroup to which it is posted, or a record is subordinate to a database.

    The action performed by the POST method might not result in a resource that can be identified by a URI. In this case, either 200 (OK) or 204 (No Content) is the appropriate response status, depending on whether or not the response includes an entity that describes the result.

    If a resource has been created on the origin server, the response SHOULD be 201 (Created) and contain an entity which describes the status of the request and refers to the new resource, and a Location header (see section 14.30).

    Responses to this method are not cacheable, unless the response includes appropriate Cache-Control or Expires header fields. However, the 303 (See Other) response can be used to direct the user agent to retrieve a cacheable resource.

    POST requests MUST obey the message transmission requirements set out in section 8.2.

    See section 15.1.3 for security considerations.

    9.6 PUT

    The PUT method requests that the enclosed entity be stored under the supplied Request-URI. If the Request-URI refers to an already existing resource, the enclosed entity SHOULD be considered as a modified version of the one residing on the origin server. If the Request-URI does not point to an existing resource, and that URI is capable of being defined as a new resource by the requesting user agent, the origin server can create the resource with that URI. If a new resource is created, the origin server MUST inform the user agent via the 201 (Created) response. If an existing resource is modified, either the 200 (OK) or 204 (No Content) response codes SHOULD be sent to indicate successful completion of the request. If the resource could not be created or modified with the Request-URI, an appropriate error response SHOULD be given that reflects the nature of the problem. The recipient of the entity MUST NOT ignore any Content-* (e.g. Content-Range) headers that it does not understand or implement and MUST return a 501 (Not Implemented) response in such cases.

    If the request passes through a cache and the Request-URI identifies one or more currently cached entities, those entries SHOULD be treated as stale. Responses to this method are not cacheable.

    The fundamental difference between the POST and PUT requests is reflected in the different meaning of the Request-URI. The URI in a POST request identifies the resource that will handle the enclosed entity. That resource might be a data-accepting process, a gateway to some other protocol, or a separate entity that accepts annotations. In contrast, the URI in a PUT request identifies the entity enclosed with the request -- the user agent knows what URI is intended and the server MUST NOT attempt to apply the request to some other resource. If the server desires that the request be applied to a different URI,

    it MUST send a 301 (Moved Permanently) response; the user agent MAY then make its own decision regarding whether or not to redirect the request.

    A single resource MAY be identified by many different URIs. For example, an article might have a URI for identifying "the current version" which is separate from the URI identifying each particular version. In this case, a PUT request on a general URI might result in several other URIs being defined by the origin server.

    HTTP/1.1 does not define how a PUT method affects the state of an origin server.

    PUT requests MUST obey the message transmission requirements set out in section 8.2.

    Unless otherwise specified for a particular entity-header, the entity-headers in the PUT request SHOULD be applied to the resource created or modified by the PUT.

    9.7 DELETE

    The DELETE method requests that the origin server delete the resource identified by the Request-URI. This method MAY be overridden by human intervention (or other means) on the origin server. The client cannot be guaranteed that the operation has been carried out, even if the status code returned from the origin server indicates that the action has been completed successfully. However, the server SHOULD NOT indicate success unless, at the time the response is given, it intends to delete the resource or move it to an inaccessible location.

    A successful response SHOULD be 200 (OK) if the response includes an entity describing the status, 202 (Accepted) if the action has not yet been enacted, or 204 (No Content) if the action has been enacted but the response does not include an entity.

    If the request passes through a cache and the Request-URI identifies one or more currently cached entities, those entries SHOULD be treated as stale. Responses to this method are not cacheable.

    0 讨论(0)
  • 2020-11-22 08:06

    About REST return codes: it is wrong to mix HTTP protocol codes and REST results.

    However, I saw many implementations mixing them, and many developers may not agree with me.

    HTTP return codes are related to the HTTP Request itself. A REST call is done using a Hypertext Transfer Protocol request and it works at a lower level than invoked REST method itself. REST is a concept/approach, and its output is a business/logical result, while HTTP result code is a transport one.

    For example, returning "404 Not found" when you call /users/ is confuse, because it may mean:

    • URI is wrong (HTTP)
    • No users are found (REST)

    "403 Forbidden/Access Denied" may mean:

    • Special permission needed. Browsers can handle it by asking the user/password. (HTTP)
    • Wrong access permissions configured on the server. (HTTP)
    • You need to be authenticated (REST)

    And the list may continue with '500 Server error" (an Apache/Nginx HTTP thrown error or a business constraint error in REST) or other HTTP errors etc...

    From the code, it's hard to understand what was the failure reason, a HTTP (transport) failure or a REST (logical) failure.

    If the HTTP request physically was performed successfully it should always return 200 code, regardless is the record(s) found or not. Because URI resource is found and was handled by the http server. Yes, it may return an empty set. Is it possible to receive an empty web-page with 200 as http result, right?

    Instead of this you may return 200 HTTP code and simply a JSON with an empty array/object, or to use a bool result/success flag to inform about the performed operation status.

    Also, some internet providers may intercept your requests and return you a 404 http code. This does not means that your data are not found, but it's something wrong at transport level.

    From Wiki:

    In July 2004, the UK telecom provider BT Group deployed the Cleanfeed content blocking system, which returns a 404 error to any request for content identified as potentially illegal by the Internet Watch Foundation. Other ISPs return a HTTP 403 "forbidden" error in the same circumstances. The practice of employing fake 404 errors as a means to conceal censorship has also been reported in Thailand and Tunisia. In Tunisia, where censorship was severe before the 2011 revolution, people became aware of the nature of the fake 404 errors and created an imaginary character named "Ammar 404" who represents "the invisible censor".

    0 讨论(0)
提交回复
热议问题