When you’re designing, testing, or releasing a new Web API, you’re building a new system on top of an existing complex and sophisticated system. At a minimum, you’re building upon HTTP, which is built upon TCP/IP, which is built upon a series of tubes. You’re also building upon a web server, an application framework, and maybe an API framework.
Most people, myself included, are not aware of all the intricacies and nuances of every component they’re building upon. Even if you deeply understand each component, it’s probably going to be too much information to hold in your head at one time.
“We know there are known knowns: there are things we know we know. We also know there are known unknowns: that is to say we know there are things we know we don’t know. But there are also unknown unknowns – the ones we don’t know we don’t know.” –Defense Secretary Donald Rumsfeld, Defense Department briefing, Feb 12, 2002
I don’t want you to build an API and only know about your unknown unknowns when they bite you in the ass. So, here’s a list of a bunch of things, both obvious and subtle, that can easily be missed when designing, testing, implementing, and releasing a Web API.
HTTP
The HTTP 1.1 specification, RFC2616, is a hefty document at 54,121 words. Here are some select items from the spec that might affect your API design:
1. Idempotent methods
GET, HEAD, PUT, DELETE, OPTIONS and TRACE are all intended to be idempotent operations; that is, “the side-effects of N > 0 identical requests is the same as for a single request.” (RFC2616 §9.1.2)
2. Authentication
Most APIs will need a way to identify and authenticate the user accessing the API. HTTP provides the Authorization header (RFC2616 §14.8) for this purpose. RFC2617 specifies specific authentication schemes, including the most common, HTTP Basic authentication.
Many popular APIs use HTTP Basic Authentication with an API Key as either the username or the password. This is a simple and effective authentication mechanism.
To implement HTTP Authentication accurately, you should provide a 401 status code with a WWW-Authenticate header if a request is not permitted due to a lack of authentication. Many clients will require this response before sending an Authorization header on a subsequent request.
3. 201 Created
Use the “201 Created” response code to indicate that the request was processed successfully and resulted in the creation of a new resource. 201 responses can include the new resource URI in the Location header. [(RFC2616 §10.2.2)]
4. 202 Accepted
Use the “202 Accepted” response code to indicate that the request is valid and will be processed, but has not been completed. Typically this is used where a processing queue might be present in the background on the server side. [(RFC2616 §10.2.3)]
5. 4xx versus 5xx status codes
There’s an important distinction between 4xx and 5xx status codes: 4xx codes are intended to indicate a client-side error, and a 5xx code indicates a server-side error. Proper usage of these status code classes can help application developers understand whether they did something wrong (4xx) or something is broken (5xx). (RFC2616 §6.1.1) (Edit: Read more about the difference at Is “404 Not Found” really a client error?.)
6. 410 Gone
The “410 Gone” response code is an under-utilized response code that informs the client that a resource used to be present at this URL, but no longer is. This can be used in your API to indicate deleted, archived, or expired items. (RFC2616 §10.4.11)
7. Expect: 100-Continue
If an API client is about to send a request with a large entity body, like a POST, PUT, or PATCH, they can send “Expect: 100-continue” in their HTTP headers, and wait for a “100 Continue” response before sending their entity body. This allows the API server to verify much of the validity of the request before wasting bandwidth to return an error response (such as a 401 or a 403). Supporting this functionality is not very common, but it can improve API responsiveness and reduce bandwidth in some scenarios. (RFC2616 §8.2.3)
8. Connection Keep-Alive
Maintaining a connection with your API server for multiple API requests can be a big performance improvement. Pretty much every web server should support keep-alive connections if configured correctly.