Web API Design Anti-patterns (or how to give consumers a headache)

bytedev
10 min read · Sep 6, 2021


When building a web-based API there are many design aspects to consider, including the endpoint format, the request and response structure, and how to handle errors. This article highlights some of the poor design decisions web API designers sometimes make and how these decisions have painful repercussions for developers who want to consume the API. For each problem, possible improvements to the API are discussed.

Note that this list is far from exhaustive and will most likely be added to over time.

“Don’t think too much about the API, we can always change it later”

Unhelpful status codes

HTTP status codes can be used to convey information about the status of the response without the client having to necessarily examine the response body content. HTTP defines five broad categories of statuses:

  • 1xx : Informational
  • 2xx : Successful
  • 3xx : Redirection
  • 4xx : Client errors
  • 5xx : Server errors

Inappropriate use of status codes, such as returning 200 OK on errors, can cause confusion. An API may even return the same status code regardless of the type of response, effectively rendering the status code useless.

Improvements: use the status code to convey the type of response based on the five categories. Define more specific status codes for certain scenarios, for example 422 Unprocessable Entity for request validation errors. Once you have defined the set of status codes your API uses, document them for the consumer and be consistent in their use.
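As a rough illustration, the mapping from outcomes to status codes can live in one place so it is easy to document and apply consistently. The sketch below uses Python's standard http.HTTPStatus; the outcome names and the helper functions are hypothetical and not taken from any particular API.

```python
from http import HTTPStatus

# Hypothetical outcome names; a real API would define and document its own set.
OUTCOME_TO_STATUS = {
    "ok": HTTPStatus.OK,
    "created": HTTPStatus.CREATED,
    "validation_failed": HTTPStatus.UNPROCESSABLE_ENTITY,
    "not_found": HTTPStatus.NOT_FOUND,
    "unexpected_failure": HTTPStatus.INTERNAL_SERVER_ERROR,
}

# The five broad categories, keyed by the first digit of the status code.
CATEGORIES = {
    1: "Informational",
    2: "Successful",
    3: "Redirection",
    4: "Client error",
    5: "Server error",
}


def status_for(outcome: str) -> HTTPStatus:
    """Return the documented status for an outcome, defaulting to 500."""
    return OUTCOME_TO_STATUS.get(outcome, HTTPStatus.INTERNAL_SERVER_ERROR)


def category(status: HTTPStatus) -> str:
    """Name the broad category a status code belongs to."""
    return CATEGORIES[status // 100]
```

Keeping the table in a single module also gives you something concrete to copy into the API documentation.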

Multiple data formats

An API may use multiple different data formats in different situations for the request/response. Every time another data format is used an API client has to write code to handle the extra data format. This wastes time when developing the client as well as adding needless code bloat.

An API can take a step further in the wrong direction by specifying a response content-type header that is a different data format to the actual response content body. For example the response content-type might be specified as application/json but the content body actually contains XML.

Improvements: pick a default data format, for example JSON, for your entire API (all endpoints). If there is a need for your API to support multiple data formats, such as JSON and XML, then provide a way for the client to specify which format they want to use. HTTP already has a defined way to do this through use of the Accept request header. Make sure your response Content-Type header always reflects the format of the data in the response body.
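A minimal sketch of this idea, assuming JSON is the default and XML is the only alternative, is shown below. The render function and the "response" root element are made up for illustration, and the negotiation is deliberately simplified: it ignores q-values and wildcards, and it assumes the dictionary keys are valid XML element names.

```python
import json
from xml.etree import ElementTree as ET


def render(data: dict, accept: str = "") -> tuple[str, str]:
    """Return (content_type, body); both always describe the same format."""
    # Take the media types in the order the client listed them.
    requested = [part.split(";")[0].strip().lower() for part in accept.split(",")]
    for media_type in requested:
        if media_type == "application/xml":
            root = ET.Element("response")
            for key, value in data.items():
                ET.SubElement(root, key).text = str(value)
            return "application/xml", ET.tostring(root, encoding="unicode")
        if media_type == "application/json":
            break
    # JSON is the default for anything else, including a missing Accept header.
    return "application/json", json.dumps(data)
```

Because the content type and the body are produced together, the header cannot drift away from the actual format of the body.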

Multiple data structures

An API may return multiple different response content data structures for the same endpoint depending on the scenario. For example upon success in a specific scenario the response content is returned in structure A and in a different scenario the response content is returned in structure B. Using different data structures for the same endpoint means the consumer may have to write more complex code to handle the two different scenarios and may even mean they can’t use deserialization.

Reporting errors in different ways dependent on the types (or even number) of errors can also be a problem. Consider the examples below which demonstrate two different ways a single endpoint is reporting errors back to the client.

Two JSON examples of different ways errors are returned from the same endpoint.

In the example above there was no need to have both the message and messages error fields. A single error message object could simply have been put in the messages field (as a single-element array) with its description set.

Improvements: make sure all scenarios for a single endpoint are handled by a single data structure. This will make serialization/deserialization easy for the client. Determine a single unified way to handle errors and be consistent across all API endpoints. Doing so will allow consumers to write generic code to handle API errors which can be easily reused.
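One possible shape for such a unified error structure is sketched below. The field names (messages, code, description) are assumptions chosen for illustration; the point is only that one error and many errors serialize to the same structure.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class ErrorMessage:
    code: str          # hypothetical machine-readable error code
    description: str   # human-readable explanation


def error_body(errors: list[ErrorMessage]) -> str:
    """Serialize any number of errors into the same JSON structure."""
    return json.dumps({"messages": [asdict(error) for error in errors]})
```

A client can then deserialize every error response into the same type and loop over messages, whether it contains one entry or ten.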

Not well formed response content

An API may have decided upon a particular data format for its response content, but the implementation it uses to create the data may be flawed. For example, JSON may be the documented data format for response content, but in certain scenarios the data sent is not actually JSON (i.e. it’s not well formed).

Improvements: an endpoint’s response content should be in a single data format. This data should be well formed and conform to the data format’s specification. APIs sometimes fall into the trap of deciding upon a data format but then not always returning well-formed content. This can happen when the API does not use a well-built library to create the particular format of data. For example, historically it was common to build XML data by simply assembling a string. This meant that even though the data looked like XML, it was often in fact not well formed. This in turn meant the client could no longer rely on deserialization and instead had to parse the data as though it were a simple string.

An API should make sure the format of the data it sends in the response is well formed before it actually sends it to the client.
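The difference between assembling a string by hand and using a serializer can be seen in a small sketch. The function names here are hypothetical; json.dumps and json.loads are from Python's standard library.

```python
import json


def naive_body(name: str) -> str:
    # String concatenation: breaks as soon as the value contains a quote.
    return '{"name": "' + name + '"}'


def safe_body(name: str) -> str:
    # The serializer escapes special characters for us.
    return json.dumps({"name": name})


def is_well_formed_json(body: str) -> bool:
    """Return True if the body parses as JSON."""
    try:
        json.loads(body)
        return True
    except json.JSONDecodeError:
        return False
```

A check like is_well_formed_json is cheap to run in automated tests against every endpoint, which catches hand-built responses before a consumer does.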

Inconsistent field names

The API sends data in a single, well-formed data format but uses an inconsistent field naming standard. Field names can be inconsistent by name or by casing:

  • If a field name is inconsistent by name it will often use different words to describe the same piece of data. For example a customer identifier might be called a customerId in one place and a customerRef in another place. This difference in language can lead to confusion for the consumer of the API.
  • If a field name is inconsistent by case it will mix different casing styles within the same request/response. For example CustomerId in one place and customer-id in another. It should be noted that many data formats, such as JSON, are case sensitive so AddressLine1 and addressLine1 are not the same.

Improvements: decide upon a naming standard for the data you will return in the response and stick to it across all endpoints in your API. Standardize a language to describe the various pieces of data in the API’s request/response contract. Furthermore, decide upon a casing standard and stick to it across all requests and responses.

For reference some of the more popular case naming standards are:

  • Camel case: myField
  • Pascal case: MyField
  • Snake case: my_field
  • Kebab case: my-field
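One way to keep casing consistent is to apply a single conversion at the API boundary rather than naming fields by hand in each endpoint. The sketch below assumes the server uses snake case internally and has chosen camel case for its contract; both helper functions are hypothetical.

```python
def to_camel(snake: str) -> str:
    """Convert a snake case name such as my_field to camel case (myField)."""
    head, *tail = snake.split("_")
    return head + "".join(word.capitalize() for word in tail)


def camelize_keys(value):
    """Recursively rename every dictionary key in a response payload."""
    if isinstance(value, dict):
        return {to_camel(key): camelize_keys(item) for key, item in value.items()}
    if isinstance(value, list):
        return [camelize_keys(item) for item in value]
    return value
```

Running every response through one function like this means a new endpoint cannot accidentally introduce a different casing style.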

Dynamic field names

Following on from the “Inconsistent field names” section above a response’s content may even use dynamic field names. A dynamic field name is one that can be different depending on the scenario of the request.

The example below uses dynamic field names customer and orders as part of an object called errors:

JSON example of a response using dynamic field names.

Dynamic field names can create problems for the consumer. If the consumer cannot rely on the names of the fields in the response content then it can no longer use simple deserialization. In the case above the consumer would probably instead have bespoke code to look through the errors object and then iterate over each of its fields, examining the data type of the field and then acting accordingly.

Improvements: avoid dynamic field names unless there is a very good reason for them. If the dynamic field name itself is important then use a static field name and set its value to what the dynamic field name would have been. For example in the case above a field called fieldName could be added and set to “customer” or “orders”.
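The transformation from dynamic to static field names can be sketched as follows. The input shape (an errors object whose keys are names such as customer and orders, each holding a list of descriptions) is an assumption made for illustration, as is the description field; only fieldName comes from the suggestion above.

```python
def to_static_errors(dynamic_errors: dict[str, list[str]]) -> list[dict[str, str]]:
    """Turn {"customer": [...], "orders": [...]} into a list with fixed field names."""
    return [
        {"fieldName": field_name, "description": description}
        for field_name, descriptions in dynamic_errors.items()
        for description in descriptions
    ]
```

Every element of the resulting list has the same two fields, so a consumer can deserialize it into a simple typed collection.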

Inconsistent field data type

Some data formats, such as JSON, have their own set of valid data types for values. For example JSON has numbers, strings, booleans, objects, arrays and nulls. An API might return the same field as a different data type dependent on the scenario.

Consider the following simple JSON-based example. In scenario A the status field is a number, in scenario B it is a string, and in scenario C it is a boolean. If the client that consumes the API is written in a statically typed language this can cause inconvenience. For example, if the client were written in C# and wanted to use deserialization on the response content, it would have to define the status field as a more generic type, such as the .NET object type.

Improvements: all fields should have one data type and stick to it. Do not reuse fields for different purposes (often why the data type changes between responses). As the designer of an API if you are using JSON and are not sure what data type to use then it’s probably safest to go with a string.
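A simple guard against this on the API side is to check outgoing payloads against the documented types, for example in tests. The schema and field names below are hypothetical. The check compares exact types because in Python bool is a subclass of int, so isinstance alone would let a boolean pass as a number.

```python
# Hypothetical documented contract: every field has exactly one type.
RESPONSE_SCHEMA = {"orderId": str, "status": str}


def type_mismatches(payload: dict, schema: dict[str, type] = RESPONSE_SCHEMA) -> list[str]:
    """Return the names of fields that are missing or not the documented type."""
    return [
        name
        for name, expected in schema.items()
        if type(payload.get(name)) is not expected
    ]
```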

No clear versioning strategy

An API may have no clear versioning strategy. Instead when breaking changes are deployed they break functionality for the consumers of the API.

Improvements: there are a number of different versioning strategies we can use when building a web based API:

  • Path: endpoints have the version number as part of the URI’s path. For example the /v1 in: https://myapi.com/api/v1/customers.
  • Query string: endpoints have the version number as part of the URI in a query string name/value pair. For example: https://myapi.com/api/customers?version=2.
  • Custom request header: use a custom HTTP request header with the version number as its value. For example: Api-version: 2.
  • Accept header: use the HTTP Accept request header to define the version number. The Accept header can be used as part of content negotiation to define the content format and version the client wants the API to use. For example: Accept: application/vnd.myaccount.v2+json (custom media types conventionally use the vnd. prefix).

Each method has its own set of advantages and disadvantages, but decide on one at the beginning of your API design and document it for your consumers. If you cannot decide, the path method is often the simplest.
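To make the four strategies concrete, here is a sketch of how a server might read the requested version in each case. The function names are hypothetical, and the parsing rules (a /v<number> path segment, a version query parameter, an Api-version header, a .v<number>+ marker in the media type) are assumptions based on the examples above.

```python
import re
from typing import Optional
from urllib.parse import parse_qs, urlsplit


def version_from_path(url: str) -> Optional[int]:
    """Find a /v<number> segment in the URI path."""
    match = re.search(r"/v(\d+)(?:/|$)", urlsplit(url).path)
    return int(match.group(1)) if match else None


def version_from_query(url: str) -> Optional[int]:
    """Read the version query string parameter."""
    values = parse_qs(urlsplit(url).query).get("version")
    return int(values[0]) if values else None


def version_from_header(headers: dict[str, str], name: str = "Api-version") -> Optional[int]:
    """Read a custom version header; header names are case-insensitive."""
    for key, value in headers.items():
        if key.lower() == name.lower():
            return int(value)
    return None


def version_from_accept(accept: str) -> Optional[int]:
    """Find a .v<number>+ marker inside the Accept media type."""
    match = re.search(r"\.v(\d+)\+", accept)
    return int(match.group(1)) if match else None
```

Each function returns None when no version is present, leaving the API free to decide whether that means "use the latest version" or "reject the request". Whichever it chooses should be documented.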

Inconsistent API domain language

As mentioned in the section “Inconsistent field names” an API may use an inconsistent domain language. This can go beyond simple field names to the rest of the API contract. For example, API error messages may use inconsistent language. An error message may even use the domain language of a backend system behind the API, which may differ from the domain language of the API itself.

Improvements: define a language for all parts of the API contract (not just the field names). If a customer is called a “customer” in one part of the API contract don’t call them a “client” in another. Be wary of the domain language of systems that the API uses leaking into the domain language of the API itself.

Leaky error messages

An API may on occasion return internal error details in the response. As well as being a pain for the consumer, this may also be a security risk. For example, in .NET a missing request field may cause a NullReferenceException to be thrown and the full exception (with stack trace) to be returned in the response.

Improvements: never return internal error details to the consumer of your API in production. Instead return a response code of 500 Internal Server Error without the internal details. Note that it might be a good idea to return internal error/exception information in the response in a development/testing environment for debugging purposes, but make sure this feature does not go to production.
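A rough sketch of this behaviour is shown below. The safe_handle wrapper, the debug flag and the errorId field are all hypothetical; the idea is that the full exception goes to the server log while the consumer only receives a generic body, plus an identifier that support staff can use to find the log entry.

```python
import json
import logging
import uuid
from http import HTTPStatus

logger = logging.getLogger("api")


def safe_handle(handler, request: dict, debug: bool = False) -> tuple[int, str]:
    """Call a handler and never let exception details leak unless debug is on."""
    try:
        return HTTPStatus.OK, json.dumps(handler(request))
    except Exception as exc:
        error_id = str(uuid.uuid4())
        # The stack trace is written to the server log only.
        logger.exception("Unhandled error %s", error_id)
        body = {"error": "Internal server error", "errorId": error_id}
        if debug:
            # Development/testing only: include the exception for debugging.
            body["detail"] = repr(exc)
        return HTTPStatus.INTERNAL_SERVER_ERROR, json.dumps(body)
```

The debug flag would normally come from environment configuration, so that it cannot be switched on by a request.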

Poor documentation

A lot of APIs are published with poor documentation. This causes nothing but problems for consumers that want to use the API. How does a client know what request data to send (or where) if it is not defined? Similarly, how does a client know what data to expect from the API if it is also not defined?

Improvements:

  • Think about the type of useful documentation that could be provided. A PDF describing the API? Postman collections? An online API portal? …
  • Document the fields of the request and response and their data types (if your data format used has data types). Also document any further request field constraints. For example a field customerId is a string, is mandatory and needs to be 10 characters in length.
  • Document examples of requests and responses in different scenarios. Postman collections with examples of requests and responses can also help with this.
  • Define how errors should be handled and the possible errors your API can return.
  • If your API uses a form of authentication remember to include details on how the consumer is supposed to authenticate and what your API expects.
  • Lastly, simply try to think from the API consumer’s point of view. What would be helpful to them?
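Documented constraints are most useful when the API enforces exactly what the documentation says. As a small sketch, the customerId constraint mentioned above (a string, mandatory, 10 characters in length) could be checked like this. The function name and the wording of the messages are hypothetical.

```python
def validate_customer_id(request: dict) -> list[str]:
    """Check the documented customerId constraints and return any violations."""
    value = request.get("customerId")
    if value is None:
        return ["customerId is mandatory"]
    if not isinstance(value, str):
        return ["customerId must be a string"]
    if len(value) != 10:
        return ["customerId must be 10 characters in length"]
    return []
```

If the error messages echo the documented rules word for word, a consumer who hits a validation error can find the relevant part of the documentation easily.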

Redundant request/response information

Asking the client to provide information in the request that the API isn’t using or could have worked out/derived itself is redundant information. Returning information in the response that is not relevant/useful to any client is redundant information. Redundant information in the request/response is rarely a good thing:

  • It needlessly increases the complexity of the contract between the API and its clients. Greater contract complexity leads to more work for client developers wishing to use the API.
  • It needlessly increases the size of the request/response over the wire, potentially reducing performance.
  • The greater the number of fields in the request/response between an API and its clients, the greater the potential for tighter coupling. Tighter coupling means the API can be harder to change later.

A lack of analysis in ascertaining the required request/response fields can lead to redundant request/response information. This approach can be thought of as “throw everything at the wall and hope something sticks”.

Redundancy in the request/response can also often arise because the API is being lazy and is instead pushing (leaking) work/logic out to its clients. Take this API example of a request ID field, with the format:

<vehicle_number>_<operation_called>_<datetime_now>

For example: ABC1234_GETDETAILS_20220113120000

This ID is a composite of a vehicle number that is already supplied elsewhere in the request, the name of the operation being called on the API, and the current date and time. All three pieces of information are known to the API when the request is made, so the API can derive this ID itself if it needs it. So why is the client being asked to create and supply it?
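If the API really does need such an ID, building it on the server takes only a few lines. The sketch below is hypothetical: the function name is made up, and using UTC for the timestamp is an assumption, since the original format does not say which time zone applies.

```python
from datetime import datetime, timezone
from typing import Optional


def build_request_id(vehicle_number: str, operation: str,
                     now: Optional[datetime] = None) -> str:
    """Build <vehicle_number>_<operation_called>_<datetime_now> on the server."""
    # Assumption: the timestamp is in UTC when no explicit time is supplied.
    moment = now if now is not None else datetime.now(timezone.utc)
    return f"{vehicle_number}_{operation.upper()}_{moment.strftime('%Y%m%d%H%M%S')}"
```

For example, build_request_id("ABC1234", "GETDETAILS", datetime(2022, 1, 13, 12, 0, 0)) produces ABC1234_GETDETAILS_20220113120000, matching the format above without the client supplying anything extra.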

Improvements: generally speaking, the smaller the contract between API and client the better. Before putting a field on the request/response, ask whether the field is actually needed. If the field is on the request, does the API need it or could the API derive it itself? If the field is on the response, do any potential clients really need it? Beware that once a field is on the request/response you have created coupling between the API and client that can be difficult to undo later.

Final thoughts

A common theme with most of the sections above is consistency. At every point in the design of an API be consistent. This will help the consumer get used to how your API works and go on to create client applications/SDKs quickly with fewer issues.

Aside from obviously testing your API, often the best way to find problems with your API is simply to use it. Write a client application that tries to use your API and make improvements to the API accordingly. In fact the best API designs are often those built entirely from the consumer’s point of view: what the consumer wants and how they expect to use the API. After all, what’s the point of an API with no consumers?
