Caching HTTP POST Requests and Responses
Topic: Performance Engineering
The basic purpose of HTTP caching is to give applications a mechanism to scale better and perform faster. But HTTP caching applies only to idempotent requests, which makes sense: only idempotent and nullipotent requests yield the same result when run multiple times. In the HTTP world, this means that GET requests can be cached but POST requests cannot.
However, there can be cases where an idempotent request cannot be sent using GET simply because that request exceeds the limits imposed by popular Internet software. For example, search APIs typically take a lot of parameters, especially for a product with numerous characteristics, all of which have to be passed as parameters. This situation leads to the question, what then is the recommended way of communicating over the wire when the request contains more parameters than the “permitted” length of a GET request? Here are some of the answers:
- You may want to re-evaluate the interface design if it takes a large number of parameters. Idempotent requests typically require a number of parameters that falls well within the GET limits.
- The HTTP specification imposes no hard and fast limit, so the specification is not to blame; it is Internet clients and servers that impose limits. Some of them support URLs up to 8 KB, but the safe bet is to keep the length under 2 KB.
- Send a body with the GET request, even though the specification gives a GET body no defined meaning.
At this point, we come to the realization that all of the above answers are unsatisfactory. They do not address the underlying problem or change the situation whatsoever.
HTTP caching basics
To appreciate the rest of this topic, let’s first go through the caching mechanism quickly.
HTTP caching involves the client, the proxy, and the server. In this post, we will discuss mainly the proxy, which sits between the client and server. Typically, reverse proxies are deployed close to the server, and forward proxies close to the client. Figure 1 shows the basic topology. From the figure, it should be clear that a cache-hit in the forward proxy saves bandwidth and reduces round-trip time (RTT) and latency; and a cache-hit at the reverse proxy reduces the load on the server.
Figure 1. Basic topology of proxy cache deployment in a network
The HTTP specification allows a response from cache if one of the following is satisfied:
- The cached response is consistent with the origin server’s response, had the origin server handled the request – in short, the proxy can guarantee a semantic equivalence between the cached response and the origin server’s response.
- The freshness is acceptable to the client.
- The freshness is not acceptable to the client but an appropriate warning is attached.
The specification has a number of flavors and associated headers and controls. Further details of the specification are available at http://tools.ietf.org/html/rfc2616, and of cache controls at http://tools.ietf.org/html/rfc2616#section-14.9.
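As a rough sketch of the freshness check described above, here is how a proxy might compare a response's current age against the Cache-Control max-age directive. This is a simplified reading of the spec; the function name and structure are illustrative, not from any particular proxy:

```python
def is_fresh(cache_control: str, age_seconds: int) -> bool:
    # Parse the max-age directive from a Cache-Control header value and
    # compare it against the response's current age (simplified from
    # RFC 2616 section 14.9; ignores s-maxage, no-cache, etc.).
    for directive in cache_control.split(","):
        directive = directive.strip()
        if directive.startswith("max-age="):
            try:
                return age_seconds <= int(directive[len("max-age="):])
            except ValueError:
                return False
    return False  # no max-age directive: not verifiably fresh in this sketch
```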
A typical proxy caches idempotent requests. The proxy gets the request, examines it for cache headers, and sends it to the server. Then the proxy examines the response and, if it is cacheable, caches it with the URL as the key (along with some headers in certain cases) and the response as the value. This scheme works well with GET requests, because for the same URL repeated invocation does not change the response. Intermediaries can make use of this idempotency to safely cache GET requests. But this is not the case with an idempotent POST request. The URL (and headers) cannot be used as the key because the response could be different – the same URL, but with a different body.
POST body digest
The solution is to digest the POST body (along with a few headers), append the URL with the digest, and use this digest instead of just the URL as the cache key (see Figure 2). In other words, the cache key is modified to include the payload in addition to the URL. Subsequent requests with the same payload will hit the cache rather than the origin server. In practice, we add a few headers and their values to the cache key to establish uniqueness, as appropriate for the use case. Although we don’t have a specific algorithm recommendation, if MD5 is used to digest the body, then Content-MD5 could be used as a header.
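A minimal sketch of such a digest-based cache key might look like the following. The function name, the choice of headers to include, and the separator are illustrative assumptions, not part of any specific proxy's implementation:

```python
import hashlib

def cache_key(url: str, body: bytes, headers: dict,
              vary_headers: tuple = ("Content-Type",)) -> str:
    # Digest the POST body (MD5, as suggested above) and append it to the
    # URL, along with a few selected headers, to form a unique cache key.
    digest = hashlib.md5(body).hexdigest()
    parts = [url, digest]
    for name in vary_headers:
        if name in headers:
            parts.append(name + "=" + headers[name])
    return "|".join(parts)
```

Two requests to the same URL with different bodies now produce different keys, so the proxy no longer confuses their responses.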
Figure 2. Digest-based cache
Now the problem is to distinguish idempotent POST requests from non-idempotent ones. There are a few ways to handle this problem:
- Configure URLs and patterns in the proxy so that it does not cache if there is a match.
- Add context-aware headers to distinguish between different requests.
- Base the cache logic on some naming conventions. For example, APIs with names that start with words like “set”, “add”, and “delete” are not cached and will always hit the origin server.
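These heuristics can be combined in a single check. The sketch below assumes a configured "DO NOT CACHE" URL list, a hypothetical X-Idempotent request header, and the naming convention mentioned above; all names and values are illustrative:

```python
# Illustrative configuration -- not from any real deployment.
DO_NOT_CACHE_PREFIXES = ("set", "add", "delete")  # naming convention
DO_NOT_CACHE_URLS = {"/api/order/place"}          # configured URL list

def is_cacheable_post(url: str, api_name: str, headers: dict) -> bool:
    # Rule 1: configured URLs are never cached.
    if url in DO_NOT_CACHE_URLS:
        return False
    # Rule 2: a context-aware header (hypothetical) can opt out explicitly.
    if headers.get("X-Idempotent", "").lower() == "false":
        return False
    # Rule 3: mutating API names, by convention, always hit the origin.
    if api_name.lower().startswith(DO_NOT_CACHE_PREFIXES):
        return False
    return True
```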
Handling Non-Idempotent Requests
Here’s how we solve the problem of non-idempotent requests:
- Hit the origin server under any of the following circumstances:
- if the URL is in the configured “DO NOT CACHE” URL list
- if the digests do not match
- after the cache expiry time
- whenever a request to revalidate is received
- Attach a warning saying the content could be stale, thereby conforming to the specification.
- Allow users to hit the origin server by using our client-side tools to turn off the proxy.
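The decision rules above can be collected into one function. This is a sketch under stated assumptions: the parameter names are illustrative, and a real proxy would derive expiry and revalidation from HTTP headers rather than plain arguments:

```python
def must_hit_origin(url, body_digest, cached_digest,
                    age_seconds, max_age, revalidate, do_not_cache):
    if url in do_not_cache:
        return True   # URL is in the configured "DO NOT CACHE" list
    if cached_digest is None or body_digest != cached_digest:
        return True   # no cached entry, or the digests do not match
    if age_seconds > max_age:
        return True   # past the cache expiry time
    if revalidate:
        return True   # the client asked to revalidate
    return False      # serve from cache (with a staleness warning if needed)
```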
We implemented this solution with Apache Traffic Server, customized to cache POST requests and to build the modified cache key.
This solution provides the following benefits:
- We speed up repeated requests by not performing the round trip from the proxy to the origin server.
- As a hosted solution, a response cached for one user speeds up not only that user’s subsequent requests but also requests from other users, provided the cache is configured to be shared and the response headers permit it.
- We save the bandwidth between the proxy and the origin server.
Here is a performance comparison of an API invocation through a forward proxy, with a total data transfer of 20 KB:
Variants of this solution can be used to cache the request or response or both at the forward proxy, reverse proxy, or both.
To get the full benefit of this solution, we deploy a forward proxy at the client end and a reverse proxy at the server end. The client sends the request to the forward proxy, and the proxy does a cache lookup. In the case of a cache miss, the forward proxy digests the body and sends only the digest to the reverse proxy. The reverse proxy looks for a match in its request cache and, if it finds one, sends the matching request to the server. The difference is that we don’t send the full request from the forward proxy to the reverse proxy.
The server sends the response to the reverse proxy, which digests the response and sends only the digest rather than the full response (see Figure 3). Essentially, we save the POST data from being sent, at the cost of an additional round trip (carrying the digest key) when the result is a cache miss. In most networks, the RTT between the client and the forward proxy, and between the server and the reverse proxy, is negligible compared to the RTT between the client and the server. This is because the forward proxy and the client are typically close to each other on the same LAN; likewise, the reverse proxy and the server are close to each other on the same LAN. The network latency lies between the proxies, where data travels through the Internet.
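As a toy, in-process model of this handshake (class and method names are illustrative, not taken from the Apache Traffic Server customization), the exchange between the two proxies might look like this:

```python
import hashlib

def md5(data: bytes) -> str:
    return hashlib.md5(data).hexdigest()

class ReverseProxy:
    # Remembers full request bodies by digest so that later requests
    # need only carry the digest across the Internet hop.
    def __init__(self, origin):
        self.origin = origin   # callable standing in for the origin server
        self.bodies = {}       # digest -> full POST body

    def handle_digest(self, url, digest):
        body = self.bodies.get(digest)
        if body is None:
            return None                    # miss: request the full body
        return self.origin(url, body)      # replay the remembered body

    def handle_full(self, url, body):
        self.bodies[md5(body)] = body      # remember for future handshakes
        return self.origin(url, body)

class ForwardProxy:
    def __init__(self, reverse):
        self.reverse = reverse
        self.sent_full = 0     # counts the extra full-body round trips

    def send(self, url, body):
        # First try the cheap round trip: send only the digest.
        response = self.reverse.handle_digest(url, md5(body))
        if response is None:
            self.sent_full += 1            # miss: resend with the full body
            response = self.reverse.handle_full(url, body)
        return response
```

Running two identical requests through the forward proxy shows the full body crossing the inter-proxy hop only once; the second request carries only the digest.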
Figure 3. Cache handshake
This solution can also be applied to just one proxy on either side, at the cost of client or server modification. In such cases, the client or server will have to send the digest instead of the whole body. With the two-proxy architecture, the client and server remain unchanged and, as a result, any HTTP client or server can be optimized.
POST requests are typically large. By not having the proxy send the whole request and the whole response, we save not only bandwidth but also the response time involved in large requests and responses.
Although the savings might seem trivial at first glance, they are not so when it comes to real traffic loads. As we saw, even if the POST response is not cached, we still save bandwidth by not sending the payload. This solution gets even more interesting with distributed caches deployed within the network.
Here is a summary of the benefits of a cache-handshaking topology:
- The request payload travels to the reverse proxy only on a cache miss. As a result, both RTT and bandwidth consumption are reduced. The same applies to the response.
- As a hosted solution, one user’s request saves other users’ requests from travelling between the proxies.
- There is no technical debt involved. If you remove the proxies, you have a fully HTTP-compliant solution.
HTTP caching is not just for GET requests. By digesting the POST body, distinguishing idempotent from non-idempotent requests, and handling the non-idempotent ones correctly, you can realize substantial savings in round trips and bandwidth. For further savings, you can employ cache handshaking to send only the digest across the Internet: ideally by implementing a forward proxy on the client side and a reverse proxy on the server side, though one proxy is sufficient to make a difference.