A lot of things can go wrong when requesting information over HTTP from a remote web server: requests time out, servers fail, government operatives cut undersea cables. You get the picture.
Identifying and handling failures helps build fault tolerant systems that stay up even when services they rely on are down. A nice side effect is your phone is less likely to beep in the middle of the night with a message from your coworkers talking in all caps.
This guide will introduce you to the common ways HTTP requests fail and how to handle the failures.
The examples use Python's fantastic requests library, but the principles shown work across all languages. You can follow along on your computer by grabbing requests off PyPI.
The requests.get() method is the cornerstone for all the examples: it makes a synchronous HTTP GET request to fetch the content from url.
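For instance, a minimal helper built on it might look like the sketch below (the httpbin URL is just a convenient stand-in):

```python
import requests

def get(url):
    """Make a synchronous HTTP GET request and return the response."""
    return requests.get(url)

response = get('https://httpbin.org/get')
print(response.status_code)
```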
Where possible, the examples use httpbin.org to illustrate the specific failure scenarios. It's a great service for testing how your code will react in a hostile world!
The guide assumes familiarity with making HTTP requests and uses the following terminology:
- Client: The code making the HTTP requests and the server it lives on.
- Server: The box that delivers the HTTP response we requested.
- Caller: The code which instantiates the client and tells it to make a request.
Ready to make some requests? Let's go!
DNS lookup failures
HTTP requests can fail before the client can even make a connection to the server. If the URL specified by the caller has a domain name, the client must look up its IP address before making the request. If the domain name doesn't resolve, it may be misconfigured or may not exist at all.
It's important to let the caller know they may have entered the wrong domain!
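With requests, a failed DNS lookup surfaces as a ConnectionError. A sketch, using a made-up domain under the reserved .example TLD so the lookup is guaranteed to fail:

```python
import requests

try:
    requests.get('http://this-domain-does-not-exist.example/')
except requests.exceptions.ConnectionError:
    # The lookup failed; tell the caller to double-check the domain.
    print('Could not resolve the domain name')
```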
Errors connecting to the server
Even if the hostname of the URL correctly resolves, we might not always succeed in connecting to the server. If someone tripped on its power cord and took it down, it's unlikely it will accept our connection!
Errors of this nature often block the client, tying it up waiting for a server that will never respond. For this reason, it's a good idea to add timeouts to the client. That way, if the server takes too long to respond, the client can move on to do something else rather than waiting indefinitely. requests accepts two separate timeout values, connect and read. connect is the amount of time the client should wait to establish a connection to the server:
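A sketch of exercising the connect timeout; requests takes the timeout as a (connect, read) tuple, and the non-routable private address below is a common trick for simulating an unreachable server:

```python
import requests

try:
    # 10.255.255.1 is a private address that (usually) isn't routable, so
    # the connection attempt hangs until the connect timeout fires.
    requests.get('http://10.255.255.1/', timeout=(3.05, 27))
except requests.exceptions.ConnectTimeout:
    print('Could not connect to the server in time')
```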
read is the amount of time it should wait between bytes from the server:
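httpbin's /delay endpoint is handy for testing this: it waits before responding, so a short read timeout fires. A sketch:

```python
import requests

try:
    # Ask httpbin to wait 10 seconds, but only tolerate 2 seconds of silence.
    requests.get('https://httpbin.org/delay/10', timeout=(3.05, 2))
except requests.exceptions.ReadTimeout:
    print('The server took too long to send the response')
```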
The exact values used for the timeout are usually less important than just setting one. You don't want the client to be blocked forever on a slowpoke server. Start with 10 seconds and watch your logs.
Extra Credit: Depending on the profile of the system you're building, you may want to implement dynamic timeouts that use historical data to wait longer for servers that are known to be slow. You may want to ban your client from even trying to connect to servers that always timeout.
HTTP errors
What if something goes sideways while the server is preparing our response? Maybe its database is unresponsive or it was switched into maintenance mode. Whatever the reason, if the server is able to detect that it isn't functioning correctly, it should respond with a 500-range (server error) status code.
Alternatively, if the client is incorrectly constructing the request, the server may respond with a 400-range (client error) status code.
In most cases we'll want to identify these bad response status codes and let the caller handle them. With requests, this is as easy as calling the raise_for_status() method on the response object:
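For example, using httpbin's /status endpoint to simulate a failing server:

```python
import requests

response = requests.get('https://httpbin.org/status/500')
try:
    response.raise_for_status()
except requests.exceptions.HTTPError as error:
    # The response had a 4xx or 5xx status code; surface it to the caller.
    print('The server returned an error: %s' % error)
```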
Responses that aren't what we expect
It's possible that the caller could request a resource that our client wasn't designed to handle. For example, what if someone uses our RSS reader to request an MKV file of the last episode of Game of Thrones?
We can assert that the Content-Type response header matches what we expect. Our RSS reader example might look for the following:
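A sketch of that check; the exact set of accepted content types is an assumption, since feeds are commonly served as RSS or Atom XML:

```python
import requests

# Content types our hypothetical RSS reader is willing to accept.
EXPECTED_CONTENT_TYPES = ('application/rss+xml', 'application/atom+xml',
                          'application/xml', 'text/xml')

response = requests.get('https://httpbin.org/xml')
content_type = response.headers.get('Content-Type', '')
if not content_type.startswith(EXPECTED_CONTENT_TYPES):
    raise ValueError('Unexpected Content-Type: %s' % content_type)
```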
Note that even if the Content-Type header does match what we are expecting, there is no guarantee that the response's body will. Calling code should account for this. For example, if we're expecting JSON and we don't get back JSON, that's a problem. In requests, the response.json() method tries to convert the response body into a Python object from JSON:
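If the body isn't valid JSON, response.json() raises a ValueError we can catch; httpbin's /html endpoint makes a handy negative test:

```python
import requests

response = requests.get('https://httpbin.org/html')  # returns HTML, not JSON
try:
    data = response.json()
except ValueError:
    # The body wasn't JSON after all; let the caller know.
    print('Expected a JSON response body')
```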
Extra Credit: If we're processing text data like HTML, don't forget to detect its charset and correctly decode it. You'll need to check the response's Content-Type header as well as potentially the content itself to avoid decoding errors.
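One way to handle that with requests, falling back to the encoding it detects from the body when the header doesn't declare one (a sketch, not the only approach):

```python
import requests

response = requests.get('https://httpbin.org/html')
# If the Content-Type header didn't declare a charset, requests falls back
# to ISO-8859-1 for text/* bodies, which may be wrong. The encoding detected
# from the body itself is often a better guess.
if 'charset' not in response.headers.get('Content-Type', '').lower():
    response.encoding = response.apparent_encoding
text = response.text
```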
Responses that are too large
Let's go back to our movie example. Not only is the movie not the content type our RSS reader expects, it's also really big. If we're not careful, these kinds of responses could exhaust our client's resources.
To ensure our client hasn't been asked to download the entire internet, we must track how much content we've received. With requests, we can stream the response and stop reading once it grows past a limit:
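A sketch, with the 5 MB cap picked arbitrarily for illustration:

```python
import requests

MAX_BYTES = 5 * 1024 * 1024  # arbitrary 5 MB cap for this example

response = requests.get('https://httpbin.org/bytes/1024', stream=True)
received = 0
chunks = []
for chunk in response.iter_content(chunk_size=8192):
    received += len(chunk)
    if received > MAX_BYTES:
        response.close()
        raise ValueError('Response body exceeded %d bytes' % MAX_BYTES)
    chunks.append(chunk)
body = b''.join(chunks)
```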
Requests to unexpected URLs
If the client is located inside your network it may have privileged access to internal servers not addressable from the public internet. For example, what if the caller requests a URL that points at one of those internal servers or at a private IP address?
If we're letting callers request arbitrary URLs, we need to check that they're allowed to request what they're asking for.
One strategy is to prevent callers from requesting sensitive hosts using a blacklist. A blacklist checks whether the requested domain is present in a set of restricted domains. If it is, the request is rejected before it's even made. At a minimum, we'll want to blacklist internal IP addresses.
Python 3.3 added the ipaddress module to the standard library, and in Python 2 we can install its backport from PyPI. Here we use it to filter requests for internal IP addresses:
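A sketch using Python 3's urllib.parse (in Python 2 the same function lives in the urlparse module):

```python
import ipaddress
from urllib.parse import urlparse

def is_blacklisted(url):
    """Reject URLs whose host is a literal internal IP address."""
    host = urlparse(url).hostname
    try:
        address = ipaddress.ip_address(host)
    except ValueError:
        # The host isn't a literal IP address; see the extra credit below
        # for resolving hostnames before checking them.
        return False
    return address.is_private or address.is_loopback or address.is_link_local

assert is_blacklisted('http://192.168.0.1/admin')
assert not is_blacklisted('http://example.com/feed')
```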
We might extend our blacklist to include internal hostnames or other sensitive servers. Maybe we also don't want callers to call the server doing the calling. Otherwise it could be turtles all the way down.
Extra Credit: If you want to get serious you'll need to resolve the domain name of the requested resource and check whether it maps to a local IP address.
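A sketch of that stricter check, resolving the hostname first:

```python
import socket
import ipaddress

def resolves_to_internal_address(hostname):
    """Resolve hostname and check whether any of its addresses are private."""
    # getaddrinfo raises socket.gaierror if the name doesn't resolve at all.
    addresses = {info[4][0] for info in socket.getaddrinfo(hostname, None)}
    return any(ipaddress.ip_address(address).is_private
               for address in addresses)
```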
Alternatively, if callers should only be able to request from a narrow set of servers it may be easier to use a whitelist to reject requests which aren't directed at a known host:
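A sketch, with a hypothetical set of allowed hosts:

```python
from urllib.parse import urlparse

# Hypothetical hosts our client is allowed to talk to.
ALLOWED_HOSTS = {'api.example.com', 'feeds.example.com'}

def is_whitelisted(url):
    return urlparse(url).hostname in ALLOWED_HOSTS
```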
Extra Credit: Depending on your needs, you might also want to restrict other parts of the HTTP request, including the protocol used, or the ports. Additionally, if you find a caller abusing the system, you might want to build a mechanism to ban them!
Handling errors
So now that we've identified all these errors, what the heck should we do with them?
Logging
What broke? When? Where? Logging failures creates a trail that you can search for patterns. Logs will often give you insight about how you can further tweak your configuration to best suit your system or whether someone is abusing the system.
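For example, using Python's standard logging module to record the URL and the full traceback of every failure (the logger name and URL here are illustrative):

```python
import logging
import requests

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger('http_client')

url = 'https://httpbin.org/status/503'
try:
    response = requests.get(url, timeout=(3.05, 10))
    response.raise_for_status()
except requests.exceptions.RequestException:
    # logger.exception records the message plus the full traceback, which
    # answers "what broke, when, and where" in a single log entry.
    logger.exception('Request to %s failed', url)
```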
Retrying
When you're firing bits around the world sometimes you just get unlucky. Depending on what you're doing, it may make sense to just retry the request if you think the error was intermittent. requests provides an interface for creating custom transport adapters that can be used to implement retries:
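One way to wire that up is to mount an HTTPAdapter configured with urllib3's Retry helper onto a Session; the retry counts and status codes below are illustrative:

```python
import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry

session = requests.Session()
# Retry up to 3 times, backing off between attempts, but only for status
# codes that suggest a transient server-side problem.
retries = Retry(total=3, backoff_factor=0.5, status_forcelist=[502, 503, 504])
adapter = HTTPAdapter(max_retries=retries)
session.mount('http://', adapter)
session.mount('https://', adapter)

response = session.get('https://httpbin.org/get')
```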
Just make sure you only retry requests that are idempotent!
Notification
Finally, you'll need to raise the error to the caller. You'll want to do it in a way that makes it easy for the caller to handle all possible exceptions, but also in a way that makes it clear why the exception was raised. This is especially important if you will be displaying the error to a non-technical user and you want to provide clear instructions about whether they've mistyped the domain or the server they are trying to connect to is down. In Python, this is a great chance to read up on properly re-raising exceptions!
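A sketch of wrapping the library's exceptions in our own; FetchError is a hypothetical exception class, and `raise ... from ...` is Python 3 syntax that preserves the original traceback:

```python
import requests

class FetchError(Exception):
    """Hypothetical base error raised by our client."""

def get(url):
    try:
        response = requests.get(url, timeout=(3.05, 10))
        response.raise_for_status()
        return response.content
    except requests.exceptions.ConnectionError as error:
        raise FetchError('Could not connect; is the domain correct?') from error
    except requests.exceptions.Timeout as error:
        raise FetchError('The server took too long to respond') from error
    except requests.exceptions.HTTPError as error:
        raise FetchError('The server reported an error') from error
```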
For Further Consideration
SSL
SSL is pretty cool and we should do more of it. The requests library verifies SSL certificates by default. If you're using a different library or language, be sure to check that your client is verifying that certificates are valid. You don't want someone man-in-the-middling your connection!
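For example, requests refuses to talk to a server presenting a bad certificate; badssl.com hosts intentionally broken certificates for exactly this kind of test:

```python
import requests

try:
    requests.get('https://expired.badssl.com/')
except requests.exceptions.SSLError:
    # Certificate verification failed; don't silently fall back to verify=False!
    print('The server presented an invalid certificate')
```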
Internationalized Domain Names
Internationalized domain names are a thing. Many libraries will handle them by default now, but you probably want to throw a test case in there that makes sure the encoding works:
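A sketch of such a check; the domain below is made up purely to exercise the IDNA encoding, so in a real test you'd point it at a host you control or mock the request out:

```python
import requests

# "bücher.example" is a made-up internationalized domain name.
prepared = requests.Request('GET', 'https://bücher.example/feed').prepare()
# The hostname should come out punycode-encoded (xn--bcher-kva.example).
assert 'xn--' in prepared.url
```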
Performance
Depending on how you've built your client, there are a variety of ways you might be able to improve its performance:
- Consider requesting compressed response content by setting the Accept-Encoding: gzip header. You'll need to make sure your client decompresses the content (see the sketch after this list).
- Consider having your client connect through a caching HTTP proxy such as Squid. If you expect to be requesting the same resources again and again, the proxy's cache may considerably reduce response times for cacheable resources.
- requests is a synchronous, blocking library. That means your client will only be able to process one request at a time. If your system needs to support many concurrent requests, you might consider going async using libraries such as grequests or aiohttp.
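On the compression point above, requests already sends Accept-Encoding: gzip, deflate and transparently decompresses the body, but here is a sketch of setting the header explicitly on a session:

```python
import requests

session = requests.Session()
session.headers.update({'Accept-Encoding': 'gzip'})

# httpbin's /gzip endpoint always returns a gzip-compressed body; requests
# decompresses it before handing it back to us.
response = session.get('https://httpbin.org/gzip')
print(response.headers.get('Content-Encoding'))  # gzip
print(response.json()['gzipped'])                # True
```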
Tooling
There are a number of tools out there that can help simplify putting this all together:
- httpbin.org allows you to quickly test a number of different HTTP response scenarios. Its /status and /delay endpoints are especially useful for testing weird edge conditions.
- A request-mocking library such as responses or httpretty can make cranking out unit tests for all these errors relatively simple (see the sketch below).
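For instance, a quick unit test using the responses library to simulate a failing server (the URL is hypothetical):

```python
import requests
import responses

@responses.activate
def test_server_error_raises_http_error():
    # Pretend the (hypothetical) feed server is having a bad day.
    responses.add(responses.GET, 'https://api.example.com/feed', status=500)

    response = requests.get('https://api.example.com/feed')
    try:
        response.raise_for_status()
        assert False, 'expected an HTTPError'
    except requests.exceptions.HTTPError:
        pass

test_server_error_raises_http_error()
```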
Wrapping it all up
Wow, there are a lot of ways HTTP requests can fail. TLDR, when making a request:
- Account for DNS lookup failures
- Set a connection and read timeout
- Be sure to handle HTTP errors
- Check that the response has the content type you expect
- Limit the maximum response size
- Ensure that private URLs are not requestable
- Always. Be. SSLing.
Now it's your turn!
Go forth and write fault tolerant services that request data using HTTP!
Did we miss anything? Let us know in the comments below.