The HTTP Protocol
I have to say just a few words about HTTP.
Some form of HTTP support is ubiquitous, but it’s also rather inefficient in Python. It can’t be handled in a separate thread the way zeromq handles its I/O, HTTP parsing is slow, and keep-alive connections are often poorly supported.
HTTP is also complex. If you think it isn’t, you’re wrong. Here is a small example. You might write code like this:
    def simple_app(environ, start_response):
        # Deliberately wrong: a 304 response is sent together with a body.
        status = '304 Not Modified'
        headers = [('Content-Length', '5')]
        start_response(status, headers)
        return [b'hello']
With this code the server returns a response with the word “hello” in its body (tested with wsgiref). But here is what the spec says:
The 304 response MUST NOT contain a message-body, and thus is always terminated by the first empty line after the header fields
This means the client will treat “hello” as the beginning of the response to the next request on the same connection. In some setups this leads to cache poisoning and security vulnerabilities. How many programmers who use HTTP are aware of this? And there are many more subtle details like it.
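To make this concrete, here is a rough sketch of what a client sees on a keep-alive connection. It feeds hand-written bytes to Python's `http.client.HTTPResponse` through an in-memory stand-in for a socket. `KeepAliveStream` and `FakeSocket` are hypothetical helpers made up for this illustration, and constructing `HTTPResponse` directly is something the standard library does internally rather than a recommended public usage, so treat it as a demonstration only.

```python
import http.client
import io


class KeepAliveStream(io.BytesIO):
    """In-memory stand-in for a keep-alive TCP connection (hypothetical)."""

    def close(self):
        # http.client closes its file object after each response; a real
        # keep-alive socket would stay open, so closing is ignored here.
        pass


class FakeSocket:
    """Offers only the makefile() call that HTTPResponse needs (hypothetical)."""

    def __init__(self, stream):
        self._stream = stream

    def makefile(self, *args, **kwargs):
        return self._stream


# Bytes a server could emit for two requests on one connection: the faulty
# 304 carrying a 5-byte body, followed by a perfectly valid 200.
wire = (
    b"HTTP/1.1 304 Not Modified\r\n"
    b"Content-Length: 5\r\n"
    b"\r\n"
    b"hello"
    b"HTTP/1.1 200 OK\r\n"
    b"Content-Length: 2\r\n"
    b"\r\n"
    b"ok"
)
sock = FakeSocket(KeepAliveStream(wire))

# First response: the client stops at the blank line, as the spec requires,
# and leaves "hello" unread on the connection.
first = http.client.HTTPResponse(sock)
first.begin()
first_body = first.read()

# Second response: parsing starts at the leftover "hello", so the status
# line no longer begins with "HTTP/".
second = http.client.HTTPResponse(sock)
try:
    second.begin()
    second_error = None
except http.client.BadStatusLine as exc:
    second_error = exc
```

Here `http.client` fails loudly with `BadStatusLine`, which is the benign outcome. A more lenient client or an intermediary may instead attribute the stray bytes to the next response.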
HTTP must not be used for internal messaging just because it’s easy. It is easy, but not simple. Quite the contrary: it’s a complicated protocol, with 5 RFCs describing only the basics. And misunderstanding even the smallest part of the spec may lead to a security vulnerability.
Frankly, most microservices use a subset of HTTP which, for example, only recognizes response code 200 (and treats all others as failure), doesn’t use special headers and the like, and so may never run into these issues. Still, this is not real HTTP (while proxies such as HAProxy, often used for load balancing, expect real HTTP), and one has to be very fluent in the HTTP spec to find a safe subset of it.
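As an illustration of what such a subset might look like on the client side, here is a minimal sketch. The rules it enforces (only status 200, no Transfer-Encoding, a Content-Length that matches the body) are assumptions picked for this example and not a vetted safe subset. `ServiceError` and `unwrap_response` are hypothetical names.

```python
class ServiceError(Exception):
    """Raised for any response outside the agreed subset (hypothetical)."""


def unwrap_response(status, headers, body):
    """Return the body of a plain 200 response; treat anything else as failure.

    `headers` is a list of (name, value) pairs and `body` is bytes.
    """
    # Header names are case-insensitive in HTTP, so normalise them first.
    by_name = {name.lower(): value for name, value in headers}

    if status != 200:
        raise ServiceError(f"unexpected status {status}")
    if "transfer-encoding" in by_name:
        raise ServiceError("encoded bodies are outside the subset")

    try:
        declared = int(by_name["content-length"])
    except (KeyError, ValueError):
        raise ServiceError("Content-Length missing or malformed") from None
    if declared != len(body):
        raise ServiceError("Content-Length does not match the body")
    return body


# A well-formed reply passes through unchanged.
payload = unwrap_response(200, [("Content-Length", "5")], b"hello")

# The 304 from the earlier example is rejected instead of being interpreted.
try:
    unwrap_response(304, [("Content-Length", "5")], b"hello")
    rejected = False
except ServiceError:
    rejected = True
```

Being this strict keeps the client simple, but it only helps if every server and every proxy in between agrees on the same rules, which is exactly the hard part.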