In order to process a large amount of data in Yandex.Metrika, Yandex created a column-oriented DBMS ClickHouse. During penetration tests, we encountered ClickHouse in the systems of statistics, event collection, stock exchange systems.
The main advantages of ClickHouse are speed, flexibility, scalability, and the main application areas are analytics systems.
Column-oriented DBMSs provide high performance for tasks in which it is necessary to query data from one column frequently, while the rest of the data from the row is not needed.
Due to the structure of the stored data, the only the needed columns are accessed, which significantly increases the performance. It also makes it possible to compress the data more efficiently during storage, because the data in one column is homogeneous.
See more details: https://en.wikipedia.org/wiki/Column-oriented_DBMS
The Clickhouse developers outlined several significant limitations that exist in the DBMS:
- Limited support for JOIN
- No UPDATE or DELETE
- Weak compatibility with SQL standard
Lack of common UPDATE and DELETE operators can be explained with peculiarities of storage architecture. The data is stored in columns and in order to simply delete some record it is necessary to read and mutate n columns.
However, support for ALTER UPDATE and ALTER DELETE was added in 2018. This is how they differ from standard UPDATE and DELETE:
- Executed asynchronously
- Do not block inserts
- Do not block queries
- Do not block each other
Unlike some SQL dialects, ClickHouse has strict typing. There are no implicit conversions between types.
Instead of the standard UNION, ClickHouse only supports UNION ALL to combine requests:
The useful (for hackers) INTO OUTFILE expression works only in the console client. There is no access to it via the HTTP-interface (https://github.com/yandex/ClickHouse/blob/9a0c3f4b820b4bf7bb2307f32dd902fdc7c4241d/dbms/programs/server/HTTPHandler.cpp#L625).
There 2 connection types for ClickHouse:
- Native TCP
Both protocols can also be wrapped in TLS, but in practice, it’s rarely done.
The HTTP interface in ClickHouse makes it easy to write a connector to the DB in any language.
When using the HTTP GET method, only requests that do not change the data are allowed, and in order to change or create records, you need to use the POST method. At the same time, the request itself can be transmitted both in the body of the POST request and in the query string (both GET and POST requests).
How do I find ClickHouse online? Below is a sample request to search for publicly available instances through Shodan:
The ClickHouse language supports the tabular url function, which allows you to access remote hosts via the HTTP and HTTPS protocols (only).
There is also a tabular file function that allows you to read files from a certain directory, and the restriction cannot be bypassed.
Since ClickHouse has a MySQL client built-in, of course, we immediately thought about reading arbitrary files via LOAD DATA LOCAL INFILE. But it turned out that such a vulnerability did exist, and it had already been fixed by the internal Yandex security team. They also fixed vulnerabilities allowing protocol smuggling (via username) in JDBC connections.
Useful information can be obtained from system tables, they provide access to system status information, available databases, tables and columns, currently running requests (processes table), etc. More about system tables: https://clickhouse.yandex/docs/ru/operations/system_tables/.
ClickHouse’s HTTP interface means that some web application-specific attacks may be relevant for this DBMS.
Reflected File Download
The HTTP interface of ClickHouse may allow a Reflected File Download attack. For that to work, it must meet one of the following conditions:
- There’s no password
- Password was saved in the victim’s browser
- We know the password
Example exploit: http://clickhouse:8123/yandex.bat?query=select+’calc’
As you can see, the response from the webserver is not a 404 error, but an attachment with an attacker-controlled name. Thus, you can download an executable file with any content to the victim’s computer.
As with RFDs, CSRF exploitation is possible when there’s no authorization (or a password is known).
<form action="http://clickhouse:8123/" method="POST" enctype="text/plain" id="test">
<input type="hidden" name="SELECT * FROM system.tables --" value="" />
Another weakness of the HTTP interface is that SSRF attacks on another application can be used to make arbitrary requests to ClickHouse.
GET method will be enough for read-only requests. If authorization is required, the login and password can be passed to the query string: ?username=…&password=…
Sample error-based exploit via the tabular url function for SQL-injection in ClickHouse in a real application:
The result of the subquery is used as the hostname for the HTTP request in the url function. Since there is no such host, the Poco library (used in ClickHouse) throws out an exception that has a result in its text.
SSRF via SQL injection
Obviously, the url function allows you to conduct an SSRF attack through an SQL injection, such access to the AWS EC2 API is demonstrated below:
This time we pass the result of the url function again into the url function, so we see the content of the HTTP response in the error text. It should be noted that the error text is cast to the lower case (you can see it on the substring “result” in the screenshot).
|Current DB||SELECT currentDatabase()|
|List DB||SHOW databases OR SELECT * FROM system.databases|
|List columns||SELECT * FROM system.columns|
|List tables||SELECT * FROM system.tables|
|Concat||SELECT concat(‘one’,’two’) OR SELECT ‘one’||’two’|
|Comment||SELECT 1 /*comment*/ OR SELECT 1—comment|
|Dummy table (dual)||SELECT * FROM system.one|
|Current User||SELECT ‘current_user’,user FROM system.processes WHERE query LIKE ‘%current_user%’|
|Current os_user||SELECT os_user FROM system.processes|
|HTTP request||SELECT * FROM url(‘http://server’, ‘CSV’, col String)|
|Read file||SELECT * FROM file(‘nameFile’, ‘CSV’, col String)|
|Create an array of argument values||SELECT groupArray(x)|
|Concat array of strings||SELECT arrayStringConcat(arr[, separator])|
|Connect to MySQL||mysql(‘host:port’, ‘database’, ‘table’, ‘user’, ‘password'[, replace_query, ‘on_duplicate_clause’]);|
|JDBC connection||SELECT * FROM jdbc(‘jdbc:mysql://localhost:3306/?user=root&password=root’, ‘schema’, ‘table’)|