July 2, 2015 By Aviv Ron 3 min read

Co-authored by Emanuel Bronshtein

Are you using one of the trending NoSQL databases such as MongoDB or CouchDB? Or maybe you are thinking of using one of those but are troubled by how secure they are? We will discuss the security of the application programming interfaces (APIs) and software development kits (SDKs) of NoSQL databases, while also diving into the application code consuming these databases and providing some examples and advice on the risks and mitigations.

NoSQL Databases Still Have Risks

NoSQL, which stands for Not Only SQL, is a common term for nonrelational databases. Among popular NoSQL databases you will find the aforementioned MongoDB and CouchDB, along with Redis, Cassandra and more. NoSQL databases have become increasingly popular thanks to their benefits in particular use cases, especially in big data and real-time Web usages where performance, scalability and flexibility are key.

Database security has been and will continue to be one of the more critical aspects of application security. Access to the database grants an attacker a dangerous amount of control over the most critical information. Although the number of SQL injection vulnerabilities has been declining since 2008 due to use of secure frameworks and improved awareness, it has remained a high-impact risk.

With the emergence of new databases and query techniques, the old attack methods become obsolete and new ones emerge. For example, most NoSQL databases do not use SQL and instead use the JavaScript Object Notation (JSON) query language and an HTTP API. This makes old techniques like SQL injection obsolete. However, NoSQL definitely does not imply zero risk. In fact, NoSQL databases are vulnerable to injection attacks, cross-site request forgery (CSRF) and other vulnerabilities.

In a paper we presented at the Web 2.0 Security and Privacy conference titled “No SQL, No Injection? Examining NoSQL Security,” we demonstrated a number of techniques for injections in different runtimes using MongoDB. Additionally, the paper discusses Web APIs and their risks, such as CSRF, and some deployment recommendations. We recommend reading the paper, which includes actual code samples of possible exploits.

Knowing the risks is key for protecting against them. Having automated security testing is also significant for achieving consistent results. Web application scanners, for instance, can use rules for finding vulnerabilities in NoSQL databases to help you protect against the new exploitation techniques.

Examples of NoSQL Exploits

For those of you who prefer to get more technical, here are a few examples of exploits. More are fleshed out in the full paper.

Consider the following situation: A PHP application has a login mechanism where the username and password are sent from the user’s browser via HTTP POST. This vulnerability is applicable to HTTP GET, as well. A typical POST payload would look like:

username=tolkien&password=hobbit

The backend PHP code to process it and query MongoDB for the user would look like:

db->logins->find(

array(

“username”=>$_POST[“username”],

“password”=>$_POST[“password”]

);

But PHP has a built-in mechanism for associative arrays that allows an attacker to send the following malicious payload:

username[$ne]=1&password[$ne]=1

PHP translates this input into:

array(

“username” => array(“$ne” => 1),

“password” => array(“$ne” => 1)

);

MongoDB keywords start with a dollar sign and, specifically in this example, $ne is the keyword for the not equals operator, which means select all documents where username is not equal to 1 and password is not equal to 1.

To mitigate this issue, cast the parameters received from the request to the proper type, in this case string:

db->logins->find(

array(

“username”=>(string)$_POST[“username”],

“password”=>(string)$_POST[“password”]

);

Another language-agnostic example that is one of the common reasons for SQL injections stems from building the query from string literals, which include user input without proper encoding. The JSON query structure makes this harder to achieve in modern data stores such as MongoDB. Nevertheless, it is still possible. Let’s examine a login form that sends its username and password parameters via an HTTP POST to the back end, which constructs the query by concatenating strings. For example, the developer would do something like:

string query =

“{ username: ‘” + post_username + “‘, password: ‘” + post_password + “‘ }”

With malicious input, this query can be turned to ignore the password, allowing the attacker to log into a user account without the password. An example for malicious input:

Username: tolkien’, $or: [ {}, { ‘a’:’a

Password:‘ } ], $comment:’successful MongoDB injection’

Without encoding, this input will be constructed into the following query:

username: ‘tolkien’,

$or: [ {}, { ‘a’: ‘a’, password: ” } ],

$comment: ‘successful MongoDB injection’

That is, the password becomes a redundant part of the query because an empty query {} is always true and the comment in the end does not affect it. Today, every language has good native libraries for JSON encoding, so there is no reason to construct JSON queries from strings. For more examples and information on other risks such as CSRF, we recommend you read the complete paper.

More from Application Security

PixPirate: The Brazilian financial malware you can’t see

10 min read - Malicious software always aims to stay hidden, making itself invisible so the victims can’t detect it. The constantly mutating PixPirate malware has taken that strategy to a new extreme. PixPirate is a sophisticated financial remote access trojan (RAT) malware that heavily utilizes anti-research techniques. This malware’s infection vector is based on two malicious apps: a downloader and a droppee. Operating together, these two apps communicate with each other to execute the fraud. So far, IBM Trusteer researchers have observed this…

From federation to fabric: IAM’s evolution

15 min read - In the modern day, we’ve come to expect that our various applications can share our identity information with one another. Most of our core systems federate seamlessly and bi-directionally. This means that you can quite easily register and log in to a given service with the user account from another service or even invert that process (technically possible, not always advisable). But what is the next step in our evolution towards greater interoperability between our applications, services and systems?Identity and…

Audio-jacking: Using generative AI to distort live audio transactions

7 min read - The rise of generative AI, including text-to-image, text-to-speech and large language models (LLMs), has significantly changed our work and personal lives. While these advancements offer many benefits, they have also presented new challenges and risks. Specifically, there has been an increase in threat actors who attempt to exploit large language models to create phishing emails and use generative AI, like fake voices, to scam people. We recently published research showcasing how adversaries could hypnotize LLMs to serve nefarious purposes simply…

Topic updates

Get email updates and stay ahead of the latest threats to the security landscape, thought leadership and research.
Subscribe today