Co-authored by Emanuel Bronshtein
Are you using one of the trending NoSQL databases such as MongoDB or CouchDB? Or maybe you are thinking of using one of those but are troubled by how secure they are? We will discuss the security of the application programming interfaces (APIs) and software development kits (SDKs) of NoSQL databases, while also diving into the application code consuming these databases and providing some examples and advice on the risks and mitigations.
NoSQL Databases Still Have Risks
NoSQL, which stands for Not Only SQL, is a common term for nonrelational databases. Popular NoSQL databases include the aforementioned MongoDB and CouchDB, along with Redis, Cassandra and more. NoSQL databases have become increasingly popular thanks to their benefits in particular use cases, especially big data and real-time web applications, where performance, scalability and flexibility are key.
Database security has been and will continue to be one of the more critical aspects of application security. Access to the database grants an attacker a dangerous amount of control over the most critical information. Although the number of SQL injection vulnerabilities has been declining since 2008 thanks to secure frameworks and improved awareness, SQL injection remains a high-impact risk.
With the emergence of new databases and query techniques, old attack methods stop working and new ones emerge. For example, many NoSQL databases do not use SQL. Instead, queries are expressed as JavaScript Object Notation (JSON) documents and are often sent through an HTTP API, so classic SQL injection payloads no longer apply. However, NoSQL definitely does not imply zero risk. In fact, NoSQL databases are vulnerable to injection attacks, cross-site request forgery (CSRF) and other vulnerabilities.
In our paper “No SQL, No Injection? Examining NoSQL Security,” presented at the Web 2.0 Security and Privacy conference, we demonstrated a number of injection techniques in different runtimes using MongoDB. The paper also discusses Web APIs and their risks, such as CSRF, and offers some deployment recommendations. We recommend reading the paper, which includes actual code samples of possible exploits.
Knowing the risks is key to protecting against them. Automated security testing also matters, because it produces consistent results. Web application scanners, for instance, can use rules for finding vulnerabilities in NoSQL databases to help you protect against the new exploitation techniques.
Examples of NoSQL Exploits
For those of you who prefer to get more technical, here are a few examples of exploits. More are fleshed out in the full paper.
Consider the following situation: A PHP application has a login mechanism where the username and password are sent from the user’s browser via HTTP POST. The same issue applies to parameters sent via HTTP GET. A typical POST payload would look like:
username=tolkien&password=hobbit
The backend PHP code to process it and query MongoDB for the user would look like:
// Vulnerable: request values go into the query without any type check
$db->logins->find(
    array(
        "username" => $_POST["username"],
        "password" => $_POST["password"]
    )
);
But PHP automatically turns request parameters whose names contain square brackets into associative arrays, which allows an attacker to send the following malicious payload:
username[$ne]=1&password[$ne]=1
PHP translates this input into:
array(
    // '$ne' is single-quoted so PHP does not treat it as a variable
    "username" => array('$ne' => 1),
    "password" => array('$ne' => 1)
);
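MongoDB operators start with a dollar sign. In this example, $ne is the not-equals operator, so the query selects all documents where username is not equal to 1 and password is not equal to 1. Since no stored username or password equals 1, every document in the collection matches, and the attacker passes the login check without knowing any valid credentials.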
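To make the effect concrete, here is a minimal sketch of how such a query behaves. It is written in Python rather than PHP so that it runs with only the standard library, and it does not talk to MongoDB at all: the functions matches and find are hypothetical stand-ins that implement only equality and $ne, and the second account in logins is made-up sample data.

```python
def matches(document, query):
    """Toy matcher: supports only equality and the $ne operator."""
    for field, condition in query.items():
        value = document.get(field)
        if isinstance(condition, dict):
            # Operator document such as {"$ne": 1}
            for operator, operand in condition.items():
                if operator != "$ne":
                    raise ValueError(f"unsupported operator: {operator}")
                if value == operand:
                    return False
        elif value != condition:
            # Plain value: the field must be equal to it
            return False
    return True


def find(collection, query):
    """Return every document in the collection that satisfies the query."""
    return [doc for doc in collection if matches(doc, query)]


# Hypothetical sample data standing in for the logins collection
logins = [
    {"username": "tolkien", "password": "hobbit"},
    {"username": "lewis", "password": "narnia"},
]

# Intended use: both fields must match exactly
honest = find(logins, {"username": "tolkien", "password": "hobbit"})

# Injected operators: "not equal to 1" holds for every stored account
injected = find(logins, {"username": {"$ne": 1}, "password": {"$ne": 1}})

print(len(honest), "document(s) for the honest query")
print(len(injected), "document(s) for the injected query")
```

The honest query returns only the matching account, while the injected query returns the whole collection, which is why a login check that only tests for a nonempty result is bypassed.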
To mitigate this issue, cast the parameters received from the request to the proper type, in this case string:
// Casting to string prevents an injected array from acting as an operator
$db->logins->find(
    array(
        "username" => (string)$_POST["username"],
        "password" => (string)$_POST["password"]
    )
);
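The same idea carries over to runtimes where a request body is parsed into nested structures, for example from JSON. The following is a minimal sketch in Python, not the approach from the paper: it rejects non-string values instead of casting them, and the names require_string and build_login_query are hypothetical helpers introduced here for illustration.

```python
def require_string(params, name):
    """Return params[name] only if it is a plain string, else raise."""
    value = params.get(name)
    if not isinstance(value, str):
        # Nested structures such as {"$ne": 1} are refused outright
        raise ValueError(f"{name} must be a string")
    return value


def build_login_query(params):
    """Build the login query from already-parsed request parameters."""
    return {
        "username": require_string(params, "username"),
        "password": require_string(params, "password"),
    }


# A well-formed request yields a plain equality query
query = build_login_query({"username": "tolkien", "password": "hobbit"})
print(query)

# An injected operator document is rejected before any query is built
try:
    build_login_query({"username": {"$ne": 1}, "password": {"$ne": 1}})
except ValueError as error:
    print("rejected:", error)
```

Rejecting unexpected types, rather than silently converting them, is a design choice: it also surfaces the malformed request so it can be logged.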
Another example, this one language-agnostic, comes from one of the most common causes of SQL injection: building the query as a string that includes user input without proper encoding. The JSON query structure makes this harder to do in modern data stores such as MongoDB. Nevertheless, it is still possible. Let’s examine a login form that sends its username and password parameters via an HTTP POST to the backend, which constructs the query by concatenating strings. For example, the developer would do something like:
string query =
    "{ username: '" + post_username + "', password: '" + post_password + "' }";
With malicious input, this query can be made to ignore the password, allowing the attacker to log into a user account without knowing it. An example of malicious input:
Username: tolkien', $or: [ {}, { 'a':'a
Password: ' } ], $comment:'successful MongoDB injection
Without encoding, this input produces the following query:
{
    username: 'tolkien',
    $or: [ {}, { 'a': 'a', password: '' } ],
    $comment: 'successful MongoDB injection'
}
That is, the password becomes a redundant part of the query: it has been moved inside the $or clause, whose first alternative is the empty query {}, which always matches, and the trailing $comment does not affect the result. Today, every language has good native libraries for JSON encoding, so there is no reason to construct JSON queries from strings. For more examples and information on other risks such as CSRF, we recommend you read the complete paper.
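The difference between the two approaches can be sketched in a few lines of Python using only the standard json module. The function names are hypothetical, and the malicious values are the ones from the example above, written so that the query template supplies the final closing quote. With a real database driver you would normally pass the structured query object itself rather than an encoded string, but the encoding step shows why the injected text stays harmless.

```python
import json


def build_query_by_concatenation(username, password):
    """Vulnerable: user input is pasted straight into the query text."""
    return "{ username: '" + username + "', password: '" + password + "' }"


def build_query_by_encoding(username, password):
    """Safer: build a data structure and let the JSON library encode it."""
    return json.dumps({"username": username, "password": password})


# Malicious input from the example (no trailing quote; the template adds it)
evil_username = "tolkien', $or: [ {}, { 'a':'a"
evil_password = "' } ], $comment:'successful MongoDB injection"

concatenated = build_query_by_concatenation(evil_username, evil_password)
encoded = build_query_by_encoding(evil_username, evil_password)

# Decoding the encoded query shows which top-level keys it really has
decoded = json.loads(encoded)

print(concatenated)
print(encoded)
print(sorted(decoded))
```

In the concatenated string, $or and $comment appear as operators at the top level of the query. In the encoded version, the whole payload remains an ordinary string value of the username and password fields, so the query still has only those two keys.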
Security Researcher, IBM Analytics