August 26, 2013 By Yong Chuan Koh 11 min read

As I was looking through my old projects, I came across this old DOS vulnerability in BIND9: CVE-2011-4313. This caught my attention because rather than the typical case of parsing error from malformed fields, the root cause for this crash arises because of BIND9’s cache implementation for DNS responses whereby the previous responses can be abused. Since this was disclosed some time ago, I decided it is now appropriate to share some technicalities. The following code snippets are based on v9.7.1-P2, an affected version. The reader is assumed to be familiar with the DNS packet format.

Before proceeding, we note that BIND9 stores DNS records in a similar manner:

In a nutshell, each DNS record type is stored as a dns_rdata_t structure. Similar DNS record-types are then collectively grouped together and headed by a dns_rdataset_t structure. The dns_rdataset_t stores group information such as type, class, covers, etc. The rdatasetheader_t also stores similar information as dns_rdataset_t.

Now we are all set to examine the vulnerability. First off, we looked at the official ISC advisory:

“…Organizations across the Internet reported crashes interrupting service on BIND 9 nameservers performing recursive queries. Affected servers crashed after logging an error in query.c with the following message: “INSIST(! dns_rdataset_isassociated(sigrdataset))” …” “…An as-yet unidentified network event caused BIND 9 resolvers to cache an invalid record, subsequent queries for which could crash the resolvers with an assertion failure.…”

The advisory offers 2 important clues:

  1. The DOS assertion occurs in query.c with the message “INSIST(! dns_rdataset_isassociated(sigrdataset))”
  2. BIND9 has to first cache an invalid record. Subsequent queries for the record would then trigger the assertion.

One advantage of working on open-source applications is the availability of, well, the source code. In query.c, the only function having the assertion is query_addadditional2().

static isc_result_t
query_addadditional2(void *arg, dns_name_t *name, dns_rdatatype_t qtype)
{
                …
                result = dns_db_findrdataset(db, node, version, dns_rdatatype_a, 0,
                                                                     client->now, rdataset, sigrdataset);
                …
                if (result == DNS_R_NCACHENXDOMAIN)
                                goto setcache;
                if (result == DNS_R_NCACHENXRRSET) {
                                dns_rdataset_disassociate(rdataset);
                                /*
                                 * Negative cache entries don’t have sigrdatasets.
                                 */
                                INSIST(! dns_rdataset_isassociated(sigrdataset));      //author: crash here
                }

                …
}

This function is called when BIND9 is crafting the A record in the ADDITIONAL section from the cache, for a subsequent query. Clearly, to trigger the assertion, dns_db_findrdataset() has to return DNS_R_NCACHENXRRSET with sigrdataset != NULL. So now we take a closer look at cache_findrdataset(), of which dns_db_findrdataset() is a wrapper of.

static isc_result_t
cache_findrdataset(dns_db_t *db, dns_dbnode_t *node, dns_dbversion_t *version,
                                   dns_rdatatype_t type, dns_rdatatype_t covers,
                                   isc_stdtime_t now, dns_rdataset_t *rdataset,
                                   dns_rdataset_t *sigrdataset)
{
                …
                matchtype = RBTDB_RDATATYPE_VALUE(type, covers);
                negtype = RBTDB_RDATATYPE_VALUE(0, type);
                if (covers == 0)                                                    
                                sigmatchtype = RBTDB_RDATATYPE_VALUE(dns_rdatatype_rrsig, type);
                … 
                for (header = rbtnode->data; header != NULL; header = header_next) {
                                …
                                } else if (EXISTS(header)) {
                                                if (header->type == matchtype)
                                                                found = header;  
                                                else if (header->type == RBTDB_RDATATYPE_NCACHEANY ||
                                                                 header->type == negtype)
                                                                found = header;       //author: we need the found entry to be negative
                                                else if (header->type == sigmatchtype)

                                                                foundsig = header; //author: we need a RRSIG for found entry
                                }

                }
                if (found != NULL) {
                                bind_rdataset(rbtdb, rbtnode, found, now, rdataset);//author: “convert” rdatasetheader to rdataset
                                if (foundsig != NULL)

                                                bind_rdataset(rbtdb, rbtnode, foundsig, now,//author: “convert” rdatasetheader to rdataset
                                                                      sigrdataset);

                }
                …
                if (RBTDB_RDATATYPE_BASE(found->type) == 0) { 
                                /*                                                                            
                                 * We found a negative cache entry.
                                 */
                                if (NXDOMAIN(found))

                                                result = DNS_R_NCACHENXDOMAIN;
                                else
                                                result = DNS_R_NCACHENXRRSET;   //author: we need to return this, for the crash
                }

                return (result);
}

In this function, BIND9 walks the link-list to find the rdatasetheader and sigrdatasetheader for the target matchtype, negtype and sigmatchtype. The rdatasetheader and sigrdatasetheader are then “converted” to rdataset and “sigrdataset. Finally, if rdataset is a negative cache entry, then it returns DNS_R_NCACHENXRRSET if the RDATASET_ATTR_NXDOMAIN bit is not set.

At this instance, we note that query_addadditional2() calls cache_findrdataset() with these arguments:

  1. type = 1
  2. covers = 0

Which would result in these variables:

  1. matchtype = (covers << 16) | type = 0x01
  2. negtype = (type << 16) | type = 0x010000
  3. sigmatchtype = (type << 16) | dns_rdatatype_rrsig = 0x01002E

So here, 2 crucial questions result:

  1. How does BIND9 cache a record such that it has an rdatasetheader with a negative type==0x010000?
  2. Under what circumstances would BIND9 cache a negative cache entry TOGETHER with an accompanying RRSIG record (by default, RRSIG would not be cached for negative cache entry)?

To answer the first question, we note that BIND9 initializes the rdatasetheader->type value during the additional of records into cache, in addrdataset().

newheader->type = RBTDB_RDATATYPE_VALUE(rdataset->type, rdataset->covers);
                                       = ((rdataset->covers) << 16) | (rdataset->type)

Which means that the record to be cached should have:

  1. rdataset->covers = 0x01
  2. rdataset->type = 0x00

The only way for rdataset->type == 0x00 is for BIND9 to cache a negative entry. This, again, verifies the advisory of “…BIND9 has to first cache an invalid record…

isc_result_t
dns_ncache_addoptout(dns_message_t *message, dns_db_t *cache,
                                     dns_dbnode_t *node, dns_rdatatype_t covers,
                                     isc_stdtime_t now, dns_ttl_t maxttl,
                                     isc_boolean_t optout, dns_rdataset_t *addedrdataset)
{
                …
                ncrdatalist.type = 0;                                      
                ncrdatalist.covers = covers;                       
                …
}

As we follow the caller of addrdataset()/dns_ncache_addoptout(), we also see that rdataset->covers is determined by the DNS query type in resquery_response().

static void
resquery_response(isc_task_t *task, isc_event_t *event)
{
                …
if (WANTNCACHE(fctx)) {
                                dns_rdatatype_t covers;
                                if (message->rcode == dns_rcode_nxdomain)
                                                covers = dns_rdatatype_any;
                                else
                                                covers = fctx->type; //author: this is the DNS query type

                                /*
                                 * Cache any negative cache entries in the message.

                                 */
                                result = ncache_message(fctx, query->addrinfo, covers, now);

                }
                …
}

The astute reader may have wondered about the impact of rdatasetheader->type having the RBTDB_RDATATYPE_NCACHEANY (0xFF0000) value in cache_findrdataset(). This would not trigger the assertion for 2 reasons:

  1. In resquery_response(), this is assigned when the NXDOMAIN bit is set. But this would not return DNS_R_NCACHENXRRSET in cache_findrdataset()
  2. In add(), if rdataset->covers == 0xFF, then all other rdatasetheader would be cleared, including the RRSIG rdatasetheader needed (see rest of article)

static isc_result_t
add(dns_rbtdb_t *rbtdb, dns_rbtnode_t *rbtnode, rbtdb_version_t *rbtversion,
    rdatasetheader_t *newheader, unsigned int options, isc_boolean_t loading,
    dns_rdataset_t *addedrdataset, isc_stdtime_t now)
{
                …
                rdtype = RBTDB_RDATATYPE_BASE(newheader->type);
                                if (rdtype == 0) {
                                                /*
                                                 * We’re adding a negative cache entry.
                                                 */
                                                covers = RBTDB_RDATATYPE_EXT(newheader->type);

                                                if (covers == dns_rdatatype_any) {
                                                                /*
                                                                 * We’re adding an negative cache entry
                                                                 * which covers all types (NXDOMAIN,
                                                                 * NODATA(QTYPE=ANY)).
                                                                 *
                                                                 * We make all other data stale so that the
                                                                 * only rdataset that can be found at this
                                                                 * node is the negative cache entry.
                                                                 */
                                                                …
}

The last piece to the puzzle is to find out how to link a RRSIG rdatasetheader to the negative-cached record. It turns out to be quite simple. For example, after receiving a “NS” response with an A record in the ADDITONAL section, and both “NS” and “A” have RRSIG records, the cache would look like this:

rdatasetheader_1contains the RRSIG record for the “A” record in rdataheader_2. Similarly for rdatasetheader_3 and rdatasetheader_4. Recall that earlier, we’ve seen that the rdataset->type should be 0x01. Therefore, this has to be an “A” query with a negative response. An example of negative response would be one with no ANSWER records. Therefore subsequently, if BIND9 receives a negative response for an “A” query, then it would update “A” record in the list to a negative type without removing the original RRSIG record, as shown below:

So there we have it; the conditions required to trigger the crash. In this state, when this BIND9 receives another “NS” query, it would look into its cache to form the response. This is the key to the problem. As it prepares the ADDITIONAL section in query_addadditional2(), rdataset_2 and rdataset_1 would be found and DNS_R_NCACHENXRRSET returned. Thus triggering the assertion. Implementation bugs like this are tricky to find because we need to know how the internals work. I hope everyone had as much fun looking at this vulnerability as I do. Lastly if anyone has any idea on how else to trigger this, drop me a note!

More from X-Force

Strela Stealer: Today’s invoice is tomorrow’s phish

12 min read - As of November 2024, IBM X-Force has tracked ongoing Hive0145 campaigns delivering Strela Stealer malware to victims throughout Europe - primarily Spain, Germany and Ukraine. The phishing emails used in these campaigns are real invoice notifications, which have been stolen through previously exfiltrated email credentials. Strela Stealer is designed to extract user credentials stored in Microsoft Outlook and Mozilla Thunderbird. During the past 18 months, the group tested various techniques to enhance its operation's effectiveness. Hive0145 is likely to be…

Hive0147 serving juicy Picanha with a side of Mekotio

17 min read - IBM X-Force tracks multiple threat actors operating within the flourishing Latin American (LATAM) threat landscape. X-Force has observed Hive0147 to be one of the most active threat groups operating in the region, targeting employee inboxes at scale, with a primary focus on phishing and malware distribution. After a 3-month break, Hive0147 returned in July with even larger campaign volumes, and the debut of a new malicious downloader X-Force named "Picanha,” likely under continued development, deploying the Mekotio banking trojan. Hive0147…

FYSA – Critical RCE Flaw in GNU-Linux Systems

2 min read - Summary The first of a series of blog posts has been published detailing a vulnerability in the Common Unix Printing System (CUPS), which purportedly allows attackers to gain remote access to UNIX-based systems. The vulnerability, which affects various UNIX-based operating systems, can be exploited by sending a specially crafted HTTP request to the CUPS service. Threat Topography Threat Type: Remote code execution vulnerability in CUPS service Industries Impacted: UNIX-based systems across various industries, including but not limited to, finance, healthcare,…

Topic updates

Get email updates and stay ahead of the latest threats to the security landscape, thought leadership and research.
Subscribe today