The State of Return Oriented Programming in Contemporary Exploits

Return Oriented Programming (ROP) is the general case of a technique often used when exploiting security vulnerabilities caused by memory corruption issues. In the past, techniques that fall into the general ROP category have been referred to as “return to libc”, “return to PLT” and so on, depending on the specific circumstances in which ROP was used. These days, due to the widespread implementation of exploit mitigation technologies such as non-executable stack and non-executable heap in popular operating systems, ROP has become a more frequently used technique in the exploitation of memory corruption vulnerabilities.

So why Return Oriented Programming (ROP)?

Memory corruption security vulnerabilities occur when a privileged program is coerced into corrupting its own memory space, such that the memory areas corrupted have an impact on the secure functioning of the program. The privileged program may be coerced into such a state by an unprivileged user (attacker) by supplying the program specifically crafted data as input, which triggers a particular flaw in the program. Such memory corruption could result in the program overwriting areas of its own memory in such a way that the privileged program would perform actions on behalf of the unprivileged user – actions which the unprivileged user has no privileges to perform on the system.

Or it could be worse – some of the specifically crafted data that is supplied by the attacker might be misinterpreted as executable code due to the memory corruption, eventually being executed in the context of the privileged program! Such executable code that does the attacker's bidding, embedded as a part of the specifically crafted data supplied as input by the attacker, is called shellcode.

In this latter case, the unprivileged attacker has more complete control over the privileges the system grants the privileged program, since the attacker can craft the shellcode to exercise any such privileges. Code size limitations for such attacker supplied code are common, but this limitation is generally circumvented relatively easily by using multi-stage shellcode, where the first stage of shellcode loads subsequent stages of shellcode into memory. Because this attack relies on a memory corruption flaw diverting the execution flow of the program so that the shellcode is executed, the non-executable stack and non-executable heap exploit mitigation technologies in modern operating systems are designed to stop it.

Marking as non-executable any areas of the privileged program's memory that are likely to contain unprivileged user (attacker) supplied data thwarts the execution of shellcode. When an attacker manages to divert the privileged program's flow to the shellcode, the shellcode, along with the rest of the program data, is likely to be on the stack or in the heap. Thanks to the non-executable stack and non-executable heap mitigation technologies, these memory areas are now non-executable, so the attempt triggers a fault which the operating system can deal with.
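As a rough illustration of the idea, the following toy model treats memory as a handful of regions with an executable flag and refuses an instruction fetch from any region that is not executable. The region names, addresses and the `fetch_check` function are hypothetical; a real operating system enforces this through page permissions and hardware support, not through code like this.

```python
from dataclasses import dataclass


class ExecutionFault(Exception):
    """Raised when control reaches memory that is not marked executable."""


@dataclass(frozen=True)
class Region:
    name: str
    start: int
    size: int
    executable: bool

    def contains(self, address: int) -> bool:
        return self.start <= address < self.start + self.size


# Hypothetical layout: only the code region is executable.
REGIONS = [
    Region("code", 0x1000, 0x1000, executable=True),
    Region("heap", 0x8000, 0x1000, executable=False),
    Region("stack", 0xF000, 0x1000, executable=False),
]


def fetch_check(address: int) -> str:
    """Return the region name if an instruction fetch at address is allowed."""
    for region in REGIONS:
        if region.contains(address):
            if not region.executable:
                raise ExecutionFault(
                    f"fetch from non-executable {region.name} at {address:#x}"
                )
            return region.name
    raise ExecutionFault(f"fetch from unmapped address {address:#x}")
```

In this model, diverting execution to shellcode placed on the stack (for example `fetch_check(0xF010)`) raises `ExecutionFault`, while a fetch from the code region succeeds.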

ROP to circumvent mitigations

Now ROP can be used to circumvent these particular mitigations. ROP makes use of actual executable code sequences, called ROP gadgets, in the program memory space (marked as executable memory). Rather than the attacker supplying executable code (shellcode) in the attacker supplied data, the attacker now supplies a sequence of data and return addresses, called a ROP chain. Then instead of diverting the program execution flow to the attacker supplied data (which can no longer be executable), the attacker diverts the program execution to an existing sequence of code in the program memory space which achieves the following:

  • Load the stack pointer with the memory location (address) of the attacker specified ROP chain. At this point, the program stack where the return addresses from calls are stored is hijacked by the attacker.
  • Optionally perform an action the attacker wants performed.
  • Return to the return address (to the next ROP gadget) specified in the attacker supplied ROP chain.

The next ROP gadgets can be of the following form:

  • Perform an action the attacker wants performed.
  • Perform actions the attacker doesn’t care about, because they are neither actions the attacker wants performed nor actions that hinder what the attacker is trying to achieve.
  • Return to the return address, the next ROP gadget in the ROP chain.

Eventually, by using a sequence of such ROP gadgets, each performing an action that ultimately adds up to the overall action the attacker wants performed by the privileged program on the attacker's behalf, the attacker achieves successful exploitation! This is more complex and harder to implement than straightforward shellcode, but possible all the same. Since the only code executed is code already present in program memory (ROP gadgets) marked as executable code, and the ROP chain itself is not executed as code, the non-executable stack and non-executable heap mitigations are circumvented.
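The chaining described above can be pictured with a toy model. In the sketch below, which is purely illustrative and not real exploit code, the "gadgets" are small Python functions that stand in for existing executable code, and the "ROP chain" is a list holding only addresses and data. The `ToyMachine` class, the gadget names and the addresses are all hypothetical.

```python
class ToyMachine:
    """Toy CPU: a 'return' pops the next address off the stack and jumps to it."""

    def __init__(self, gadgets, chain):
        self.gadgets = gadgets    # address -> callable; stands in for executable code
        self.stack = list(chain)  # attacker supplied data: addresses and operands only
        self.regs = {"a": 0, "b": 0}
        self.output = []

    def pop(self):
        # Index 0 is treated as the top of the stack.
        return self.stack.pop(0)

    def run(self):
        # Each iteration models one RET: pop the next address, transfer control.
        while self.stack:
            address = self.pop()
            gadget = self.gadgets.get(address)
            if gadget is None:
                raise RuntimeError(f"no executable code at {address:#x}")
            gadget(self)
        return self.output


def pop_a(m):    # models: pop a; ret
    m.regs["a"] = m.pop()


def pop_b(m):    # models: pop b; ret
    m.regs["b"] = m.pop()


def add_a_b(m):  # models: add a, b; ret
    m.regs["a"] += m.regs["b"]


def emit_a(m):   # models some useful action performed with register a
    m.output.append(m.regs["a"])


# Hypothetical addresses of the gadgets in "executable memory".
GADGETS = {0x1000: pop_a, 0x1010: pop_b, 0x1020: add_a_b, 0x1030: emit_a}

# The chain is pure data: gadget addresses interleaved with operands.
CHAIN = [0x1000, 40, 0x1010, 2, 0x1020, 0x1030]
```

Calling `ToyMachine(GADGETS, CHAIN).run()` returns `[42]`: nothing in `CHAIN` is ever executed as code, yet the order of the addresses decides what the machine computes.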

But this is not the end of the story. Modern operating systems also implement a mitigation technology called ASLR (Address Space Layout Randomization), which makes it very difficult for an attacker to construct a ROP chain with return addresses that point to the appropriate ROP gadgets. The reason is that ASLR adds randomness to the location (address) where code is loaded in the program memory space, so that the addresses of executable code sequences (including any used by the attacker as ROP gadgets) are not static and are difficult for the attacker to deduce.

But this is not a complete solution. Security vulnerabilities classed as information leakage/disclosure vulnerabilities may give the attacker the necessary information to successfully deduce the addresses of any code sequences in the program memory space the attacker wishes to use as ROP gadgets! Even though memory content leakage/address disclosure vulnerabilities are in themselves considered less severe, they could well hold the key to the successful exploitation of a memory corruption vulnerability, and as such are not to be taken lightly.

So with the mitigation technologies so far, ROP can still be used if the randomness introduced by ASLR can be circumvented using an information disclosure vulnerability (or a flaw in the ASLR implementation itself for that matter).
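The arithmetic behind this is simple, and a small sketch may make it concrete. Assuming a module is loaded as one contiguous block, every gadget sits at a fixed offset from the module's load base, so a single leaked pointer to a known location inside the module is enough to recover the base and hence every gadget address. All offsets, names and the base range below are hypothetical values chosen for illustration.

```python
import random

PAGE = 0x1000

# Hypothetical offsets of two gadgets and one known function within a module.
GADGET_OFFSETS = {"pop_a": 0x1A30, "add_a_b": 0x2F14}
KNOWN_FUNCTION_OFFSET = 0x4C80


def random_load_base(rng):
    """Pick a page-aligned base, loosely modelling what ASLR does at load time."""
    return rng.randrange(0x10000, 0x7FFF0) * PAGE


def base_from_leak(leaked_address, known_offset=KNOWN_FUNCTION_OFFSET):
    """Recover the load base from one leaked pointer into the module."""
    base = leaked_address - known_offset
    if base % PAGE != 0:
        raise ValueError("leak does not match the expected offset")
    return base


def rebase(offsets, base):
    """Turn module-relative offsets into absolute addresses."""
    return {name: base + offset for name, offset in offsets.items()}
```

For example, if `leak` is the disclosed address of the known function, `rebase(GADGET_OFFSETS, base_from_leak(leak))` yields the absolute gadget addresses for that particular run, regardless of which base the loader happened to pick.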

Many contemporary exploits only use ROP up to the point where they can either mark the memory where the shellcode resides as executable (using the VirtualProtect API function in Windows, for example) or allocate executable memory and load the shellcode into it, before running the shellcode. This is because shellcode is, in general, easier to construct than ROP chains.

Now this brings us to the latest and greatest implementations of technologies that are used for ROP detection/mitigation. Some of these techniques are implemented by the Microsoft Enhanced Mitigation Experience Toolkit (EMET) for Microsoft Windows. In fact, Microsoft held a contest in 2012 in conjunction with its BlueHat security conference, open to submissions of mitigation technologies that specifically deal with ROP. Subsequently, Microsoft implemented the mitigation technology proposed by the second place winner of the contest, Ivan Fratric, in Microsoft EMET (as of version 3.5 Technical Preview). The technique itself was called ROPGuard.

So what are some of the mitigation techniques ROPGuard proposes?

At appropriate points, such as at the entry to operating system API functions that are considered very useful when using ROP in exploits (such as VirtualProtect or LoadLibrary API functions in Windows) and other security critical API functions:

  • Verify the stack pointer points to the stack area assigned to the program thread. This way if the stack is switched to an attacker supplied ROP chain, that would be detected. This is rather straightforward.
  • Verify the return address to return from such a function is preceded by a call to that function and not what looks like an unrelated piece of code which could be a ROP gadget. This is relatively easy when working with compiler generated code where the compiler(s) used are known, since how an API call (via the import table) is performed by a given compiler is relatively consistent and takes only a few forms. But this can be much more complicated when working with hand optimized code or some virtual machine/interpreter/JIT compiler generated code where an effective call can be codified in many forms.
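The two checks above can be sketched in a few lines. This is a simplified illustration under stated assumptions, not ROPGuard's actual code: the function names are hypothetical, the stack bounds are assumed to be available for the current thread, and only three common 32-bit x86 call encodings are recognized.

```python
def stack_pointer_in_bounds(sp, stack_limit, stack_base):
    """True if sp lies in [stack_limit, stack_base), the thread's own stack range."""
    return stack_limit <= sp < stack_base


def return_address_follows_call(code, code_base, return_address):
    """Check whether the bytes just before return_address look like a call.

    code is the raw bytes of an executable region loaded at code_base.
    """
    idx = return_address - code_base
    if not 0 <= idx <= len(code):
        return False  # the return address is outside this code region
    # E8 rel32: 5-byte direct near call.
    if idx >= 5 and code[idx - 5] == 0xE8:
        return True
    # FF 15 disp32: 6-byte indirect call through memory, the typical
    # shape of a call through the import table on 32-bit x86.
    if idx >= 6 and code[idx - 6:idx - 4] == b"\xff\x15":
        return True
    # FF D0..D7: 2-byte call through a register.
    if idx >= 2 and code[idx - 2] == 0xFF and 0xD0 <= code[idx - 1] <= 0xD7:
        return True
    return False
```

A byte pattern match like this can be fooled by data bytes that happen to look like a call, and it misses the less common call forms mentioned above, which is part of why the check is harder for hand optimized or JIT generated code.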

The implementation also checks for some of these conditions to be valid for stack frames that are not the current stack frame (for example by using the frame pointer).

ROPGuard also proposes to disallow changing the executable status of non-executable memory areas (such as the stack and heap).

Another very interesting mitigation technology, called kBouncer, was proposed by Vasilis Pappas. In fact, this was the first prize winner, receiving a $200,000 prize. How does kBouncer work? Well, kBouncer requires a kernel mode component (whereas ROPGuard could be implemented in user mode alone). It makes use of the LBR (Last Branch Recording) feature of x86 / x64 processors and requires enabling this feature for returns from calls. Using the LBR feature has very little overhead compared to the BTS (Branch Trace Store) feature of the same processor architecture. LBR support has been present since the Intel Pentium 4 processors.

So with kBouncer, when an operating system call takes place (for example a call to the kernel implementation of the Windows Native API), the mitigation technology's implementation code in the kernel reads the LBR stack (which records 4 – 16 control transfer pairs depending on the CPU) and makes sure the return addresses are preceded by calls.
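The check can be modelled as follows. This is a toy sketch rather than kBouncer's implementation: the LBR stack is represented as a fixed-depth queue of (source, target) pairs for return instructions, and `call_preceded_addresses` is a hypothetical precomputed set of addresses that immediately follow a call instruction.

```python
from collections import deque

LBR_DEPTH = 16  # the text above cites 4 to 16 entries depending on the CPU


def make_lbr(depth=LBR_DEPTH):
    """Fixed-depth record of recent return transfers; oldest entries fall off."""
    return deque(maxlen=depth)


def returns_look_legitimate(lbr, call_preceded_addresses):
    """True if every recorded return landed right after a call instruction."""
    return all(target in call_preceded_addresses for _source, target in lbr)
```

A normal sequence of returns lands only on call-preceded addresses and passes the check, whereas a chain of returns into arbitrary gadget addresses leaves targets in the record that are not in the set, so the check fails when the next system call is made.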

The mitigation technologies however are not without circumventions.

The kBouncer mitigation technology, for example, can be circumvented with ROP code that doesn’t use gadgets with return (RET on x86/x64) instructions. For example, a ROP gadget could pop (POP instruction) the next ROP gadget address off the stack and jump (JMP instruction) to that address instead, in which case the LBR stack, which is recording only returns, will not reflect this control transfer.
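The effect of that filter can be shown with a small model. Assuming the recording is restricted to return instructions, as described earlier, the hypothetical `lbr_view` function below shows what the kernel component would see for a trace of control transfers; the trace format and addresses are made up for illustration.

```python
RECORDED_KINDS = frozenset({"ret"})  # assumption: only returns are recorded


def lbr_view(trace, recorded_kinds=RECORDED_KINDS, depth=16):
    """Return the last `depth` (source, target) pairs the filter would record."""
    records = [(src, dst) for kind, src, dst in trace if kind in recorded_kinds]
    return records[-depth:]


# The same two transfers, performed with RET and with POP + JMP respectively.
ret_chain = [("ret", 0x1003, 0x2000), ("ret", 0x2004, 0x3000)]
jmp_chain = [("jmp", 0x1003, 0x2000), ("jmp", 0x2004, 0x3000)]
```

Here `lbr_view(ret_chain)` contains both transfers, while `lbr_view(jmp_chain)` is empty, so a check that inspects only recorded returns has nothing to flag.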

These technologies however do raise the bar. You can find a further in-depth discussion on circumventing ROP mitigation by not using return (RET) instructions here.

Nishad Herath

Researcher, IBM X-Force

Nishad Herath has been involved in the information security community for close to two decades. Based in Australia, he is a part of the IBM X-Force Advanced Research group.