A hardware vulnerability, discovered independently by researchers from academia and Google, exposes a microprocessor flaw that, if exploited, could allow an attacker to read data from privileged kernel memory.
This vulnerability is considered an important flaw for complex infrastructures and cloud deployments and must be addressed to prevent potential future impact.
Because this flaw is present in nearly all modern microprocessors, it can affect any device that uses them, across operating systems running on mobile devices, laptops, workstations and servers.
It is important to note that to exploit this vulnerability, a malicious actor would need to execute untrusted code on the physical system or on a virtual machine linked to that system. This may include running content from webpages loaded in web browsers or accessed through mobile apps.
One Flaw, Three Variations
The flaw has three technical variations, each assigned its own CVE. Researchers have named two of them “Spectre” and one of them “Meltdown.” Each of those could result in:
- Privilege escalation
- Data leakage from privileged kernel memory
- Performance degradation as a side effect of patching
Performance impact will vary in each deployment and case and cannot be quantified in any absolute terms.
Essence of This Flaw
The CPU flaw stems from the way modern processors attempt to optimize performance by speculating about correct processing paths. On most modern systems, data is stored in three general locations: the processor’s cache, main memory, and disk. Each has a different access speed and storage size. The cache is smaller and faster than main memory, which is itself smaller and faster than on-disk storage.
This affects how quickly programs can be processed. Programs are not linear—they frequently branch between different possible processing paths. Sometimes the decision on which branch to follow requires information stored in a slow memory space, such as main memory or on-disk.
Rather than idling until the information is retrieved, the processor will often speculate as to which branch will be followed. It will then continue to process this branch until the information is finally retrieved. If it chose the correct branch, it continues processing; if it chose the wrong branch, it flushes the now-incorrect processing results and then follows the correct branch.
Often, speculative execution results in the processor executing instructions before it knows whether the commands violate security protections.
Overall, this CPU vulnerability takes advantage of different aspects of timing in speculative processing, and more specifically, the mis-speculation window: the period in which the processor is speculatively executing the wrong path but has not yet learned which path is correct.
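The speculate-then-roll-back behavior described above can be illustrated with a small toy model. This is only a sketch, not how any real processor is implemented: `ToyCPU`, its `registers` and `cache` fields, and the two path functions are hypothetical names introduced for illustration. The point it shows is that a mis-speculated path has its register results discarded, while the cache lines it touched stay loaded.

```python
from dataclasses import dataclass, field


@dataclass
class ToyCPU:
    """Toy model: architectural state is rolled back, cache state is not."""
    registers: dict = field(default_factory=dict)
    cache: set = field(default_factory=set)

    def run_branch(self, predicted_taken, actual_taken, taken_path, fallthrough_path):
        # Snapshot the architectural state before speculating.
        snapshot = dict(self.registers)
        # Speculatively run the predicted path while the condition is unresolved.
        (taken_path if predicted_taken else fallthrough_path)(self)
        if predicted_taken != actual_taken:
            # Mis-speculation: discard register results, then run the right path.
            # The cache is deliberately left untouched by the rollback.
            self.registers = snapshot
            (taken_path if actual_taken else fallthrough_path)(self)
        return predicted_taken == actual_taken


def taken(cpu):
    cpu.cache.add("line_A")
    cpu.registers["r0"] = 1


def fallthrough(cpu):
    cpu.cache.add("line_B")
    cpu.registers["r0"] = 2


cpu = ToyCPU()
correct = cpu.run_branch(True, False, taken, fallthrough)
print(correct, cpu.registers, sorted(cpu.cache))
```

In this run the prediction is wrong, so `r0` ends up holding the value from the correct path, yet `line_A` from the discarded path is still in the toy cache. That leftover footprint is what the variants described in this article measure.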
Variant #1: Branch Target Injection (CVE-2017-5715, a/k/a Spectre)
This variation of the CPU flaw is a new play on an older vulnerability that focuses on branch prediction.
“Branch prediction predicts the branch target and enables the processor to begin executing instructions long before the branch true execution path is known.” — Intel
(Note that Intel documentation changes over time and quotes may therefore not be as accurate when Intel’s version changes.)
Previously discovered vulnerabilities have proven that it is possible for code running in one security context to influence the branch prediction of code running in a completely different security context. This influence, though possible, only went one way: from the kernel to userspace. For example, from a hypervisor to a guest user.
The novelty in CVE-2017-5715, or branch target injection, is that now, the branch prediction interference can be generated in both directions, meaning that the kernel’s predictions can be affected by an attacker who only has userspace privileges, allowing the attacker insight into data they would not otherwise have.
Variant #2: Bounds Check Bypass (CVE-2017-5753, a/k/a Spectre)
The second variant of the flaw relies on the fact that the processor effectively performs out-of-bounds loads during the speculation phase. The attacker's goal here is to trick the CPU into exposing its eventual branch choice during the speculation window.
Under normal circumstances, the CPU may indeed read from code it is ‘not supposed’ to execute, but once the correct branch is selected, it rolls back the execution state and discards any effects those other branches would have had.
During a possible attack, the CPU may load an untrusted offset from a caller, start a load from a data-dependent offset, then load the corresponding cache line into the L1 cache. Unlike the input the CPU is expecting, the attacker’s offset is larger, which would be out of bounds for that read. That will cause the execution to return to a non-speculative path, at which point the attacker could measure the time required to load data for the different paths and determine which way the CPU went, and thus the index of the data.
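A toy simulation can make the sequence above more concrete. The sketch below is not an exploit and does no real timing: it models the cache as a Python set and treats "this line is in the set" as a stand-in for "this load was fast". The names `PUBLIC`, `SECRET`, `LINE_STRIDE`, `victim` and `recover_byte` are all hypothetical and chosen for illustration only.

```python
PUBLIC = bytes([1, 2, 3, 4])   # array the victim is allowed to read
SECRET = b"K"                  # toy secret stored right after PUBLIC
MEMORY = PUBLIC + SECRET
LINE_STRIDE = 64               # assumed spacing between probe cache lines


def victim(index, cache, speculate):
    """Bounds-checked read; with speculate=True the check resolves late."""
    in_bounds = index < len(PUBLIC)
    if in_bounds or speculate:
        value = MEMORY[index]            # load runs, possibly out of bounds
        cache.add(value * LINE_STRIDE)   # data-dependent load touches one line
    # The architectural result still honors the bounds check.
    return MEMORY[index] if in_bounds else None


def recover_byte(index):
    cache = set()                        # start from a flushed toy cache
    victim(index, cache, speculate=True)
    # Probe every candidate line; a cached line stands in for a fast load.
    hits = [b for b in range(256) if b * LINE_STRIDE in cache]
    return hits[0] if len(hits) == 1 else None


leaked = recover_byte(len(PUBLIC))
print(chr(leaked))
```

The victim never returns the out-of-bounds value, yet the attacker recovers it from which probe line was touched. On real hardware the probe step relies on measured access latency rather than set membership.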
Variant #3: Rogue Data Cache Load (CVE-2017-5754, a/k/a Meltdown)
The third variation of the CPU flaw at hand aims to read kernel memory from userspace without any influence/misdirection of code running in kernel space.
The technical details of this variant were blogged in July 2017 by Anders Fogh, a computer engineering expert and malware analyst.
In summary, during the speculation window, the CPU checks the permissions for accessing a memory address, but that check can cost performance. To avoid that cost, the CPU may check the permissions later, asynchronously, raising an exception flag only if the check fails.
Since it is also possible to execute an instruction behind a high-latency, mis-predicted branch to avoid taking a page fault, the speculation window can be widened by increasing the delay between the read from a kernel address and delivery of the associated exception. As a result, an attacker in userspace could read from memory in kernel space without the usual checks that should be in place to prevent such access. The exception is not raised until the illegal instructions retire, which under speculative execution they do not.
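The deferred permission check can be sketched with another toy model. Again, this is an illustration under stated assumptions rather than working attack code: `KERNEL_MEMORY`, its address and value, `PageFault`, `transient_read` and `attack` are hypothetical, and the cache is modeled as a plain set.

```python
class PageFault(Exception):
    """Stands in for the exception delivered when the permission check fails."""


KERNEL_MEMORY = {0xFFFF0000: 0x2A}   # hypothetical kernel address and value


def transient_read(address, probe_cache, user_mode=True):
    value = KERNEL_MEMORY[address]   # data is forwarded before the check completes
    probe_cache.add(value)           # dependent instruction leaves a cache footprint
    if user_mode:
        # The check completes late; the architectural result is discarded.
        raise PageFault(hex(address))
    return value


def attack(address):
    probe_cache = set()
    try:
        transient_read(address, probe_cache)
    except PageFault:
        pass                         # the attacker handles or suppresses the fault
    # The value was never returned, but the footprint reveals it.
    return next(iter(probe_cache))


print(hex(attack(0xFFFF0000)))
```

The read never returns a value to the caller because the fault is raised first. What leaks is the side effect left behind before the fault was delivered.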
How to Mitigate Risks Linked With This Flaw?
This new triple-pronged flaw requires a risk assessment process for all organizations. Security teams will have to inventory their assets and determine which ones may be vulnerable. Then, after setting criticality and sensitivity scores, assets should be patched or protected with mitigating controls.
An attacker must be able to place code into an application running on the system itself, or on a virtual machine attached to the system, to exploit this vulnerability. Therefore, protections that prevent unauthorized access into systems from outside the infrastructure can serve as a first barrier, as can existing access controls for internal users.
The most immediate action security teams can take to protect assets is to prevent execution of unauthorized software, or access of untrusted websites, on any system that handles sensitive data, including adjacent virtual machines. Assume that any type of execution, including binary execution, carries the potential for attack.
Also, ensure security policies are in place to prevent unauthorized access to systems and the introduction of unapproved software or software updates.
If the organization is operating environments where preventing execution of unauthorized software is not possible, or is inconsistent, protection may only be possible by applying updates to system firmware, operating systems, and application code, as well as leveraging system-level protections to prevent the execution of unauthorized code.
Where updates cause impact issues, mitigating controls should be applied in the interim, but patching is ultimately the remediation needed to prevent potential attacks. Please note that most patches released so far require rebooting systems, so the potential impact of a reboot on a given asset must be evaluated.
Assess Risk and Take Action
At this time, the asset types for which high remediation priority should be defined are mission-critical, multi-tenant environments, such as:
- XaaS – cloud, virtualizations, and externally facing services.
- Appliances.
- Supply chain with automated update capabilities.
- Application servers running applications with file upload capabilities, or acting as jump hosts.
- Application development environments.
- User devices of all types.
Remediation priority can be set to moderate for mission-critical single-tenant environments, as well as non-mission-critical multi-tenant environments, such as:
- Internal clouds that do not face externally.
- Privileged users, users with access to private/sensitive data, developers.
- Infrastructure with limited access, such as databases and application servers without file upload capabilities.
Remediation priority can be set to low in cases of non-mission-critical, single-tenant environments, such as:
- Shared infrastructure with limited access.
- Network and security appliances.
- Internal appliances.
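The three priority tiers above reduce to two questions per asset: is it mission-critical, and is it multi-tenant? A minimal sketch of that triage follows. The `Asset` type, the asset names and the `triage` helper are hypothetical, and a real inventory would carry more attributes (such as the criticality and sensitivity scores mentioned earlier).

```python
from typing import NamedTuple


class Asset(NamedTuple):
    name: str
    mission_critical: bool
    multi_tenant: bool


def remediation_priority(asset: Asset) -> str:
    # High: mission-critical and multi-tenant.
    if asset.mission_critical and asset.multi_tenant:
        return "high"
    # Moderate: exactly one of the two conditions holds.
    if asset.mission_critical or asset.multi_tenant:
        return "moderate"
    # Low: non-mission-critical and single-tenant.
    return "low"


PRIORITY_ORDER = {"high": 0, "moderate": 1, "low": 2}


def triage(assets):
    """Return assets ordered from highest to lowest remediation priority."""
    return sorted(assets, key=lambda a: PRIORITY_ORDER[remediation_priority(a)])


inventory = [
    Asset("internal-appliance", mission_critical=False, multi_tenant=False),
    Asset("public-cloud-host", mission_critical=True, multi_tenant=True),
    Asset("internal-cloud", mission_critical=False, multi_tenant=True),
]

for asset in triage(inventory):
    print(asset.name, remediation_priority(asset))
```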
Potential for Performance Degradation
Updating assets, which in some cases includes updating the BIOS, may result in significant performance impact. The level of impact will depend on the specific processor used, the nature of the workload, and the remediation method selected by the manufacturer.
Security teams should assess the exposure and potential impact to their environments before proceeding with updates that could negatively impact performance.
To better understand the potential for performance impact to specific application environments, security teams are encouraged to seek information from their application vendors. In all cases, it is recommended to deploy required updates into a validation environment for testing before proceeding with general deployment plans.
Check for Patches Issued by Manufacturers
The relevant manufacturers are issuing remediation elements and patches to address this new CPU vulnerability. Security teams should check manufacturer resources to obtain patches and updates as they become available.
Security teams should plan to receive and test microcode patches as well as operating system patches for both the underlying host and guest systems.
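On Linux hosts and guests, one way to confirm whether patches have taken effect is to read the status the kernel reports under `/sys/devices/system/cpu/vulnerabilities`, an interface available on kernels that include these fixes. The sketch below assumes that interface is present; the helper names `read_status` and `needs_attention` are hypothetical, and other operating systems provide their own tooling.

```python
from pathlib import Path

# Kernel-provided status files (Linux only; absent on older kernels).
SYSFS_DIR = Path("/sys/devices/system/cpu/vulnerabilities")
ENTRIES = ("spectre_v1", "spectre_v2", "meltdown")


def read_status(base: Path = SYSFS_DIR) -> dict:
    """Return the kernel-reported status line for each entry that exists."""
    status = {}
    for name in ENTRIES:
        entry = base / name
        if entry.is_file():
            status[name] = entry.read_text().strip()
    return status


def needs_attention(status_line: str) -> bool:
    # The kernel reports lines such as "Not affected", "Vulnerable",
    # or "Mitigation: ..." for each issue.
    return not status_line.startswith(("Mitigation", "Not affected"))


for name, line in read_status().items():
    flag = "CHECK" if needs_attention(line) else "ok"
    print(f"{flag:5} {name}: {line}")
```

An empty result means the kernel does not expose the interface, which on Linux usually indicates a kernel that predates the fixes. It should not be read as a clean bill of health.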
Check Cloud Provider Remediation
Keeping in mind common cloud security models that share security responsibilities between the vendor and their customers, organizations with cloud deployments should check into the remediation coverage offered by their cloud service provider. Defining what each side must undertake to address remediation will result in better mitigation.
Using IBM BigFix to Remediate
IBM BigFix can be used to discover and inventory systems within enterprise environments and, as vendors and manufacturers release remediation elements and patches, can facilitate the patching and remediation process for this flaw, as it does for any other vulnerability.
BigFix administrators can create powerful custom patching tasks, procedures and policies, known as “Fixlets,” to automate the identification of vulnerable systems as well as deploy vendor-supplied patches and remediation elements, which can then be shared with the entire BigFix community.
Further Information from IBM
For additional information, please review the “Understanding the CPU Vulnerability” infographic. IBM customers are also invited to access the X-Force Collection.
Principal Consultant, X-Force Cyber Crisis Management, IBM