Security management can be proactive or reactive depending on each organization’s risk appetite. When attacks are made public, things change, and learning from threats becomes a requirement for both C-suite members and security leaders.
WannaCry, NotPetya and Industroyer are some of the most recently analyzed malware pieces. Apart from corporate networks, all three had an impact on the industrial domain of operational technology (OT) networks. Let’s take advantage of the security information released in the aftermath of such attacks and walk through some aspects of what we can learn from the incidents, as well as explore security recommendations related to these and similar cases.
Disruptive Attacks and Their Impact
All three aforementioned cyberattacks affected some of their victims’ OT environments and hindered industrial processes in one way or another. The direct impact in some of the incidents explored by IBM teams ranged from lost energy distribution to clients to lost production. Let’s go through these attacks, from the simple to the complex, from nuisance to actual danger.
The WannaCry ransomware attack emerged in May 2017 and initially seemed to focus on medical and corporate networks in hospitals within the U.K. On the night of the WannaCry outbreak, it spread to over 150 countries. Several industrial plants across Europe and Asia were hit and suffered varying amounts of downtime.
The infection vector was initially unknown, but soon enough it became evident that the malware was scanning IPs on the internet and spreading like wildfire to networks that had the Server Message Block (SMB) network protocol open to external connections. For spreading and system infection, the malware used a known vulnerability in the Microsoft Windows operating system and borrowed from a leaked trove of exploits stolen from the NSA. Remarkably, a patch had already been released in March 2017, but many companies did not update their systems in time.
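A simple way to see whether a host exposes SMB to a given network is to test whether TCP port 445 accepts connections. The following is a minimal sketch, not a scanner: the function name `smb_port_open` is an illustrative assumption, and a check like this should only be run against systems you are authorized to test.

```python
import socket

SMB_PORT = 445  # TCP port used by SMB over TCP/IP


def smb_port_open(host: str, port: int = SMB_PORT, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, filtered, unreachable or timed out: treat as not reachable.
        return False
```

Calling `smb_port_open("192.0.2.10")` (a placeholder documentation address) from outside the perimeter would indicate whether that host accepts external SMB connections. A reachable port does not mean the host is vulnerable, only that the service is exposed.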
The WannaCry ransomware disrupted operations by holding critical files hostage in exchange for payment. To collect payments, WannaCry has a sophisticated and anonymous payment process, offering a relatively low price for releasing hijacked data. Its sophisticated encryption and payment infrastructure mistakenly led experts to the conclusion that the attack was financially motivated. Alas, this was not the case: Investigators gradually came to the conclusion that WannaCry was meant to destroy and disrupt — and not just collect bitcoin.
At the end of June, the next worming disruptive malware attack appeared. It was dubbed NotPetya for its resemblance to previously known ransomware code.
NotPetya started by hitting several government agencies and companies in Ukraine before it spread across Europe and then the U.S. The malware was hidden in a software update of a popular tax and accounting software used in Ukraine, a strong indication it was a geographically targeted attack.
Critical infrastructure was the top target of the attacks. For example, the automated radiation sensor measuring and alert systems at the Chernobyl nuclear plant had to be switched to older technology for at least an entire day. It was also reported that some steel, oil and gas, and construction companies in Ukraine, Russia and Poland had been affected by the attack’s disruptive effects.
NotPetya not only used the same infection path through unpatched vulnerabilities as WannaCry, but it also contained some additional propagation methods to move rapidly through infected networks. When searching for local targets, it greatly reduced the amount of suspicious network traffic, making use of legitimate administration tools to capture Windows credentials and remotely execute the malware. This way, even patched systems could be infected and used for further propagation over the local network.
Once a system was infected, the malware worked like a destructive hard drive locker and intentionally rendered the system unusable by encrypting the hard drive’s Master Boot Record (MBR). The payment process was not robust, relying on a single public email address, and the encryption-decryption function within the malware did not work. As with WannaCry before it, NotPetya’s mission was data destruction rather than financially motivated cybercrime.
Industroyer, also known as CrashOverride, was a highly targeted attack that hit a single, 200-megawatt Ukrainian transmission substation in late December 2016. The attack caused the Kiev-based substation to go offline and prevented operational teams from controlling it for at least an hour, leading to a power outage in the Ukrainian capital that lasted several hours right before Christmas Eve.
Different parts of the malware have been investigated since, and two relevant reports were released by security vendors by mid-June 2017: “Win32/Industroyer: A New Threat for Industrial Control Systems” and “CrashOverride: Analysis of the Threat to Electric Grid Operations.”
This piece of malware is considered targeted, specific to ICS systems and more significant in its physical effects than WannaCry and NotPetya. Industroyer’s attack platform is modular, reusable and adaptable, which are strong signs of attack sophistication. Perpetrators targeted operational equipment and electrical grid operation systems in a vendor-neutral way.
Industroyer also represents the most targeted attack of all three examples, even though some experts think it was only a proof of concept. The attack’s motive appears to have been the gathering of information, and the ultimate goal was likely to stop an industrial process or destroy the equipment.
The case of the Industroyer malware showed that attackers are slowly gaining a foothold in targeted attacks on industrial systems and processes. The payload contained valid control commands and was able to interrupt substation operations. There are indications that the code has even more serious capabilities, from providing wrong values to the operation center (as Stuxnet did) to forcing the substation into a fail-safe mode to denial-of-service (DoS) on relays, which could impact safety measures. You should be scared now.
Overall, Industroyer targeted equipment and processes deep in the grid’s operational network. Unfortunately, the initial attack vector remains unknown to this day.
What Can We Learn From These Recent Disruptive Attacks?
WannaCry, even more than NotPetya, relied on unpatched vulnerabilities for a successful system attack. But while timely patching may seem to be the natural response to this threat, the proper security approach can be a lot less evident for OT networks.
Industrial equipment is typically vendor-certified to perform processing in near real time. Any untested change can have an unwanted impact on operational processing capabilities. In extreme cases, a well-meaning security measure can cause unacceptable risk to the operational process, and therefore some of the typical measures applicable in the IT world can be rather ineffective for OT.
To cover malware-based risk, it is important to look to a multilayered security approach, which involves:
- Covering all security phases from predictive to recovery measures;
- Covering security measures from processes to technology;
- Covering technology layers from network to endpoint protection; and
- Reaching into the OT network, from the supervisory control and data acquisition (SCADA) DMZ to at least the PLC/RTU level or even down to sensor/actuator level.
Even if an OT network is air gapped from the corporate network, people and files still cross that gap. Security needs to be strong around the air gap, and connections here should constantly be monitored and analyzed. The mantra should be “trust but verify.”
With Industroyer we saw a high level of robustness and efficiency of the attack method. And low-level controllers — far below SCADA systems — were the target. We also saw that attackers are getting more familiar with industrial network specifics. They used industrial protocols for attacking with Industroyer: IEC 60870-5-101, IEC 60870-5-104, IEC 61850 and OLE for Process Control Data Access (OPC DA).
Industroyer first gathered information on specific grid operations, including industrial control system (ICS) assets, their communication and configuration, and related files and variables at the remote terminal unit (RTU) level. Once nested in as a system process, the malware passed valid control commands to connected circuit breakers and switches. These components were used to disconnect substations from the transmission grid or distribution lines. Once the substations are disconnected, the lights go out.
Even if security controls were in place, they would only cover the upper OT network, such as SCADA systems and maybe remote access systems. Once in the deeper trust chain, these commands would be processed without further validation, which could then lead to a process outage in a best-case scenario.
What security research indicated is that Industroyer’s modular platform leveraged industrial communication protocols used in critical infrastructure. Another worrying fact is that the malware’s modularity allows it to add functionality and extend to other industrial networks.
The electrical grid’s operational process is relatively robust. Substations can be run in a manual operation mode if there are physically enough engineers to be on-site in parallel. A good analogy can be found within the aviation industry: Autopilot in an airplane is disabled from time to time to run all procedures manually. Pilots train regularly to be able to fly manually at any time. Would your organization’s engineers be able to run the OT process manually at any time? When was that last tested? What did your organization learn?
Overall, the three threats described speak to the increased sophistication of attacks on critical infrastructure. They also prove that if forensic investigation capabilities are lacking, identifying the attack’s entry path and emergence time can be hindered, which may prevent further security investigation. Increasing threat evolution is forcing companies to continuously re-evaluate and improve the processes and procedures that integrate technology into security.
Security Recommendations and Countermeasures
When it comes to cyberattacks, actors always have a certain advantage: They only need to find one way in while security has to find and close all the ways in. How do we deal with this unbalanced situation?
Good security practices already provide a robust and effective approach to handling the attacks mentioned in this writing, at least to some degree. In relation to the point in time an attack is handled, security practices distinguish between predictive, preventive, detective and corrective controls. If it is too late for any of these, one may be left with the last and most costly stage of reaction: recovery.
There is no shortage of security solutions suited to particular phases or stages of the security cycle. Some cover more than one stage, and some clever and effective solutions integrate and automate several of these stages.
Having security policies and processes in place is what makes technology effective and sustainable, because security technology alone will wear out without properly monitored underlying processes. This is nowhere more apparent and critical than in the practical application of IT security tools within an OT environment. The need to balance the OT perspective when deploying, or most often modifying, security practices requires close cooperation between classic IT skills and operational environment stakeholders. Key for an effective security defense is the robustness and coverage of all these areas.
There is no silver bullet. It’s neither efficient nor effective to have a super-duper preventive measure in place but then fail with detective measures. Even if one has all that set up, it can still be insufficient without backup and recovery in place. And if those backups have never been actually tested along with the process and the technology, issues will probably surface at the worst possible moment.
Protecting the OT Network
When it comes to OT environments, there are rather specific security requirements that come to mind. Any active security measure can be considered a significant risk to the operational process’ availability or its timeliness, and when it comes to security patching, additional challenges are piled on for OT systems.
Because of real-time operation requirements, a patch has to be checked and approved by the component’s vendor first to make sure it will not have problematic side effects on the processing capabilities of the component. Attempting to update an industrial component without vendor collaboration often leads to loss of vendor reliability, vendor support and system certification.
But OT environments can be better protected. Here’s one approach to reducing the risk:
- Due to the process disruption potential, often well-established IT security practices can’t be implemented in OT environments. However, compensating measures can be used to mitigate the risk into the organization’s risk appetite levels. This means that if some controls do not apply, different security controls should be considered.
- Since modern OT environments leave us with fewer security options to choose from, the ones we can use must be as effective as possible and attain their predefined objectives with consistency.
Mitigating the Impact
Let’s review how the impact of attacks like WannaCry, NotPetya and Industroyer can be mitigated. If no specific attack is mentioned, the recommendation is valid for all three types.
Based on the industrial security standards ISA-99 and IEC 62443, communication should be reduced to the least privilege level. Network-level segmentation is probably the first measure that should be established. Even better would be the parallel coverage in a security policy for IT and OT, aligned with each other and regularly reviewed.
Apart from segmentation, establishing security classification and baselines for OT should be considered. Only systems within a specified security level would be allowed to communicate; traffic beyond that level would be closely monitored. Additionally, taking inventory of all IT/OT assets, their owners, vendors, firmware level, locations, configuration backup dates and storage is important for proper risk analysis.
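The inventory and classification idea above can be sketched as a small data structure. This is an illustration under stated assumptions: the `Asset` fields mirror the attributes listed in the text, while the numeric `security_level` scale, the function names and the 90-day backup threshold are hypothetical choices, not values taken from ISA-99 or IEC 62443.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Asset:
    name: str
    owner: str
    vendor: str
    firmware: str
    location: str
    last_config_backup: date
    security_level: int  # hypothetical scale; each level is one zone


def classify_flow(src: Asset, dst: Asset) -> str:
    """Allow traffic inside one security level; mark anything else for monitoring."""
    return "allow" if src.security_level == dst.security_level else "monitor"


def stale_backups(assets, today: date, max_age_days: int = 90):
    """Return names of assets whose configuration backup is older than max_age_days."""
    return [a.name for a in assets
            if (today - a.last_config_backup).days > max_age_days]
```

With an inventory like this, traffic between a PLC and an engineering workstation on the same level would be classified as `allow`, while traffic from an office laptop on a different level would be classified as `monitor`. The same records can also answer risk questions such as which components lack a recent configuration backup.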
Some of the preventive measures for limiting the impact of worming attacks like WannaCry and NotPetya would have been system patching upon patch availability. For IT networks, relevant patches can be distributed from a central platform, but depending on the list of assets, this approach may only apply to parts of the OT network.
For OT components, IBM has licensed endpoint protection technology to our partner Verve Industrial, offering patch management for OT systems in a nondisruptive and agentless way. This protection reaches down to the RTU/PLC level and supports a large number of vendors. Both BigFix for IT and Verve for OT integrate with IBM’s QRadar SIEM, and in combination the two may be able to help organizations:
- Detect endpoints to identify assets in scope.
- Patch vulnerabilities.
- Detect and disable vulnerable services.
- Identify vulnerable systems and help take preventive actions based on the actual threat landscape.
- Reduce permissions for privileged domain accounts and service accounts, and remove administrator rights from standard user accounts on workstations.
- Change standard passwords of industrial components.
Detection is usually managed with a central SIEM system that collects and analyses various system logs, correlates the data and raises alarms based on (pre)defined rules. Some point solutions or network flow analysis can also help with detection.
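To make the idea of a predefined correlation rule concrete, here is a minimal sketch of one rule that flags worm-like SMB fan-out. It is not how any particular SIEM product implements rules. The event fields (`time`, `src`, `dst`, `dst_port`), the function name and the thresholds are assumptions chosen for the example.

```python
from collections import defaultdict
from datetime import timedelta


def smb_fanout_alerts(events, max_targets=3, window=timedelta(minutes=1)):
    """Return sources that contacted more than max_targets distinct hosts
    on TCP 445 within any time window of the given length."""
    by_source = defaultdict(list)
    for ev in events:
        if ev["dst_port"] == 445:
            by_source[ev["src"]].append((ev["time"], ev["dst"]))

    alerts = set()
    for src, hits in by_source.items():
        hits.sort()  # order by timestamp
        start = 0
        for end in range(len(hits)):
            # Advance the window start until the span fits inside `window`.
            while hits[end][0] - hits[start][0] > window:
                start += 1
            targets = {dst for _, dst in hits[start:end + 1]}
            if len(targets) > max_targets:
                alerts.add(src)
                break
    return sorted(alerts)
```

A rule of this shape is noisy on its own, since file servers and backup systems also talk to many hosts over SMB. In practice it would be tuned with allowlists and combined with other signals before raising an alarm.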
BigFix can help identify advanced, evasive threats, including new, unknown and zero-day threats, by identifying anomalous process behavior that is typically indicative of evasive and malicious activity for IT. Verve does something similar for OT equipment. This sort of endpoint protection combo could have helped identify WannaCry based on actions the malware performed to prevent recovery, which are highly indicative of ransomware activity.
NotPetya performs suspicious behavior as it moves laterally through the network, exploits vulnerabilities, tries to obtain passwords, and communicates with untrusted internet sources. This may raise an immediate alert in an out-of-the-box SIEM system.
Industroyer would likely have raised alerts on several communication attempts as well:
- The installed proxy waiting for requests may have been detected.
- Any test of the proxy server with communication to an external IP would have been suspicious.
- Local communication between the malware backdoor module and an unknown internal proxy may have raised an alert.
- Malware command exchange between the internal proxy and external command-and-control instance may have been detected. Communication with a TOR node may have raised an alert in an out-of-the-box SIEM system using data feeds.
- Verve customers are able to configure and forward whitelisting and change management alerts to QRadar SIEM, thus extending the detective capabilities far into the OT realm.
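The external communication checks in the list above can be illustrated with a short sketch that labels network flows by destination. It covers only outbound traffic, not the internal proxy case, and everything in it is an assumption for illustration: the function name, the verdict strings and the idea of passing the threat feed in as a plain set of addresses.

```python
import ipaddress


def review_flows(flows, internal_nets, allowed_external, threat_feed):
    """Label each (src, dst) flow by where its destination falls."""
    nets = [ipaddress.ip_network(n) for n in internal_nets]
    findings = []
    for src, dst in flows:
        dst_ip = ipaddress.ip_address(dst)
        if any(dst_ip in net for net in nets):
            verdict = "internal"
        elif dst in threat_feed:
            # For example, a known TOR node supplied by a data feed.
            verdict = "alert: destination on threat feed"
        elif dst in allowed_external:
            verdict = "allowed"
        else:
            verdict = "alert: unknown external destination"
        findings.append((src, dst, verdict))
    return findings
```

In an OT network the list of legitimate external destinations is usually short, so treating every unknown external address as an alert is a workable default. The sketch checks the threat feed before the allowlist so that a listed destination is never silently allowed.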
At some of the attack stages of NotPetya and Industroyer, when files were moved, stored or overwritten, a File Integrity Monitoring (FIM) solution may have also raised a red flag.
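The core of file integrity monitoring is comparing current file hashes against a trusted baseline. The sketch below shows that comparison with SHA-256. The function names are illustrative, and a real FIM product would add protected baseline storage, scheduling and alerting on top of this.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hash a file in chunks so large files do not need to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def snapshot(root: Path) -> dict:
    """Map each file under root (by relative path) to its SHA-256 hash."""
    return {str(p.relative_to(root)): sha256_of(p)
            for p in sorted(root.rglob("*")) if p.is_file()}


def compare(baseline: dict, current: dict) -> dict:
    """Report files added, removed or modified since the baseline."""
    return {
        "added": sorted(set(current) - set(baseline)),
        "removed": sorted(set(baseline) - set(current)),
        "modified": sorted(k for k in baseline.keys() & current.keys()
                           if baseline[k] != current[k]),
    }
```

A baseline taken with `snapshot` on a known-good system and stored offline can later be passed to `compare` together with a fresh snapshot. Any entry under `added` or `modified` in a directory that should be static is worth investigating.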
A virtual patch appliance is capable of helping detect and potentially prevent attacks that target known vulnerabilities. WannaCry and NotPetya infection and network access attempts may have been identified by this type of solution. Integrating with a SIEM could result in an alert, combined with the option to stop and help prevent any further access attempts.
IBM’s and Verve Industrial’s endpoint solution can help detect service changes and actively respond to such changes with a service restart. While BigFix does this in near real time, Verve has proven over the years to be operational in real time.
Sophisticated and targeted attacks compromising trust relations might not be prevented, but fast detection, forensic investigation and response can help reduce the cost of a breach significantly. At some point in the security journey, an investment in incident response is the most valuable. It helps with:
- Investigating and identifying an attack;
- Identifying the date of the compromise, which is important for selecting the right backup for recovery; and
- Identifying the attack’s path and potentially compromised systems, which allows for containment, mitigation and further investigation.
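The link between the compromise date and backup selection can be made concrete with a short sketch. It assumes the investigation has produced a compromise timestamp and that backup timestamps are available as `datetime` values. The function name is hypothetical.

```python
from datetime import datetime


def last_clean_backup(backup_times, compromise_time):
    """Return the most recent backup taken strictly before the compromise,
    or None if every available backup postdates it."""
    clean = [t for t in backup_times if t < compromise_time]
    return max(clean) if clean else None
```

If the function returns `None`, no available backup predates the compromise, which is itself an important finding for the recovery plan. The result is only as reliable as the compromise date, which is why forensic capability matters.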
QRadar Network Insights (QNI) can load YARA detection rules that have been released for all three attacks. QNI tightly integrates with QRadar Incident Forensics, which helps enable a security analyst to perform post-incident investigation and threat hunting activities on historical data. Incident forensics may be your first, last and only option to review and identify a breach.
Endpoint solutions support an array of responsive measures, from device and file quarantine (WannaCry, NotPetya), to registry fix (NotPetya, Industroyer) and process kill (Industroyer).
Response plans should be in place and regularly tested to train responders, reduce downtime and ensure continuous improvement in responding to incidents.
Make sure logs, configurations, firmware and data are securely backed up. The backup should be archived, and designated staff must be trained in the recovery process, which has to be tested at regular intervals.
The following graphic summarizes the stages of the security cycle for protecting infrastructures from disruptive threats like those mentioned above.
For WannaCry and NotPetya, there are limited options to prevent an infection and the internal spread in OT environments. But all communication attempts, even if minimal, could have potentially been detected and raised an incident alert, especially in cases where a SIEM was in place and continually monitored.
For Industroyer, the infection path may not be entirely clear, but a number of the steps it took to plant the malware on the targeted substation’s systems could have been identified. This detection would not have been preventive in nature, but it could have happened early enough in the attack kill chain to help prioritize the telling signs and respond more quickly.
Sophisticated attacks, in general, might not all be preventable, but we must keep in mind that several aspects can raise an alert and can thus be handled on time if the organization has some standard security measures in place.