The year 2020 — with all its tumult — ushered in a massive shift in the way most companies work. Much of that transformation included migrating to cloud, with some statisticians reporting that a full 50% of companies across the globe are now using cloud technology.

In many ways, that’s good: cloud holds several advantages for organizations, not the least of which is freeing up resources (both financial and human) for deployment in other areas. Cloud has made life easier and more convenient for consumers, too, which for businesses often translates into increased customer satisfaction.

But with benefit often comes risk, and cloud is not immune. In fact, Gartner reports that “Through 2025, 99% of cloud security failures will be the customer’s fault.” That sobering statistic reinforces the need for not only a robust proactive security strategy but also a reactive incident response (IR) plan. Without offensive and defensive approaches working in tandem, an incident can significantly damage brand reputation and finances and increase legal exposure and obligations.

Think back a few short years to when a major financial institution experienced a cloud breach. The incident exposed 80,000 bank account numbers and over one million government ID numbers. Like many companies, it had moved quickly to the cloud, and while speed may have had little to do with the actual breach, the incident nonetheless drives home the complexity of mitigating an incident in the cloud, and of doing it in a way that can save time, money and reputation.

Let’s peel back the layers of cloud itself, and then look at how incident response teams must behave differently to thwart breaches in a more complex cloud environment. For IR teams, this means adding more stakeholders to the plan: security architects, cloud security engineers, third parties and, most importantly, your cloud service providers.

Shared Responsibility Model for Cloud Environments

In a traditional incident response framework, the organization is responsible for its systems and data. However, in cloud environments, the cloud service customer (CSC) is not the owner of all systems, and depending on the adopted service model, the cloud service provider (CSP) manages specific areas of the environment.

It’s important to understand the shared responsibility model that’s employed between the CSC and CSP. It requires CSCs to work together with the vendor toward achieving security objectives. Customers need to build and align their IR processes and procedures based on this model. During an incident, it’s critical to understand those responsibilities and know when and how to engage with a CSP. Depending on the organization’s capabilities, specialized third parties may also be needed to support investigation and remediation activities.

Shared responsibilities differ between the different types of cloud services:

  • Software as a Service (SaaS)
  • Platform as a Service (PaaS)
  • Infrastructure as a Service (IaaS)

Based on these services, the different responsibilities between CSC and CSP are shown in Figure 1.

Figure 1: Cloud Incident Response (CIR) Framework (Source: Cloud Security Alliance)

Cloud incidents can also occur within the different layers of the cloud environment, also referred to as domains, which typically fall within the CSC responsibilities. It’s important to understand and prepare for these, as the risk and impact, as well as the response activities, can differ greatly. The following domains should be considered:

  • Service Domain: Incidents in the service domain can potentially put the entire cloud environment at risk and may affect cloud accounts, IAM permissions and more. Investigations are typically limited to service and authentication logs.
  • Infrastructure Domain: Incidents in the infrastructure domain may affect networks, services, VMs or instances (including containers) and more. The response to infrastructure domain incidents often involves retrieval, restoration or acquisition of incident-related data for forensics. However, not all services are hosted on underlying operating systems that can be analyzed, which may shift the analysis to whatever logs are available.
  • Application Domain: Incidents in the application domain occur in the application code or software deployed within cloud services. Depending on the level of compromise, the response may resemble that for the infrastructure domain. With appropriate and thoughtful planning, a response to an application domain incident could also be managed with cloud tools, using automated forensics, recovery and deployment.
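
The three domains above can be captured as a small lookup that a triage runbook might consult. The following is a minimal sketch in Python, not part of the framework itself: the `Domain` enum, the `RESPONSE_HINTS` table and the `first_evidence` helper are hypothetical names, and the entries only restate the descriptions above.

```python
from enum import Enum


class Domain(Enum):
    SERVICE = "service"
    INFRASTRUCTURE = "infrastructure"
    APPLICATION = "application"


# Hypothetical lookup table; the entries paraphrase the domain descriptions above.
RESPONSE_HINTS = {
    Domain.SERVICE: {
        "may_affect": ["cloud accounts", "IAM permissions"],
        "evidence": ["service logs", "authentication logs"],
    },
    Domain.INFRASTRUCTURE: {
        "may_affect": ["networks", "services", "VMs or instances", "containers"],
        "evidence": ["acquired incident-related data", "logs, where no OS can be analyzed"],
    },
    Domain.APPLICATION: {
        "may_affect": ["application code", "software deployed within cloud services"],
        "evidence": ["infrastructure-style acquisition", "automated cloud forensic tooling"],
    },
}


def first_evidence(domain: Domain) -> list[str]:
    """Return the evidence sources a responder would likely check first."""
    return list(RESPONSE_HINTS[domain]["evidence"])


print(first_evidence(Domain.SERVICE))  # ['service logs', 'authentication logs']
```

Keeping the mapping in data rather than in prose makes it easier to review alongside the IR plan and to extend as the environment changes.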

It’s recommended to use a risk management framework to establish policies and control objectives to manage the risk of the different cloud domains.

Plan and Prepare for Incidents in Cloud Environments

To effectively prepare for incidents in cloud environments, organizations should consider starting with the low-hanging fruit and extending their existing IR capabilities.

Done properly, the cloud can allow for fast and effective incident response. Resources can be redeployed, account permissions changed and data shared through rules rather than by physically moving bits. However, a fruitful cloud incident response strategy requires several considerations.


During incident response engagements, we often see that cloud environments are a blind spot for incident response teams. Their traditional responsibilities include responding to incidents in on-prem environments using traditional tools and processes. Cloud environments are commonly the domain of cloud architects, engineers and application developers. It’s essential that the IR team is familiar with the cloud domain and its terminology, and that it receives training to gain the required skills.

Furthermore, cloud accounts with pre-defined roles should be available to the IR team. For example, a responder role can be established that grants access, during incidents, to a responder-only environment where artifacts and evidence can be moved, stored and analyzed. The level of access needs to allow IR stakeholders to perform their tasks and should be tested regularly.
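
One way to test a responder role regularly is to compare what it grants against what each response task needs. The sketch below assumes a simplified model in which permissions are plain strings; the task names, the permission names and the `missing_permissions` helper are all hypothetical, and real permission names depend on the provider.

```python
# Hypothetical task and permission names; real ones depend on the provider.
REQUIRED_FOR_TASK = {
    "acquire_snapshot": {"snapshot:create", "snapshot:read"},
    "read_logs": {"logs:read"},
    "store_evidence": {"evidence_store:write"},
}


def missing_permissions(granted: set[str]) -> dict[str, set[str]]:
    """Map each responder task to the permissions the role still lacks."""
    gaps = {}
    for task, needed in REQUIRED_FOR_TASK.items():
        lacking = needed - granted
        if lacking:
            gaps[task] = lacking
    return gaps


# Example role that cannot yet create snapshots.
responder_role = {"logs:read", "snapshot:read", "evidence_store:write"}
print(missing_permissions(responder_role))  # {'acquire_snapshot': {'snapshot:create'}}
```

An empty result means the role covers every listed task. Running a check like this on a schedule would surface access gaps before an incident rather than during one.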

Continuous tests and exercises are also an important part of ensuring that people are trained and that processes hold up during real incidents. Depending on the objectives, this can range from small exercises for testing specific processes such as data acquisition to larger tabletop exercises that bring together the extended incident response and crisis management teams. The latter are often helpful for creating awareness and identifying potential gaps outside of the security organization. Furthermore, technical simulation exercises can provide a structured opportunity for testing detection and monitoring tools as well as response processes.


The commonly accepted incident response lifecycle described by NIST (preparation; detection and analysis; containment, eradication and recovery; and post-mortem and incident management) still holds true for cloud environments. However, certain processes differ when responding to cloud-based incidents and require adequate documentation and testing.

Update policies and communication plans to include main resources and stakeholders from your existing cloud environments. Furthermore, create a process for employees as well as third parties to effectively report incidents.

For a coordinated tactical response, traditional IR plans need to account for additional roles such as cloud architects and engineers as well as CSPs. In addition to the shared responsibilities, it’s also important to understand and document service level agreements with third parties to ensure expectations can be met. Cloud-based assets, and the data stored and processed within them, also need to be documented in order to establish adequate incident escalation and notification requirements.

Revise your process for following chain of custody and include cloud-based artifacts. This includes establishing processes, identifying resources, and assigning roles and responsibilities. Naturally, this process should be tested ahead of time to avoid mistakes and delays during real events. Mistakes can have severe consequences such as rendering evidence inadmissible in court in the event of a prosecution.
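
A chain of custody for cloud artifacts usually rests on two things: a cryptographic hash taken at acquisition and a log of who handled the artifact and when. The following minimal sketch illustrates that idea with Python’s standard `hashlib` module. The `CustodyRecord` class, the `acquire` and `verify` helpers and the artifact ID are hypothetical, and a real process would also need tamper-resistant storage for the records themselves.

```python
import hashlib
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class CustodyRecord:
    artifact_id: str
    sha256: str  # hash taken at acquisition time
    events: list = field(default_factory=list)

    def log(self, actor: str, action: str) -> None:
        """Append a timestamped (UTC) entry describing who did what."""
        self.events.append((datetime.now(timezone.utc).isoformat(), actor, action))


def sha256_of(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def acquire(artifact_id: str, data: bytes, actor: str) -> CustodyRecord:
    """Hash the artifact and open its custody record."""
    record = CustodyRecord(artifact_id, sha256_of(data))
    record.log(actor, "acquired")
    return record


def verify(record: CustodyRecord, data: bytes, actor: str) -> bool:
    """Re-hash the artifact, log the check and report whether it is unchanged."""
    intact = sha256_of(data) == record.sha256
    record.log(actor, "verified" if intact else "hash mismatch")
    return intact


evidence = b"example disk snapshot bytes"  # stand-in for real artifact contents
rec = acquire("snapshot-001", evidence, actor="responder-a")
print(verify(rec, evidence, actor="analyst-b"))         # True
print(verify(rec, evidence + b"x", actor="analyst-b"))  # False
```

Every hand-off adds an entry to `events`, so the record shows both that the artifact is unchanged and who touched it along the way.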


Forensic tools and workstations that can be spun up to analyze systems and applications in cloud environments should be prepared and tested ahead of time. Best practice is to “launch resources near events,” which means performing tasks such as isolation, data acquisition and analysis in the cloud. This allows for easier and faster data transfer, image creation and resource sharing than offline analysis does. However, there should also be alternative options in case a security event impacts the integrity of the cloud environment. Furthermore, forensic tools must be kept up to date and remain compatible with changing cloud environments.

Visibility is key, and incident responders rely on the logs and artifacts that are available to them. CSPs offer a variety of tools and features for monitoring and incident detection. Some organizations keep their logs in the cloud; others ship them to a centralized logging solution. It’s critical that incident responders have timely access to those resources. Furthermore, it’s important to understand the type of data that is available, where it is stored and for how long. Knowing this also helps to identify and close potential visibility gaps.
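
Knowing what is logged, where and for how long makes it possible to check ahead of time whether an investigation could reach back far enough. The sketch below assumes a hand-maintained inventory of log sources; the `LogSource` class, the `visibility_gaps` helper, the source names and the retention periods are all hypothetical examples.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class LogSource:
    name: str
    location: str        # where the logs are stored
    retention_days: int  # how long they are kept


def visibility_gaps(sources, incident_start: date, today: date) -> list[str]:
    """Return the sources whose retention no longer reaches back to incident_start."""
    gaps = []
    for src in sources:
        oldest_available = today - timedelta(days=src.retention_days)
        if oldest_available > incident_start:
            gaps.append(src.name)
    return gaps


# Hypothetical inventory for illustration only.
sources = [
    LogSource("authentication logs", "central logging", 365),
    LogSource("network flow logs", "cloud account", 30),
]
today = date(2024, 6, 30)
print(visibility_gaps(sources, incident_start=date(2024, 4, 1), today=today))
# ['network flow logs']
```

In this example, an intrusion that began three months earlier could still be traced in the authentication logs but not in the flow logs, which is the kind of gap worth finding during preparation.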

While the IR team may be skilled in traditional forensics of operating systems and network environments, additional considerations apply when investigating cloud-based assets. Serverless computing, the complexity of modern web applications and cloud provider-specific features may require adjustments to the existing incident response toolset. Furthermore, tools and features evolve constantly, so it can be crucial to take advantage of optimization and automation opportunities as they arise to expedite and improve the incident response process.

Get Your Incident Response Teams Some Cloud Cover

IBM Security X-Force Incident Response Services for Cloud bring together IR and remediation services, breach response training and threat intelligence — all designed to work in harmony to help you prepare for or respond to a cloud-based cyber event.

You can also access proactive cloud services such as maturity assessments, development of IR plans and playbooks, and training simulation exercises.
