There’s some confusion in the security industry between the terms “red teaming” and “penetration testing.” This blog post will discuss how the confusion may have come to be and provide a breakdown of some of the differences between the two exercises. It will also discuss which business challenges align with each exercise so readers know which one could better fit their use case.
At X-Force Red, IBM Security’s team of veteran hackers, we have the advantage of having one of the world’s largest offensive security teams — which means we also have our own opinions on industry trends and terminology, which can sometimes lead to lively, impassioned debates in our messaging channels.
While the industry may not reach a consensus on common security testing terminology anytime soon, we can at least define how we use these terms at X-Force Red, especially when it comes to performing adversary simulation, penetration testing and red teaming exercises.
One common way to describe the relationship between network penetration testing and red teaming is to think of network penetration testing as a boxer hitting a punching bag, whereas red teaming is more akin to a boxer working with a sparring partner who will counter-punch.
Before we get to that, however, let’s explore some of the reasons behind this terminology challenge and where the industry is starting to see some positive steps in the right direction.
Red Teaming Vesus Pen Testing
The term and idea of red teaming became popular in the military and intelligence communities in the early 2000s and more recently gained ground in the cybersecurity world. Part of the confusion between pen testing and red teaming is due to people often using the word “red” or term “red team” to describe anyone that executes offensive security testing, and “blue” or “blue team” to describe anyone working on the defensive side. Thus, “red” activity could be seen as any type of offensive security testing.
So, what is red teaming in the context of cybersecurity, and how does it differ from other types of testing provided by offensive security specialists?
Pen Testing
Let’s begin with penetration testing. This type of manual testing is typically conducted independently of a vulnerability assessment and used to help test the effectiveness of an organization’s vulnerability management program and associated controls within a defined scope. Pen tests are used to test whether certain networks, assets, platforms, hardware or applications are vulnerable to an attacker. Penetration tests are not focused on stealth, evasion, or the ability of the blue team to detect and respond, since the blue team is fully aware of the scope of the testing being conducted.
Red Teaming
Similar to scenario-based penetration tests, red team engagements are designed to achieve specific goals, such as gaining access to a sensitive server or business-critical application.
Red teaming projects differ in that they are heavily focused on emulating an advanced threat actor using stealth, subverting established defensive controls and identifying gaps in the organization’s defensive strategy.
The value of this type of engagement can be derived from a better understanding of how an organization detects and responds to real-world attacks.
Red teaming is typically carried out without a company’s blue team knowing in advance that it is being conducted. If, during an engagement, a target company detects a red team’s malicious activity, it responds as if it were a real attack. For example, if a red team’s activity is detected on a compromised system that’s being used to access the target’s internal network, the blue team will likely respond and remove that access, pushing the red team back a step in its progression.
At the end of a red team engagement, the blue team gives the red team any indicators of compromise (IoCs) that were detected during the engagement. This data can then be compared to other data collected during the course of the engagement and incorporated into a report timeline. To help draw value from the exercise, the red team works closely with the blue team to explain its tactics, techniques and procedures (TTPs) and how to better detect and respond to such offensive methods in future incidents.
Red team exercises typically focus on living off the land, relying on existing tools that are already built into the operating system. We strive to only use tooling when the team is confident it can help to evade or bypass endpoint detection and response solutions or avoid common threat hunting queries by dedicated teams focused on finding nefarious activities through PowerShell/Sysmon/event logs.
During a red team engagement, the team is more focused on targeting DevOps and end users and using the least obvious ways to gain the minimum elevated privileges required to achieve its objectives.
More Than Red and Blue
What about those other colors of teaming? You may have also seen the terms “purple teaming” or “control tuning” used. These exercises are focused on spot-testing specific prevention and detection capabilities of your security stack against techniques within the MITRE ATT&CK framework and are conducted live in conjunction with the blue team.
Often, automated open-source tools such as MITRE Caldera, Uber Metta, Atomic Red Team and Endgame RTA are used to automate portions of a purple team or control tuning exercise, allowing organizations to easily benchmark their improvement against repeatable metrics.
Like others in the industry, X-Force Red uses the term “adversary simulation” to cover both purple and red team engagements.
A Growing Need for Standardization
The IT security testing industry is relatively new when compared to other professions, and with that new domain comes some growing pains on the quest to standardize different approaches.
Full disclosure: I am on the board of directors for CREST Americas, an international not-for-profit accreditation and certification body that represents and supports the technical information security industry. IBM is on the CREST list of accredited companies. We believe in this nonprofit’s mission of helping mature the security testing industry and enabling companies to request and receive higher value, standardized and repeatable testing.
Notice the importance of the word “request.” That component is an often-overlooked piece of the security testing puzzle. If companies aren’t empowered to properly understand different types of security testing, define their requirements and request security testing using standard terminology, they can’t make an apples-to-apples comparison of vendors and, more importantly, they may continue to see a wide variation in the quality of testing received, the final output and what they can do with the results in the long run.
The industry itself often uses conflicting terminology, exacerbating the problem. It felt like we were finally getting past the vulnerability assessment versus pen test discussion when the term red teaming started becoming frequently — and often improperly — used. For every highly qualified boutique testing shop or mature testing firm out there, there are many unqualified testing vendors, possibly in part due to the immaturity and lack of standardization in the industry as a whole.
To help address challenges on the procurement side, CREST has made progress working directly with the industry and even created a procurement guide to help educate customers on how to request the appropriate type of testing and how to compare testing providers in an apples-to-apples, clear and concise way.
It may be the case that the security industry itself can only align to a common definition/delivery of red team exercises after the adoption of clear guidelines. But how can this happen?
Light on the Horizon
No single company, certification body or testing provider can force an industry to mature overnight, but some factors can help speed up the process. Government initiatives, for example, have helped the information security industry mature the security testing domain, especially in the financial sector.
Initiatives such as CBEST, the U.K. central bank’s red team framework, the Hong Kong Monetary Authority’s iCast framework, and TIBER EU, a new framework from the European Central Bank for threat intelligence-based ethical red teaming, provide the industry with more mature frameworks and clearer standards for testing engagements. And it’s not just Europe; similar initiatives are in early stages in Asia and North and South America as well.
How to Choose the Right Test for Your Needs
Application security testing and network penetration tests are a good fit for spot checks. For example, let’s say you rolled out a new solution and, as part of your software development life cycle, want to ensure the application has been coded correctly and the underlying web server and associated DMZ have been hardened and configured to best practices.
Alternatively, you may want to know how difficult it would be for an attacker to compromise a user’s computer via phishing and use that access to escalate their privileges on the network, move laterally, access a sensitive file-share or gain control over the environment by compromising a privileged user’s account. This type of penetration testing is extremely important. It helps organizations check for common paths attackers use to compromise applications or the underlying network. Penetration tests are quick, taking 1–3 weeks on average to complete.
Purple team/control tuning exercises are a great way to bridge the divide between penetration testing and more adversarial, tactic-focused red team engagements. These exercises help security teams better understand their security stack’s coverage, educate the blue team on common techniques, and prepare the team ahead of manual red team engagements.
Purple teaming is best suited for testing scripted/tabletop scenarios focused on a specific subset of techniques, but it can’t replicate the less predictable actions and wider scope of an advanced adversary during a red team exercise. Control tuning can be an excellent way to quickly gain a basic understanding of your security stack’s detection capabilities. A downside of control tuning is that the automated frameworks don’t often account for variations in common techniques attackers may use based on their own unique playbooks or TTPs, nor do they reflect the reality of an attacker targeting business-specific targets.
That’s really where red team engagements come into play. They are intended to emulate an advanced adversary’s TTP’s to compromise an organization with a well-architected security stack, evade a mature blue/threat hunting team and achieve business case-specific goals over the course of 1–3 months (depending on scope). Red teaming engagements are focused on understanding an organization’s ability to protect against, detect and respond to an advanced attack.
Penetration testing, red teaming and purple teaming/control tuning play important roles in an organization’s overall security testing program. The key is knowing where in your overall strategy to use them.
Learn more about X-Force Red’s Adversary Simulation and penetration testing services.
Global Head of X-Force Red