Machine learning (ML) has brought us self-driving cars, machine vision, speech recognition, biometric authentication and the ability to unlock the human genome. But it has also given attackers a variety of new attack surfaces and ways to wreak havoc.

Machine learning applications are unlike those that came before them, making it all the more important to understand their risks. What are the potential consequences of an attack on a model that controls networks of connected autonomous vehicles or coordinates access controls for hospital staff? The results of a compromised model can be catastrophic in these scenarios, but there are also more prosaic threats to consider, such as fooling biometric security controls into granting access to unauthorized users.

Machine learning is still in its early stages of development, and the attack vectors are not yet clear. Cyberdefense strategies are also in their nascent stages. While we can’t prevent all forms of attacks, understanding why they occur helps us narrow down our response strategies.

A Structured Approach to Machine Learning Security

Threat modeling is a security optimization process that applies a structured approach to identifying and addressing threats. Machine learning security threat modeling does the same thing for ML models. It’s used at the early stages of building and deploying ML models to identify all possible threats and attack vectors.

There are four fundamental questions to ask.

Who Are the Threat Actors?

Threat actors can range from nation-states to hacktivists to rogue employees. Each category of potential adversary has different characteristics that call for different defense and response strategies. Their reasons for attacking also vary, which is why the “why” and “what” questions described below are so critical.

Why Are They Threats and What Are Their Motivations?

A wide range of factors can influence attackers to target ML systems. Defense strategies should proceed from the CIA triad, which is a three-sided information security governance model that encompasses confidentiality, integrity and availability:

  • Confidentiality is about ensuring that only those with appropriate rights can access information. These protections guard against an attacker who wants to extract sensitive information from the model or its training data.
  • An integrity attack might attempt to influence the model’s behavior, such as returning false positives in a facial recognition system. Protections such as frequent backups, digital signatures and audits ensure that information isn’t altered or tampered with.
  • An availability attack may be aimed at reducing the consistency, performance or access to the machine learning model. Good availability practices, such as maintaining redundant servers and applying data loss prevention tools, make information available when needed.
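The integrity practices above can be applied directly to ML artifacts. As a minimal sketch (assuming model weights are shipped as a file and a trusted digest is published out of band), a deployment pipeline can refuse to load any artifact whose hash has drifted:

```python
import hashlib


def file_digest(path: str) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(8192), b""):
            h.update(chunk)
    return h.hexdigest()


def verify_model(path: str, expected_digest: str) -> bool:
    """True only if the artifact on disk matches the trusted digest."""
    return file_digest(path) == expected_digest
```

A full deployment would go further (e.g., signing the digest with a private key), but even a bare hash check detects a model file that was swapped or tampered with in transit.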

How Will They Attack?

Machine learning systems open new avenues for attacks that don’t exist in conventional procedural programs. One of these is the evasion or adversarial attack, in which an adversary feeds the model inputs deliberately crafted to trigger mistakes. The data may look fine to humans, but subtle perturbations can send ML algorithms wildly off track.
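To make this concrete, here is a toy sketch of the idea behind gradient-based evasion (in the style of the fast gradient sign method). The classifier, weights and input below are all hypothetical; for a linear model the gradient of the score with respect to the input is just the weight vector, so a small signed step per feature is enough to flip the prediction:

```python
import numpy as np

# Hypothetical linear classifier: positive score => class 1.
w = np.array([1.0, -2.0, 0.5])
x = np.array([2.0, 0.5, 1.0])   # clean input; score = 1.5 -> class 1


def predict(w, x):
    return int(w @ x > 0)


# Evasion step: nudge each feature against the score's gradient.
# For a linear model, that gradient w.r.t. x is simply w.
eps = 0.8
x_adv = x - eps * np.sign(w)    # perturbation bounded by eps per feature
```

Each feature moves by at most 0.8, which might be imperceptible in a real feature space, yet the perturbed input lands on the other side of the decision boundary.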

Such attacks may occur at inference time by exploiting the model’s internal information, typically in one of two ways. In a white box attack, the attacker has inside knowledge of the model, obtained either directly or through untrusted actors in the data processing pipeline. In a black box scenario, the attacker knows nothing about the system’s internal workings but identifies vulnerabilities by repeatedly probing the model and finding patterns in its responses that betray the underlying logic.
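Black box probing can be surprisingly cheap. As a hedged illustration (the "oracle" below is a hypothetical deployed model the attacker can only query, not inspect), a hidden one-dimensional decision boundary can be recovered to high precision with a handful of binary-search queries:

```python
def oracle(x: float) -> int:
    """Hypothetical deployed model: the attacker sees only its outputs."""
    return int(x > 3.7)   # hidden threshold, unknown to the attacker


def probe_threshold(lo: float = 0.0, hi: float = 10.0, tol: float = 1e-3):
    """Recover the hidden decision boundary purely by querying the oracle.

    Returns the estimated threshold and the number of queries spent.
    """
    queries = 0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        queries += 1
        if oracle(mid):
            hi = mid   # boundary is at or below mid
        else:
            lo = mid   # boundary is above mid
    return (lo + hi) / 2, queries
```

Real models have far higher-dimensional inputs, but the principle is the same: enough queries leak enough structure, which is why rate limiting and query monitoring are common defenses.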

New Threat Vectors

There are two dimensions we can use to classify the “how” of an ML attack: inference and training. In an attack at the inference phase, the enemy has specific information about the model and/or the data used to train it. It isn’t necessary to have direct access to the system to obtain this information. Exploratory techniques, such as side-channel and remote attacks, can permit an adversary to penetrate deployed ML systems by inferring their logic through responses to inputs or by targeting the underlying hardware directly.

Attacks at the training phase attempt to learn and corrupt the model. Depending on the availability of data, the attacker may use substitute models to test potential inputs before submitting them to the victim.
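The substitute-model tactic can be sketched in a few lines. Everything below is a toy, hypothetical setup: the victim is a black box the attacker can only query for labels, the substitute is a crude linear proxy (difference of class means), and the hope is that an evasion input crafted against the substitute transfers to the victim because their decision boundaries are similar:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical victim model: attacker can query it but not inspect it.
w_victim = np.array([1.5, -1.0])


def victim(x):
    return int(w_victim @ x > 0)


# 1. Query the victim on attacker-chosen inputs to build a labeled set.
X = rng.normal(0, 1, (200, 2))
y = np.array([victim(x) for x in X])

# 2. Fit a substitute: difference of class means as a rough linear proxy
#    for the victim's (unknown) weight vector.
w_sub = X[y == 1].mean(axis=0) - X[y == 0].mean(axis=0)

# 3. Craft an evasion input against the substitute. If the boundaries
#    align, it fools the victim too, without ever seeing its internals.
x0 = np.array([1.0, 0.0])            # victim classifies this as class 1
x_adv = x0 - 1.5 * np.sign(w_sub)    # step against the substitute's gradient
```

This transferability of adversarial examples between models is exactly why keeping a model's parameters secret is not, by itself, a sufficient defense.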

There are also two ways to alter the model. The injection method modifies existing data by inserting untrusted components, causing the results of the model to become untrustworthy. A particularly dangerous alternative approach is logic corruption, in which the attacker changes the learning algorithm itself. This tactic is extremely dangerous because intruders can effectively take control of systems and direct them to produce whatever output they want.

Machine Learning Attacks

Putting all of these factors together, we can identify three distinct attacks that target different phases of the machine learning life cycle:

  1. Evasion attacks — These attacks are the most common. Typically performed during inference time, evasion attacks are intended to introduce inputs that cause the model to produce incorrect results.
  2. Poisoning attacks — These attacks are carried out during the training stage and are intended to threaten integrity and availability. Poisoning alters training data sets by inserting, removing or editing decision points to change the boundaries of the target model.
  3. Privacy attacks — These usually happen at inference time. The intent is not to corrupt the model, but to retrieve sensitive information about the model or the data it was trained on.
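Poisoning's effect on decision boundaries is easy to demonstrate on a toy problem. In this hedged sketch (the one-dimensional data and the "model", a midpoint-of-class-means threshold, are both hypothetical), a handful of mislabeled points injected into the training set visibly drags the learned boundary:

```python
import numpy as np

rng = np.random.default_rng(0)

# Clean training data: class 0 clustered near -2, class 1 near +2.
X = np.concatenate([rng.normal(-2, 0.5, 50), rng.normal(2, 0.5, 50)])
y = np.concatenate([np.zeros(50), np.ones(50)])


def fit_threshold(X, y):
    """Tiny 'model': decision boundary at the midpoint of the class means."""
    return (X[y == 0].mean() + X[y == 1].mean()) / 2


clean_boundary = fit_threshold(X, y)

# Poisoning: inject 10 points far to the right, mislabeled as class 0,
# dragging the class-0 mean (and hence the boundary) upward.
X_p = np.concatenate([X, np.full(10, 6.0)])
y_p = np.concatenate([y, np.zeros(10)])
poisoned_boundary = fit_threshold(X_p, y_p)
```

Only 10 of 110 training points are poisoned, yet the boundary shifts enough to misclassify legitimate class-1 inputs near it, which is why provenance checks and outlier filtering on training data matter.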

In addition to the above, there are several attacks that may occur at either or both of the training and inference stages. They go by names such as anchor points, mimicry, model extraction, path finding, least cost, and constrained and gradient descent.

Unfortunately, we can expect new attack types to emerge as ML goes mainstream, but understanding the basic vulnerabilities and prevention tactics is the first step toward combating them.
