As organizations embrace generative AI, they expect a host of benefits from these projects, from efficiency and productivity gains to improved speed of business to more innovation in products and services. However, one factor that forms a critical part of this AI innovation is trust. Trustworthy AI relies on understanding how the AI works and how it makes decisions.

According to a survey of C-suite executives from the IBM Institute for Business Value, 82% of respondents say secure and trustworthy AI is essential to the success of their business, yet only 24% of current generative AI projects are being secured. That leaves a staggering gap in securing even the AI projects organizations know about. Add to this the “shadow AI” already present within organizations, and the security gap for AI becomes even more sizable.

Challenges to securing AI deployment

Organizations have a whole new pipeline of projects being built that leverage generative AI. During the data collection and handling phase, you need to collect huge volumes of data to feed the model, and you are granting access to many different people, including data scientists, engineers, developers and others. Centralizing all of that data in one place and giving many people access to it is inherently risky. In effect, generative AI becomes a new type of data store that can create new data based on existing organizational data. Whether you trained the model, fine-tuned it or connected it to a retrieval-augmented generation (RAG) pipeline backed by a vector database, that data likely contains personally identifiable information (PII) and other sensitive information. This mound of sensitive data is a blinking red target that attackers will try to access.
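What that screening looks like varies by stack, but as a purely illustrative sketch, a pipeline feeding documents into a RAG vector store can check for obvious PII before anything is embedded and indexed. The function names, patterns and thresholds below are assumptions for the example, not part of any specific product, and real deployments rely on far more robust classifiers than regular expressions.

```python
import re
from dataclasses import dataclass

# Illustrative patterns only; production systems use dedicated
# PII-detection and classification services, not keyword regexes.
PII_PATTERNS = {
    "email": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "credit_card": re.compile(r"\b(?:\d[ -]?){13,16}\b"),
}

@dataclass
class ScreeningResult:
    document_id: str
    findings: dict  # pattern name -> match count

def screen_document(document_id: str, text: str) -> ScreeningResult:
    """Count likely-PII matches so a policy engine (or a human) can decide
    whether the document may be embedded and indexed."""
    findings = {
        name: len(pattern.findall(text))
        for name, pattern in PII_PATTERNS.items()
        if pattern.search(text)
    }
    return ScreeningResult(document_id, findings)

def safe_to_index(result: ScreeningResult) -> bool:
    """Block indexing when any PII pattern fires; redact or re-route instead."""
    return not result.findings

if __name__ == "__main__":
    doc = "Contact jane.doe@example.com, SSN 123-45-6789."
    result = screen_document("doc-001", doc)
    print(result.findings, "indexable:", safe_to_index(result))
```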

Within model development, new applications are being built in a brand-new way, with new vulnerabilities that become new entry points attackers will try to exploit. Development often starts with data science teams downloading and repurposing pre-trained open-source machine learning models from online repositories such as Hugging Face or TensorFlow Hub. These model-sharing repositories emerged out of the inherent complexity of data science, a shortage of practitioners, and the value they provide in dramatically reducing the time and effort required for generative AI adoption. However, such repositories can lack comprehensive security controls, which ultimately passes the risk on to the enterprise, and attackers are counting on it. They can inject a backdoor or malware into one of these models and upload the infected model back into the repository, affecting anyone who downloads it. The general scarcity of security controls around ML models, coupled with the increasingly sensitive data those models are exposed to, means that attacks targeting them have a high propensity for damage.
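There is no single standard control here, but one common precaution is to treat downloaded model artifacts as untrusted until they pass basic checks: prefer non-executable weight formats such as safetensors over pickle-based files (which can run arbitrary code when loaded), and verify files against hashes pinned when the model was first vetted. The manifest layout and helper names below are hypothetical, shown only to make the idea concrete.

```python
import hashlib
import json
from pathlib import Path

# Pickle-based formats can execute arbitrary code when deserialized.
PICKLE_BASED = {".pkl", ".pt", ".bin", ".ckpt"}

def sha256_of(path: Path) -> str:
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()

def audit_model_dir(model_dir: Path, manifest_path: Path) -> list:
    """Compare a downloaded model directory against a pinned manifest of
    file hashes and flag pickle-based files that could hide a backdoor."""
    pinned = json.loads(manifest_path.read_text())  # {"relative/path": "sha256"}
    issues = []
    for path in sorted(model_dir.rglob("*")):
        if not path.is_file():
            continue
        rel = str(path.relative_to(model_dir))
        if path.suffix in PICKLE_BASED:
            issues.append(f"{rel}: pickle-based format, prefer safetensors")
        expected = pinned.get(rel)
        if expected is None:
            issues.append(f"{rel}: not in pinned manifest")
        elif sha256_of(path) != expected:
            issues.append(f"{rel}: hash mismatch with pinned manifest")
    return issues
```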

And during inferencing and live use, attackers can manipulate prompts to jailbreak guardrails and coax models into misbehaving, generating disallowed responses to harmful prompts, including biased, false and otherwise toxic content that inflicts reputational damage. Or attackers can probe the model and analyze input-output pairs to train a surrogate model that mimics the behavior of the target model, effectively “stealing” its capabilities and costing the enterprise its competitive advantage.
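Defenses here typically combine model-side guardrails with application-side checks. As a hedged, purely illustrative sketch (the phrase lists are made up and far too simple for production), an application can screen prompts for common jailbreak markers and screen responses before returning them:

```python
import re

# Toy indicators only; real systems rely on trained classifiers,
# policy engines and model-side guardrails, not keyword lists.
JAILBREAK_MARKERS = [
    r"ignore (all|any|previous) instructions",
    r"pretend you have no (rules|restrictions|guardrails)",
    r"developer mode",
]
DISALLOWED_OUTPUT_MARKERS = [
    r"\b\d{3}-\d{2}-\d{4}\b",  # SSN-like strings leaking from context
]

def looks_like_injection(prompt: str) -> bool:
    return any(re.search(p, prompt, re.IGNORECASE) for p in JAILBREAK_MARKERS)

def output_violates_policy(response: str) -> bool:
    return any(re.search(p, response) for p in DISALLOWED_OUTPUT_MARKERS)

def guarded_generate(prompt: str, generate) -> str:
    """Wrap a text-generation callable with pre- and post-checks."""
    if looks_like_injection(prompt):
        return "Request blocked by usage policy."
    response = generate(prompt)
    if output_violates_policy(response):
        return "Response withheld by usage policy."
    return response
```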


Critical steps to securing AI

Different organizations are taking different approaches to securing AI as the standards and frameworks for doing so evolve. IBM’s framework for securing AI revolves around securing the key tenets of an AI deployment: securing the data, securing the model and securing the usage. In addition, you need to secure the infrastructure on which the AI models are built and run, and establish AI governance that continuously monitors for fairness, bias and model drift over time.

  • Securing the data: Organizations need to centralize and collate massive amounts of data to get the most out of generative AI and maximize its value. Whenever you start combining and centralizing your crown jewels in one place, you expose yourself to significant risk, so you need a data security plan to identify and protect sensitive data.
  • Securing the model: Many organizations are downloading models from open sources to accelerate development efforts. Data scientists are downloading these black-box models with no visibility into how they work. Attackers have the same access to these online model repositories: they can plant a backdoor or malware in one of these models and upload it back into the repository, creating an entry point into any organization that downloads the infected model. You need to understand the vulnerabilities and misconfigurations in your deployment.
  • Securing the usage: Organizations need to ensure safe usage of their AI deployments. Threat actors could execute a prompt injection, using malicious prompts to jailbreak models, gain unwarranted access, steal sensitive data or bias outputs. Attackers can also craft inputs to collect model outputs, accumulating a large dataset of input-output pairs to train a surrogate model that mimics the behavior of the target model, effectively “stealing” its capabilities. You need to understand how the model is being used and map that usage to assessment frameworks to ensure it stays safe (a minimal usage-monitoring sketch follows this list).
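To make the usage point concrete, the sketch below shows one simple, hypothetical way to watch for extraction-style probing: count each caller’s queries in a sliding window and flag callers whose volume or prompt diversity is unusually high. The thresholds and interfaces are assumptions for illustration only.

```python
import time
from collections import defaultdict, deque

class UsageMonitor:
    """Flag callers whose query volume in a sliding window suggests they may
    be harvesting input-output pairs to train a surrogate model."""

    def __init__(self, window_seconds: int = 3600, max_queries: int = 500,
                 max_distinct_prompts: int = 400):
        self.window = window_seconds
        self.max_queries = max_queries
        self.max_distinct = max_distinct_prompts
        self._events = defaultdict(deque)  # caller -> deque[(timestamp, prompt hash)]

    def record(self, caller_id: str, prompt: str) -> bool:
        """Record a query; return True if the caller should be reviewed."""
        now = time.time()
        events = self._events[caller_id]
        events.append((now, hash(prompt)))
        # Drop events that have aged out of the sliding window.
        while events and now - events[0][0] > self.window:
            events.popleft()
        distinct = len({h for _, h in events})
        return len(events) > self.max_queries or distinct > self.max_distinct
```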

And all this needs to be done while maintaining regulatory compliance.

Introducing IBM Guardium AI Security

As organizations contend with existing threats and the growing cost of data breaches, securing AI will be a major initiative, and one where many organizations will need support. To help organizations adopt secure and trustworthy AI, IBM has launched IBM Guardium AI Security. Building on decades of experience in data security with IBM Guardium, this new offering helps organizations secure their AI deployments.

It helps you manage the security risks and vulnerabilities of sensitive AI data and AI models: identify and fix vulnerabilities in AI models, protect sensitive data, continuously monitor for AI misconfigurations, detect data leakage and optimize access control, all with a trusted leader in data security.

This new offering is part of IBM Guardium Data Security Center, which empowers security and AI teams to collaborate across the organization through integrated workflows, a common view of data assets and centralized compliance policies.

Securing AI is a journey that requires collaboration across cross-functional teams, including security, risk and compliance, and AI teams, and organizations need to take a programmatic approach to securing their AI deployments.

See how Guardium AI Security can help your organization, and sign up for our webinar to learn more.
