The chief information security officer (CISO) faces threats such as compromised users, negligent employees and malicious insiders. For this reason, one of the most important tools in the CISO’s arsenal is user behavior analytics (UBA), a solution that ingests data from a security information and event management (SIEM) system, correlates it by user and builds a serialized timeline of each user’s activity.

How UBA Works

Machine learning models build baselines of normal behavior for each user by looking at historical activity and comparing it to peer groups. Any abnormal events detected are aggregated through a scoring mechanism that generates a combined risk score for each user. Alerts from other security tools can be used in this process as well.
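The baselining and scoring steps can be sketched in a few lines. This is an illustrative toy model, not any vendor's actual algorithm: it baselines a single numeric feature (say, daily login count) per user with a z-score, and caps each event's contribution so one outlier cannot dominate the combined risk score. The cap and scale values are arbitrary assumptions.

```python
from statistics import mean, stdev

def event_anomaly_score(value, history):
    """Z-score of a new observation against the user's historical baseline."""
    if len(history) < 2:
        return 0.0
    mu, sigma = mean(history), stdev(history)
    return 0.0 if sigma == 0 else abs(value - mu) / sigma

def combined_risk_score(anomaly_scores):
    """Aggregate per-event anomaly scores into one 0-100 user risk score.
    Each event's contribution is capped so a single outlier cannot
    dominate the combined score."""
    capped = [min(s, 10.0) for s in anomaly_scores]
    return min(100.0, 4.0 * sum(capped))

# A user who normally logs in ~10 times a day suddenly logs in 30 times:
history = [10, 11, 9, 10, 12]
score = event_anomaly_score(30, history)  # well above 3 standard deviations
```

A production system would baseline many features at once and weight them against peer-group behavior, but the shape of the computation is the same.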

Users at high risk are flagged with information such as job title, department, manager and group membership, enabling analysts to quickly investigate a particular user’s behavior in the context of their role within the organization. By combining all of a user’s data from disparate systems and applying artificial intelligence (AI) to gain insights, UBA gives analysts new threat hunting capabilities.

Machine learning itself is not new, but its application to the security environment is. Many endpoint products offered today are cloud-based to provide seamless protection for mobile devices outside the organization. Given the evolving attack landscape and the new challenges facing security teams, adoption is growing rapidly, and UBA is quickly becoming a best practice for enterprise security teams.

Machine learning technology uses techniques that harness AI to learn and make judgments without being explicitly programmed for every scenario. It differs from static, rule- and signature-based products such as a traditional SIEM because it learns from data. The technology produces a probabilistic conclusion, which can then be converted into a binary signal. The likelihood that a decision is accurate can be interpreted as a measure of confidence in that conclusion. Security analysts can validate these conclusions and investigate others that fall into gray areas.
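Converting a probabilistic conclusion into a binary signal amounts to thresholding, as in the minimal sketch below. The threshold and gray-area bounds are arbitrary assumptions for illustration; real deployments tune them to the organization's alert-volume tolerance.

```python
def to_binary_signal(p_malicious, threshold=0.8, gray_low=0.4):
    """Turn a model's probabilistic output into an alert decision.

    Returns (alert, confidence, gray_area):
      alert      -- fire only when the probability clears the threshold
      confidence -- how sure the model is, in either direction
      gray_area  -- mid-range scores an analyst should review manually
    """
    alert = p_malicious >= threshold
    confidence = max(p_malicious, 1.0 - p_malicious)
    gray_area = gray_low <= p_malicious < threshold
    return alert, confidence, gray_area
```

A score of 0.95 fires an alert with high confidence; a score of 0.6 does not fire but lands in the gray area for analyst review; a score of 0.1 is confidently benign.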

The mathematical algorithms are complex and computationally intensive. Since no single model applies to every attack technique, the selection of the model and the data is crucial. This is one reason why these new, evolving endpoint products are based in the cloud and can conceivably draw on data globally from every industry.

Establishing a Behavioral Baseline

Among the advantages of this technology is the ability to quickly and easily distinguish benign anomalies from malicious events. Employees change jobs, locations and work habits all the time. Machine learning alleviates the overwhelming volume of false positives such changes would otherwise produce and builds a behavioral “DNA” baseline for each user.

Machine learning also enables analysts to interpret subtle signals. Behavioral analytics can flag most attacks that pace themselves and act in small steps, but attackers know that analysts have tools to find telltale attack signatures. SIEM correlation rules that look for a signature attack behavior, for instance, can be bypassed simply by deviating from that signature. A correlation rule may look for five failed logins in one minute as an indicator of an abnormal access attempt; an attacker can evade it by spacing the attempts so that the fifth failure lands one second after the minute elapses.
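The five-failed-logins rule and its timing bypass can be demonstrated concretely. This sketch implements a generic sliding-window correlation rule (the window and count values come from the example above; the timestamps are made up):

```python
def rule_fires(fail_times, window=60.0, count=5):
    """SIEM-style correlation rule: fire when `count` failed logins
    occur within any `window`-second span."""
    fail_times = sorted(fail_times)
    return any(
        fail_times[i + count - 1] - fail_times[i] < window
        for i in range(len(fail_times) - count + 1)
    )

burst = [0, 5, 10, 15, 20]       # five failures in 20 seconds
paced = [0, 15.5, 31, 46.5, 62]  # fifth failure lands just past the minute
```

The burst triggers the rule; the paced attempts slip through, even though both sequences contain the same five failures. A behavioral model that baselines a user's failure rate per hour would still see five failures where the norm is near zero.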

Finally, analysts can use machine learning to gain insights beyond individual events. Cyberattacks that have already infiltrated the network might slowly follow the kill chain of reconnaissance, infiltration, spread and detonation. AI pieces together the whole picture to make decisions and aid in incident response.

Evaluating Machine Learning Solutions

There is a lot of marketing noise associated with machine learning technology. Below are some useful approaches to evaluating AI-enabled security solutions.

  • Use case definitions: Determine what you want out of the solution and tailor it toward specifics such as spear phishing attacks, privileged users, malware, etc. This will help formulate a short list of solutions you’re targeting.
  • Pick organizational subsets: Scaling is often a consideration, but for a proof of concept (PoC), consider establishing a small group to evaluate two or three vendors.
  • Get source access: These solutions will need access to certain infrastructure, such as Active Directory log files, to operate. Ensure that the solution has all the appropriate access privileges it needs to function.
  • Understand the results: Machine learning solutions deliver probabilistic results based on a percentage. The solution must provide supporting evidence when it flags an event so that analysts can act on it.
  • Ensure classification accuracy: Evaluate the number of correct predictions as a ratio of all predictions made. This is the most common metric for classification problems, and also the most misused: when attacks are rare, a model that predicts “benign” every time can still score near-perfect accuracy.
  • Evaluate logarithmic loss: Logarithmic loss is a performance metric for evaluating predicted probabilities of membership in a given class. It serves as a measure of confidence in an algorithm’s predictions: correct and incorrect predictions are rewarded or penalized in proportion to how confident the prediction was.
  • Determine who will own it: Common considerations include whether the tool will be a standalone solution or integrated with an SIEM. It can also be part of a security operations center (SOC) with red and blue teams harnessing it or another layer in the architecture where resources are tight.
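The two evaluation metrics above can be computed with nothing but the standard library. This sketch uses made-up toy figures to show why accuracy misleads on imbalanced security data and why log loss punishes confident mistakes:

```python
import math

def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def log_loss(y_true, y_prob, eps=1e-15):
    """Average negative log-likelihood of the true labels. Confidently
    wrong predictions are penalized far more heavily than hesitant ones."""
    total = 0.0
    for t, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1 - eps)  # clip to avoid log(0)
        total -= t * math.log(p) + (1 - t) * math.log(1 - p)
    return total / len(y_true)

# With 1 attack in 100 events, a model that always predicts "benign" (0)
# still scores 99% accuracy -- which is why accuracy alone is misused.
labels = [0] * 99 + [1]
naive = [0] * 100
```

Log loss, by contrast, grows sharply as a model assigns low probability to what actually happened, so it rewards well-calibrated confidence rather than raw hit rate.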

Augmenting Human Intelligence

Always remember that these technologies are not silver bullets. Buyers of enterprise security products need to educate themselves on the basics of these technologies to avoid succumbing to the hype. Two standard deviations from the mean do not constitute machine learning, and five failed logins in one minute do not constitute artificial intelligence. In the absence of other information, there is no predictive value in seeing, for example, that an employee visited a website based in Russia.

These solutions provide a probability that a certain conclusion is accurate, depending on the underlying algorithm and model; the real answer is rarely a clean yes or no. Despite the hype surrounding artificial intelligence, all it provides are mathematical suspicions, not confirmations. To maximize the effectiveness of artificial intelligence for cybersecurity, machine learning must be paired with savvy security analysts.

Read the white paper: Cybersecurity in the cognitive era
