The chief information security officer (CISO) faces threats such as compromised users, negligent employees and malicious insiders. For this reason, one of the most important tools in the CISO’s arsenal is user behavior analytics (UBA), a solution that scans data from a security information and event management (SIEM) system, correlates it by user and builds a serialized timeline.
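
The correlation step can be pictured with a small sketch. The event records and field names below (`user`, `ts`, `source`, `action`) are hypothetical and stand in for whatever a given SIEM exports; the point is only to show events from several sources being grouped by user and ordered in time.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical SIEM events; field names and values are assumptions for illustration.
events = [
    {"user": "alice", "ts": "2023-01-05T09:14:00", "source": "vpn", "action": "login"},
    {"user": "bob", "ts": "2023-01-05T09:02:00", "source": "ad", "action": "failed_login"},
    {"user": "alice", "ts": "2023-01-05T08:59:00", "source": "ad", "action": "login"},
    {"user": "alice", "ts": "2023-01-05T09:30:00", "source": "dlp", "action": "file_download"},
]


def build_timelines(events):
    """Group events by user, then sort each user's events chronologically."""
    timelines = defaultdict(list)
    for event in events:
        timelines[event["user"]].append(event)
    for user_events in timelines.values():
        user_events.sort(key=lambda e: datetime.fromisoformat(e["ts"]))
    return dict(timelines)


timelines = build_timelines(events)
alice_actions = [(e["source"], e["action"]) for e in timelines["alice"]]
```

Here `alice_actions` holds one user’s activity across three systems in the order it happened, which is the view an analyst would start from.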

How UBA Works

Machine learning models build baselines of normal behavior for each user by looking at historical activity and comparing it to peer groups. Any abnormal events detected are aggregated through a scoring mechanism that generates a combined risk score for each user. Alerts from other security tools can be used in this process as well.
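
As a rough illustration of the scoring idea, the sketch below compares one day’s activity with the user’s own history and with a peer group, then folds both deviations and any external alert points into one number. It deliberately uses a simple standard-deviation measure as a stand-in for a learned model; the weights, the cap and the function names are assumptions, not how any particular product works.

```python
from statistics import mean, pstdev


def deviation(value, history, cap=10.0):
    """Standard deviations by which `value` exceeds the mean of `history`.

    Values at or below the mean count as 0; the result is capped at `cap`.
    """
    mu = mean(history)
    sigma = pstdev(history)
    if sigma == 0:
        return 0.0 if value <= mu else cap
    return min(cap, max(0.0, (value - mu) / sigma))


def risk_score(today, own_history, peer_history, alert_points=0.0, weights=(0.6, 0.4)):
    """Combine own-baseline deviation, peer-group deviation and external alerts."""
    own_dev = deviation(today, own_history)
    peer_dev = deviation(today, peer_history)
    return weights[0] * own_dev + weights[1] * peer_dev + alert_points


# Hypothetical daily file-download counts.
own_history = [10, 12, 11, 9, 13]    # this user's recent days
peer_history = [8, 15, 12, 20, 10]   # same-day counts for peers in the department

normal_day = risk_score(11, own_history, peer_history)
unusual_day = risk_score(40, own_history, peer_history)
unusual_day_with_alert = risk_score(40, own_history, peer_history, alert_points=2.0)
```

An unremarkable day scores zero, a day far above both baselines scores high, and an alert from another tool raises the score further.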

Users at high risk are flagged with information such as job title, department, manager and group membership to enable analysts to quickly investigate that particular user’s behavior in the context of his or her role within the organization. By combining all of a user’s data from disparate systems and utilizing artificial intelligence (AI) to gain insights, UBA empowers analysts with new threat hunting capabilities.

The underlying technology is not new, but its application to security is. Many endpoint products offered today are cloud-based to provide seamless mobile device protection outside the organization. Given the evolving attack landscape and the new challenges faced by security teams, adoption is growing rapidly, and it is quickly becoming a best practice for enterprise security teams.

Machine learning is a branch of AI in which systems learn from data and make judgments without being explicitly programmed for every scenario. That is what sets it apart from static, signature-based products such as SIEM. The technology provides a probabilistic conclusion, which can then be converted into a binary signal. The likelihood of a decision being accurate can be interpreted as a measure of confidence in that conclusion. Security analysts can also validate these conclusions and investigate others that fall into gray areas.
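
One way to picture the conversion from probability to signal is a pair of thresholds: scores above the upper one raise an alert, scores below the lower one are dismissed, and everything in between is the gray area left to an analyst. The threshold values and the `triage` function below are illustrative assumptions.

```python
def triage(probability, alert_at=0.9, dismiss_below=0.3):
    """Map a model's probability of malicious activity to an action."""
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    if probability >= alert_at:
        return "alert"      # confident enough to act as a binary signal
    if probability < dismiss_below:
        return "dismiss"    # confident enough to ignore
    return "review"         # gray area: hand to an analyst


decisions = [triage(p) for p in (0.95, 0.10, 0.50)]
```

Where the two thresholds sit is a tuning decision: moving them apart sends more events to analysts, and moving them together trades analyst workload for more automated mistakes.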

The mathematical algorithms are complex and computationally intensive. Since there is no single model that applies to every attack technique, the selection of the model and data is crucial. This is one reason why these new, evolving endpoint products are based in the cloud and conceivably draw upon data globally from every industry.

Establishing a Behavioral Baseline

Among the advantages of this technology is the ability to quickly and easily distinguish events that are merely anomalous from events that are malicious. Employees change jobs, locations and work habits all the time, so unusual activity is often benign. Machine learning reduces the overwhelming volume of false positives and provides the behavioral baseline DNA of each user.

Machine learning also enables analysts to interpret subtle signals. Attackers know that analysts have tools to find telltale attack signatures, so many pace themselves and act in small steps; behavioral analytics can flag most of these attacks. SIEM correlation rules that look for signature attack behavior, by contrast, can be easily bypassed by deviating slightly from the signature. A correlation rule may look for five failed logins in one minute as an indicator of an abnormal access attempt. An attacker could bypass the rule by delaying the fifth attempt until one second after the minute has elapsed.
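
The failed-login example is easy to reproduce. The sketch below implements a fixed rule of five failures within 60 seconds (the function name and the timestamps, given in seconds, are made up for illustration) and shows a paced attacker slipping under it.

```python
def rule_fires(failed_login_times, threshold=5, window_seconds=60):
    """Return True if `threshold` failures fall within any `window_seconds` span."""
    times = sorted(failed_login_times)
    for i in range(len(times) - threshold + 1):
        if times[i + threshold - 1] - times[i] <= window_seconds:
            return True
    return False


noisy = [0, 10, 20, 30, 40]   # five failures in 40 seconds
paced = [0, 15, 30, 45, 61]   # fifth failure lands one second past the minute

# One failure every 16 seconds for an hour: any five consecutive failures
# span 64 seconds, so the rule never fires.
slow_burst = [i * 16 for i in range(225)]

results = (rule_fires(noisy), rule_fires(paced), rule_fires(slow_burst))
```

The rule catches only the noisy attempt. The slow burst still amounts to 225 failed logins in an hour, which is the kind of departure from a user’s normal behavior that a baseline comparison can surface even though no single window trips the rule.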

Finally, analysts can use machine learning to gain insights beyond individual events. Cyberattacks that have already infiltrated the network might slowly follow the kill chain of reconnaissance, infiltration, spread and detonation. AI pieces together the whole picture to make decisions and aid in incident response.

Evaluating Machine Learning Solutions

There is a lot of marketing noise associated with machine learning technology. Below are some useful approaches to evaluating AI-enabled security solutions.

  • Use case definitions: Determine what you want out of the solution and tailor it toward specifics such as spear phishing attacks, privileged users, malware, etc. This will help formulate a short list of solutions you’re targeting.
  • Pick organizational subsets: Scaling is often a consideration, but for a proof of concept (PoC), consider establishing a small group to evaluate two or three vendors.
  • Get source access: These solutions will need access to certain infrastructure, such as Active Directory log files, to operate. Ensure that the solution has all the appropriate access privileges it needs to function.
  • Understand the results: Machine learning solutions deliver probabilistic results, typically expressed as a percentage. The solution must provide supporting evidence when it flags an event so that analysts can act on it.
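  • Ensure classification accuracy: Evaluate the number of correct predictions as a ratio of all predictions made. This is the most common metric for classification problems, and also the most misused: when malicious events are rare, a model that never flags anything can still score very high accuracy.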
  • Evaluate logarithmic loss: Logarithmic loss is a performance metric for evaluating predicted probabilities of membership in a given class. It reflects how much confidence an algorithm places in its predictions: correct predictions are rewarded, and incorrect ones penalized, in proportion to the confidence of the prediction.
  • Determine who will own it: Common considerations include whether the tool will be a standalone solution or integrated with a SIEM. It can also be part of a security operations center (SOC), with red and blue teams harnessing it, or serve as another layer in the architecture where resources are tight.
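
The two metrics in the list above can be computed by hand during a PoC. The following sketch uses the standard definitions of accuracy and binary logarithmic loss; the labels and predicted probabilities are invented for illustration, with 1 meaning malicious.

```python
import math


def accuracy(y_true, y_prob, threshold=0.5):
    """Share of predictions that match the label after thresholding."""
    predictions = [1 if p >= threshold else 0 for p in y_prob]
    correct = sum(1 for t, p in zip(y_true, predictions) if t == p)
    return correct / len(y_true)


def log_loss(y_true, y_prob, eps=1e-15):
    """Mean binary log loss; probabilities are clipped to avoid log(0)."""
    total = 0.0
    for t, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1 - eps)
        total += -(t * math.log(p) + (1 - t) * math.log(1 - p))
    return total / len(y_true)


# Accuracy misuse: 2 malicious events in 100, and a model that never flags any.
rare_labels = [0] * 98 + [1] * 2
never_flag = [0.0] * 100
misleading_accuracy = accuracy(rare_labels, never_flag)

# Log loss separates models that accuracy cannot tell apart.
labels = [1, 0, 1, 0]
cautious = [0.6, 0.4, 0.6, 0.4]           # right, but barely
confident = [0.95, 0.05, 0.95, 0.05]      # right and sure
confident_miss = [0.95, 0.05, 0.05, 0.05] # sure, but wrong on the third event
```

The never-flag model reaches 98 percent accuracy while catching nothing. The cautious and confident models both score perfect accuracy on `labels`, yet the confident one has the lower log loss, and the confidently wrong prediction in `confident_miss` is penalized heavily.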

Augmenting Human Intelligence

Always remember that these technologies are not silver bullets. Buyers of enterprise security products need to educate themselves on the basics of these technologies to avoid succumbing to the hype. Two standard deviations from the mean do not constitute machine learning, and five failed logins in one minute do not constitute artificial intelligence. In the absence of other information, there is no predictive value in seeing, for example, that an employee visited a website based in Russia.

These solutions provide a probability that a certain conclusion is accurate, and that probability depends on the underlying model. The real outcome is somewhere in the middle. Despite the hype surrounding artificial intelligence, all it does is provide mathematical suspicions, not confirmations. To maximize the effectiveness of artificial intelligence for cybersecurity, machine learning must be paired with savvy security analysts.
