Machine learning has become one of the most popular and powerful tools for securing systems. Some approaches, however, have yielded overly aggressive models that demonstrate remarkable predictive accuracy yet generate false positives. False positives create negative user experiences that keep new protections from being deployed. IT personnel also find these false alarms disruptive when they are working to detect and eliminate malware.

The Ponemon Institute recently reported that over 20 percent of endpoint security investigation spending was wasted on these false alarms. IBM’s Ronan Murphy and Martin Borrett also noted that one of Watson’s critical goals is to present security issues to researchers without “drowning them in false alarms.”

Why Are Some Machine Learning Approaches So Prone to False Positives?

Machine learning works to draw relationships between different elements of data. To provide endpoint security solutions, most models search for features that can provide the most context about malware threats. In other words, the models are trained to recognize good software and bad software in order to block the bad.

Many newer solutions on the market aim to identify a wider variety of malicious code than existing products to highlight the need for more protection. However, when models are trained with a bias toward identifying malware, they are more likely to lump good software in with the bad, and thus create false positives.
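The trade-off in the paragraph above can be illustrated with a small sketch. It assumes a hypothetical model that assigns each file a score between 0 and 1, where higher means "looks more like malware," and blocks anything at or above a threshold. The scores, the `rates` helper and the thresholds are invented for illustration and do not come from any real product.

```python
# Hypothetical model scores in [0, 1]; higher means "looks more like malware".
# These values are made up purely to illustrate the trade-off.
malware_scores = [0.95, 0.90, 0.80, 0.65, 0.55]
benign_scores = [0.05, 0.10, 0.20, 0.45, 0.60]


def rates(threshold):
    """Return (detection rate, false positive rate) at a blocking threshold."""
    detected = sum(score >= threshold for score in malware_scores)
    false_positives = sum(score >= threshold for score in benign_scores)
    return detected / len(malware_scores), false_positives / len(benign_scores)


# A cautious threshold misses some malware but blocks no good software.
print(rates(0.7))  # (0.6, 0.0)

# An aggressive threshold catches every sample but also blocks good software.
print(rates(0.4))  # (1.0, 0.4)
```

In this toy data, pushing the threshold down to catch the last two malware samples also blocks two of the five good applications, which is the bias toward identifying malware that the text describes.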

This imbalance is made worse by how difficult it is to capture a representative sample of good software, particularly custom software. New tools have made it simpler and faster for organizations to create or combine their own applications, and many business applications are developed for a specific use at a specific firm. Gathering tens of thousands of malware samples is straightforward, and those samples represent threats common to all organizations. Gathering a similar quantity of good software, however, usually means collecting well-known, packaged applications. As a result, models learn to recognize the differences between malware and common packaged software but never learn the profile of the custom or lesser-known applications that may also be present.

Assessing the Business Impact

We’ve already talked about alert fatigue and the wasted investment of tracking down false positive results. These impacts, though, are mainly felt by the IT or security group. The real damage is caused by the effect on individual users: When a preventive solution believes it has found malicious code, it stops that code from running. In the case of a false positive, this means that users cannot run an application they need to do their jobs.

According to a Barkly survey of IT administrators, 42 percent of companies believe that their users lost productivity as a result of false positive results. This creates a choke point for IT and security administrators in the business life cycle. To manage false positives, companies should create new processes to minimize their duration and recurrence.

In some cases, the process of recognizing, repairing and avoiding false positives can take on a life of its own. Even in a midsized organization, the number of different software packages can run into the hundreds. Even if each package is updated only once a year, every day could bring multiple new executables, each of which is a potential false positive. Companies then have to allocate budget for whitelisting or exception creation.
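The arithmetic behind this estimate can be sketched as follows. The function name and the input figures (500 packages, three executables per update) are assumptions chosen for illustration, not numbers from the survey or from any specific organization.

```python
def new_executables_per_day(packages, updates_per_year, executables_per_update):
    """Average number of new executables arriving per day across all packages."""
    return packages * updates_per_year * executables_per_update / 365


# Assumed figures: 500 packages, each updated once a year,
# each update shipping 3 executables.
print(round(new_executables_per_day(500, 1, 3), 2))  # 4.11

# With one executable per update, 365 packages average one new file a day.
print(new_executables_per_day(365, 1, 1))  # 1.0
```

Under these assumptions, an annual update cycle alone yields about four new executables a day, and more frequent updates scale the figure linearly.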

Designing a Better Approach

A critical component of modern machine learning is its ability to quickly gather insight from new data and adapt. Considering how biases lead to false positives, it is clear that models will need to be sensitive to the particular software profile of each organization.

In the same way that machine learning can be a groundbreaking technology for recognizing new malware, it can also be trained on a company’s newest software. The best body of good software to train with resides within the organization. By training against both the broadest samples of malware and the most relevant samples of good software, the models can deliver the best protection with the highest accuracy and the lowest false positive rate.
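A toy example can show why the organization's own software matters as training data. The sketch below uses a nearest-neighbor classifier over made-up two-number feature vectors. The feature values, the `classify` helper and the sample names are all hypothetical, and real endpoint products use far richer features and models.

```python
import math

# Hypothetical two-number feature vectors; all values are invented.
MALWARE = [(0.90, 0.80), (0.80, 0.90), (0.85, 0.70)]
PACKAGED_GOOD = [(0.10, 0.20), (0.20, 0.10), (0.15, 0.25)]
LOCAL_GOOD = [(0.68, 0.62), (0.72, 0.58)]  # earlier builds of an in-house app


def classify(sample, malware, good):
    """Label a sample by its single nearest training example (1-NN)."""
    nearest_malware = min(math.dist(sample, m) for m in malware)
    nearest_good = min(math.dist(sample, g) for g in good)
    return "malware" if nearest_malware < nearest_good else "good"


new_inhouse_build = (0.70, 0.60)
real_threat = (0.88, 0.82)

# Trained only on packaged software, the in-house app is a false positive.
print(classify(new_inhouse_build, MALWARE, PACKAGED_GOOD))  # malware

# Adding the organization's own good software corrects the verdict...
print(classify(new_inhouse_build, MALWARE, PACKAGED_GOOD + LOCAL_GOOD))  # good

# ...while a genuine threat is still detected.
print(classify(real_threat, MALWARE, PACKAGED_GOOD + LOCAL_GOOD))  # malware
```

The malware samples are identical in every call. Only the set of good software changes, and that alone removes the false positive in this toy data.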

Achieving Balance

The Barkly survey revealed that IT professionals began to doubt the urgency of alerts once they saw more false positives than valid alerts. To provide maximum value while reducing the pressure on overworked staff, security based on machine learning must balance blocking malicious software against disrupting the regular use of business applications. This requires a robust understanding of an organization’s good software, in addition to identifying and training on malicious software. That balance is where thoughtful machine learning delivers its true security value.
