I have had the title of this blog post as the quote in my email footer for a couple of years now. Even after all this time, it still makes me pause for thought.
One of the biggest gaps I see, even at large companies, is a lack of data or evidence. This seems at odds with the flashy focus cyberdefense currently has on big data, and with the sheer volume being produced: every day, we create 2.5 quintillion bytes of data, so much that 90 percent of the data in the world today has been created in the past two years alone. I think many corporations technically possess the data, but it is not available for use.
What Comprises Security Data?
The more data you have, the more insight you can gain. This sounds a little cliché, but it is spot-on when it comes to security data. The following are typically classified as security data:
- System Logs: “Craig logged in to system XYZ.”
- Application Logs: “Application XYZ has processed Transaction UVW.”
- Infrastructure Logs: “The firewall has blocked a packet from System XYZ to System UVW.”
Then, you add business context, such as the following:
- GHI is PCI environment.
- JKL is a critical business system.
- This is external.
- This is internal.
- This is an admin user.
- That is a high-transaction system.
- This is my Virtual Private Network.
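To make the idea of adding business context concrete, here is a minimal Python sketch. The `Event` class, the `BUSINESS_CONTEXT` table and the tag names are all hypothetical and chosen for illustration; a real deployment would pull this context from an asset inventory or from consultation with the business.

```python
from dataclasses import dataclass, field

# Hypothetical business-context table: system name -> set of tags.
# The names and tags mirror the examples in the lists above.
BUSINESS_CONTEXT = {
    "GHI": {"pci"},
    "JKL": {"critical"},
}


@dataclass
class Event:
    source: str    # "system", "application" or "infrastructure"
    system: str    # the system the log line refers to
    message: str   # the raw log message
    tags: set = field(default_factory=set)


def enrich(event, context=BUSINESS_CONTEXT):
    """Return a copy of the event with any known business tags attached."""
    tags = set(event.tags) | context.get(event.system, set())
    return Event(event.source, event.system, event.message, tags)


raw = Event("system", "GHI", "Craig logged in to system GHI.")
enriched = enrich(raw)
print(enriched.system, sorted(enriched.tags))
```

The same log line becomes far more useful once a tool knows the login happened inside the PCI environment.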
The Importance of Complete Data Evidence
So, you now have a pool of data that you can enrich with business context, vulnerabilities, network packet flows and current threat trends. All of this increases our evidence, so why the absence?
The data isn’t complete enough. One of the biggest issues my clients have is this incomplete data set, and I sympathize with them. In a Fortune 500 company, hundreds of departments run systems you may not be able to access or see, or parts of the workforce may have high churn. Processing is not the problem: processing data is what computers are made to do, whether through real-time correlation, highly structured queries, historical searches, or trend and anomaly analysis.
You just need to get the data. It starts with the security data, something that can be driven through an organization with the right executive authority and mindset in place and a structured program or project. The next stage is the business context, which is where the intelligence feeds in through consulting with the business. This can only be done with buy-in from the relevant areas. Once your data set becomes more complete, you get a huge jump in the success of your security tool sets. While you will never have a complete set, getting close to that nirvana is key to minimizing blind spots and data black holes.
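One rough way to see how far you are from a complete set is to compare the systems you know about with the systems that actually send you logs. The sketch below assumes you have both lists available; the function name `coverage_report` and the result keys are made up for this example.

```python
def coverage_report(inventory, reporting):
    """Compare known systems against systems that are sending logs.

    inventory: names of systems the organization believes it runs
    reporting: names of systems seen in the collected security data
    """
    inventory = set(inventory)
    reporting = set(reporting)
    covered = inventory & reporting
    ratio = len(covered) / len(inventory) if inventory else 0.0
    return {
        "coverage": ratio,                         # share of known systems with data
        "silent": sorted(inventory - reporting),   # known systems with no data: blind spots
        "unlisted": sorted(reporting - inventory), # systems logging that nobody listed
    }


report = coverage_report(["GHI", "JKL", "XYZ", "UVW"], ["GHI", "JKL", "MNO"])
print(report)
```

The "silent" list shows where evidence is absent, which is not the same as nothing happening on those systems.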
What’s Ahead?
What gets me even more excited about this is what the future holds with cognitive computing. Not only can computers find trends and learn to spot anomalies, but they can mimic one of the greatest human traits: putting two and two together. If you see a system that produces Domain Name System (DNS) logs, receives hundreds of DNS requests and has firewall rules allowing Port 53 traffic from multiple sources, you can instantly guess it’s likely a DNS server. You can craft rules for computers to understand this, but it still requires human input.
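A hand-crafted rule of that kind might look like the following Python sketch. The event field names (`src`, `dst`, `dst_port`, `action`) and the thresholds are assumptions for illustration, not the format of any particular firewall, and a human still has to choose them.

```python
from collections import defaultdict


def likely_dns_servers(firewall_events, min_requests=100, min_sources=2):
    """Guess which destinations are DNS servers from allowed Port 53 traffic.

    Each event is a dict with hypothetical keys:
    "src", "dst", "dst_port" and "action" ("allow" or "block").
    """
    requests = defaultdict(int)   # destination -> number of allowed requests
    sources = defaultdict(set)    # destination -> distinct sources seen
    for event in firewall_events:
        if event["dst_port"] == 53 and event["action"] == "allow":
            requests[event["dst"]] += 1
            sources[event["dst"]].add(event["src"])
    return sorted(
        dst for dst in requests
        if requests[dst] >= min_requests and len(sources[dst]) >= min_sources
    )


# Synthetic example data: 150 allowed requests from three sources to one host.
events = [
    {"src": f"10.0.1.{i % 3}", "dst": "10.0.0.5", "dst_port": 53, "action": "allow"}
    for i in range(150)
]
events.append({"src": "10.0.1.1", "dst": "10.0.0.9", "dst_port": 53, "action": "allow"})
print(likely_dns_servers(events))
```

The rule works, but every threshold in it encodes a human judgment, which is exactly the input a learning system would aim to reduce.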
Now, consider what something such as Watson™ could do by looking at the security data and building the business context. This is an amazing proposition. Such a system learns your organization, allowing you to focus your limited resources on reacting to threats. While this situation is not “coming in Q2,” it is getting closer each day. Your systems can already learn trends and fire on above-average activity for certain systems on a per-system basis or with certain credentials.
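As a rough illustration of firing on above-average activity per system, the sketch below keeps a history of event counts for each system and flags a new count that sits well above that system's own baseline. The function name, the three-standard-deviation threshold and the sample numbers are assumptions for this example; real products use more sophisticated models.

```python
from statistics import mean, stdev


def is_above_baseline(history, current, threshold=3.0):
    """Return True if `current` is far above this system's own history.

    history: past event counts for one system (for example, logins per hour)
    threshold: how many standard deviations above the mean counts as unusual
    """
    if len(history) < 2:
        return False              # not enough data to form a baseline
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        return current > mu       # flat history: any increase stands out
    return (current - mu) / sigma > threshold


# Hypothetical per-system baselines, keyed by system name.
baselines = {
    "GHI": [10, 12, 11, 9, 10, 11],
    "JKL": [500, 480, 510, 495, 505, 490],
}
print(is_above_baseline(baselines["GHI"], 40))
print(is_above_baseline(baselines["JKL"], 40))
```

The same count of 40 is alarming for one system and unremarkable for another, which is why the baseline is kept per system.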
With more complete data, the evidence pool grows and confidence in it rises, so there is less absence. When that day comes, I will have to find a new insightful quote for my email footer. Until then, I will continue to remember that the absence of evidence is not the evidence of absence.