Organizations tend to end up in cybersecurity news because they failed to detect and/or contain a breach. Breaches are inevitable, but whether or not an organization ends up in the news depends on how quickly and effectively it can detect and respond to a cyber incident.
Beyond the fines, penalties and reputational damage associated with a breach, organizations should keep in mind that today’s adversaries represent a real, advanced and persistent threat. Once threat actors gain a foothold in your infrastructure or network, they will almost certainly try to maintain it.
To successfully protect their organizations, security teams need the full context of what is happening on their network. This means data from certain types of sources should be centrally collected and analyzed, with the goal of being able to extract and deliver actionable information.
What Is Network Flow Data?
One of the most crucial types of information to analyze is network flow data, which has unique properties that make it a solid foundation on which to build a security framework. Network flow data is extracted by a network device, such as a router, from the sequence of packets observed within an interval between two Internet Protocol (IP) hosts. The data is then forwarded to a flow collector for analysis.
A unique flow is defined by the combination of the following seven key fields:
- Source IP address
- Destination IP address
- Source port number
- Destination port number
- Layer 3 protocol type
- Type of service (ToS)
- Input logical interface (router or switch interface)
If a packet's value for any one of these fields differs from those of the flows already being tracked, a new flow record is created. The depth of the extracted information depends on both the device that generates the flow records and the protocol used to export the information, such as NetFlow or IP Flow Information Export (IPFIX). Inspection of the traffic can be performed at different layers of the Open Systems Interconnection (OSI) model, from layer 2 (the data link layer) to layer 7 (the application layer). Each layer that is inspected adds more meaningful and actionable information for a security analyst.
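To make the flow definition concrete, here is a minimal Python sketch of how packets could be grouped into flow records by the seven key fields. It is an illustration only, not how any particular router or exporter works: the dictionary keys (`src_ip`, `timestamp`, `length` and so on) and the names `FlowKey`, `FlowRecord` and `aggregate_flows` are assumptions made for this example.

```python
from collections import namedtuple
from dataclasses import dataclass

# The seven key fields that define a unique flow (field names are illustrative).
FlowKey = namedtuple(
    "FlowKey",
    ["src_ip", "dst_ip", "src_port", "dst_port", "protocol", "tos", "input_if"],
)


@dataclass
class FlowRecord:
    first_seen: float  # timestamp of the first packet in the flow
    last_seen: float   # timestamp of the most recent packet in the flow
    packets: int = 0
    octets: int = 0

    @property
    def duration(self) -> float:
        """Life span of the flow in seconds."""
        return self.last_seen - self.first_seen


def aggregate_flows(packets):
    """Group packet dictionaries into flow records keyed by the seven fields."""
    flows = {}
    for pkt in packets:
        key = FlowKey(*(pkt[name] for name in FlowKey._fields))
        record = flows.get(key)
        if record is None:
            # A value differs from every tracked flow: start a new record.
            record = flows[key] = FlowRecord(pkt["timestamp"], pkt["timestamp"])
        record.first_seen = min(record.first_seen, pkt["timestamp"])
        record.last_seen = max(record.last_seen, pkt["timestamp"])
        record.packets += 1
        record.octets += pkt["length"]
    return flows
```

Two packets that share all seven values update the same record, while a packet that differs in even one field, such as the destination port, opens a new one. Real exporters also expire flows after idle and active timeouts, which this sketch leaves out.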
One major difference between log event data and network flow data is that an event, which typically is a log entry, happens at a single point in time and can be altered. A network flow record, in contrast, describes a condition that has a life span, which can last minutes, hours or days, depending on the activities observed within a session, and cannot be altered. For example, a web GET request may pull down multiple files and images in less than a minute, but a user watching a movie on Netflix would have a session that lasts over an hour.
What Makes Network Flow Data So Valuable?
Let’s examine some of the aforementioned properties in greater detail.
Low Deployment Effort
Network flow data requires little deployment effort because networks funnel most of their traffic through a few transit points, such as the internet boundary. Enabling flow export on those few points involves a small number of changes, which leaves less room for configuration mistakes.
Everything Is Connected
From a security perspective, we can assume that most of the devices used by organizations, if not all of them, operate on and interact with a network. Those devices can either be actively controlled by individuals — workstations, mobile devices, etc. — or operated autonomously — servers, security endpoints, etc.
Furthermore, threat actors typically try to remove traces of their attacks by manipulating security and access logs, but they cannot tamper with network flow data.
Reliable Visibility
The data relevant to security investigations is typically collected from two types of sources:
- Logs from endpoints, servers and network devices, using either an agent or remote logging; or
- Network flow data from the network infrastructure.
The issue with logs is that there will always be connected devices from which an organization cannot collect data. Even if security policies mandate that only approved devices may be connected to a network, being able to ensure that unmanaged devices or services have not been inserted into the network by a malicious user is crucial. Furthermore, history has shown that malicious users actively attempt to circumvent host agents and remote logging, making the log data from those hosts unreliable. The most direct source of information about unmanaged devices is the network.
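As a rough illustration of using the network as the source of truth for unmanaged devices, the Python sketch below compares the internal addresses seen in flow records against an asset inventory. The function name `find_unmanaged_hosts`, the `src_ip` and `dst_ip` record keys and the idea of a flat inventory of IP addresses are all assumptions made for this example; a real deployment would also have to account for DHCP churn and NAT.

```python
import ipaddress


def find_unmanaged_hosts(flow_records, inventory, internal_networks):
    """Return internal addresses seen in flows that are missing from the inventory."""
    networks = [ipaddress.ip_network(net) for net in internal_networks]
    known = {ipaddress.ip_address(addr) for addr in inventory}
    unmanaged = set()
    for flow in flow_records:
        for field in ("src_ip", "dst_ip"):
            addr = ipaddress.ip_address(flow[field])
            is_internal = any(addr in net for net in networks)
            if is_internal and addr not in known:
                unmanaged.add(str(addr))
    return sorted(unmanaged)
```

Any address this returns is a host that is talking on the network but is not sending logs through a managed agent, which makes it a natural starting point for an investigation.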
Finally, network flow data is explicitly defined by the protocol, which changes very slowly. This is not the case with log data, where formats are very often poorly documented, tied to specific versions, not standardized and prone to more frequent changes.
Automatically Reduce False Positives
A firewall or access control list (ACL) permit notification does not mean that a successful communication actually took place. Network flow data, on the other hand, can be used to confirm that it did. Issuing an alert only when a successful communication actually took place can dramatically reduce false positives and, therefore, save precious security analyst time.
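The Python sketch below shows one way this correlation could look. It treats a communication as successful when a matching flow carried bytes in both directions, which is an assumption made for this example rather than a definition from a flow standard. The record keys (`bytes_out`, `bytes_in` and so on) and the function name `confirmed_alerts` are likewise hypothetical.

```python
def _conversation(record):
    """Fields used to match a firewall permit event to a flow record."""
    return (
        record["src_ip"],
        record["dst_ip"],
        record["dst_port"],
        record["protocol"],
    )


def confirmed_alerts(permit_events, flow_records):
    """Keep only the permit events backed by a flow with traffic in both directions."""
    answered = {
        _conversation(flow)
        for flow in flow_records
        # Assumption: bytes in both directions means the peer actually responded.
        if flow["bytes_out"] > 0 and flow["bytes_in"] > 0
    }
    return [event for event in permit_events if _conversation(event) in answered]
```

A permit event whose flow shows no return traffic, for example a connection attempt that was never answered, is dropped instead of being handed to an analyst. A production version would also match on a time window around the event.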
Moving Beyond Traditional Network Data
Traditional network flow technology was originally designed to give network administrators the ability to monitor their network and pinpoint congestion. More recently, security analysts discovered that network flow data was also useful for finding network intrusions. However, basic network flow data was never designed to detect the most sophisticated advanced persistent threats (APTs). It lacks the necessary in-depth visibility: for example, it does not carry the hash of a file transferred over the network, and it reports a port number rather than the application actually detected. Without this level of visibility, traditional network flow data is limited in its ability to provide actionable information about a cyber incident.
Given the increasing level of sophistication of attacks, certain communications, such as inbound traffic from the internet, should be further scrutinized and inspected with a purpose-built solution. The solution must be able to perform detailed packet dissection and analysis — at line speed and in passive mode — and deliver extensive and enriched network flow data through a standard protocol such as IPFIX, which defines how to format and transfer IP flow data from an exporter to a collector.
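For readers who have not seen IPFIX on the wire, the Python sketch below parses the fixed 16-byte message header that begins every IPFIX message, as laid out in RFC 7011: version (always 10), message length, export time, sequence number and observation domain ID. The function name `parse_ipfix_header` and the returned dictionary keys are assumptions made for this example, and the sketch deliberately stops before the template and data sets that follow the header.

```python
import struct
from datetime import datetime, timezone

# Network byte order: version (u16), length (u16), export time (u32),
# sequence number (u32), observation domain ID (u32) = 16 bytes.
IPFIX_HEADER = struct.Struct("!HHIII")
IPFIX_VERSION = 10


def parse_ipfix_header(datagram: bytes) -> dict:
    """Decode the fixed IPFIX message header at the start of a datagram."""
    if len(datagram) < IPFIX_HEADER.size:
        raise ValueError("datagram is shorter than the 16-byte IPFIX header")
    version, length, export_time, sequence, domain_id = IPFIX_HEADER.unpack_from(
        datagram
    )
    if version != IPFIX_VERSION:
        raise ValueError(f"not an IPFIX message (version {version})")
    return {
        "version": version,
        "length": length,  # total message length in octets, header included
        "export_time": datetime.fromtimestamp(export_time, tz=timezone.utc),
        "sequence_number": sequence,
        "observation_domain_id": domain_id,
    }
```

A collector would read this header first, then walk the sets that follow it up to `length` octets, using previously received templates to decode each data record.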
The resulting enriched network flow data can be used to augment the prioritization of relevant alerts. Such data can also accelerate alert-handling research and resolution.
Why Should You Analyze Network Flow Data?
Network flow data is a crystal ball into your environment because it delivers much-needed, immediate and in-depth visibility. It can also help security teams detect the most sophisticated attacks, which would otherwise be missed if an investigation relied solely on log data. By reconciling network flow data with less-reliable log data, organizations can detect attacks more capably and conduct more thorough investigations. The bottom line is that network flow data can help organizations catch some of the most advanced attacks that exist, and it should not be ignored.
Security Channels Business Development Leader, IBM
Technical Sales and Solutions Leader in Europe, IBM Security