Become a JSON Formatter — and Kick Your Security Integrations Into Action

Any woodworker will tell you that finding the right piece of wood, chopping it into perfectly sized chunks and then crafting those chunks into useful items is a complicated process. If a woodworker is making a set of chairs, for example, custom work will be needed for every piece because every tree is different. Incredible skill is needed to ensure consistency and quality craftsmanship.

Information security monitoring works the same way, more or less. Becoming a JavaScript Object Notation (JSON) formatter — and packing up your data in a new way — can reduce or eliminate the need for custom work. Being a JSON formatter can also save your security team precious time.

Logging Inefficiencies

Getting a non-standard log source into your security monitoring tool typically follows a set process. First, you find your non-standard log source and figure out how it logs and how to send those logs across your network. Next, you get those logs flowing to your security monitoring tool. (Chances are they’re going via Syslog, so we’ll stick with this assumption for simplicity.) Last, you figure out how to parse the arbitrarily worded and assembled log fields using regular expressions (regex). (Depending on log complexity, this can take a while.)

The massive headache here is the inefficiency introduced by the discrepancies between different custom log sources. Each log’s format is essentially arbitrary and non-standardized, created at the mercurial whims of individual developers. As a result, every custom regex parser you write is a different and potentially substantial piece of work that requires dedicated effort from security integration experts.
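To make the problem concrete, here is a minimal Python sketch. Both log lines and both patterns are invented for illustration and do not come from any real product; they simply show two sources reporting the same kind of event in incompatible ways.

```python
import re

# Two hypothetical log lines reporting the same kind of event
# (a sent email), as two different developers might have written them.
LOG_A = "2018-10-10 09:15:02 mailgw: message sent from=alice@example.com to=bob@example.com size=2048"
LOG_B = "Oct 10 09:15:02 relay01 SEND [bob@example.com] <- [alice@example.com] (2048 bytes)"

# Each source needs its own hand-written pattern.
PATTERN_A = re.compile(r"from=(?P<sender>\S+) to=(?P<recipient>\S+) size=(?P<size>\d+)")
PATTERN_B = re.compile(
    r"SEND \[(?P<recipient>[^\]]+)\] <- \[(?P<sender>[^\]]+)\] \((?P<size>\d+) bytes\)"
)


def parse(line, pattern):
    """Return the named groups as a dict, or None if the pattern does not match."""
    match = pattern.search(line)
    return match.groupdict() if match else None


parsed_a = parse(LOG_A, PATTERN_A)
parsed_b = parse(LOG_B, PATTERN_B)
# A pattern written for one source extracts nothing from the other.
mismatch = parse(LOG_B, PATTERN_A)
```

Both patterns recover the same three values, but neither can be reused for the other source, so every new log format means another round of pattern writing and testing.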

JSON Fundamentals

Fortunately, there’s another way. JSON is self-describing, easy to understand and language independent.

  • JSON: This is a lightweight data interchange format that uses human-readable text to transmit data objects consisting of attribute-value pairs. Attribute-value pairs look something like this: {"Date": "10/10/2018", "DayofTheWeek": "Monday", "RequiredFood": "Lasagna"}. In other words? JSON is a simple, consistent syntax for exchanging and storing data.
  • JavaScript Object: This is an abstract data type composed of a collection of pairs so that each possible key appears no more than once in the collection.
  • Extensible Markup Language (XML): This is a markup language, similar in appearance to HTML, that defines a set of rules for encoding documents in a format that is both human- and machine-readable. It is a common point of comparison for JSON.
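As a small illustration of these definitions, the following Python sketch serializes the example attribute-value pairs to JSON text and reads them back. It uses only the standard `json` module; the variable names are arbitrary.

```python
import json

# The attribute-value pairs from the example above, as a Python dict.
record = {"Date": "10/10/2018", "DayofTheWeek": "Monday", "RequiredFood": "Lasagna"}

# Serialize to JSON text: keys and string values are wrapped in double quotes.
text = json.dumps(record)

# Parse the text back into a native object and read a value by its key.
parsed = json.loads(text)
required_food = parsed["RequiredFood"]

# Each key appears at most once in the resulting object: when the text
# repeats a key, Python's parser keeps only the last value it saw.
deduplicated = json.loads('{"Date": "first", "Date": "second"}')
```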

Since the JSON format is text-only, it can easily be sent to and from a server and used as a data format by any programming language. JavaScript can convert JSON into native JavaScript objects, which makes it easy to use JSON data like any other JavaScript object. JSON is also a low-overhead alternative to XML: it does not use end tags (as in HTML, these are the </element> bits that close off elements), which makes it shorter and quicker to read and write.
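The overhead of end tags is easy to see by encoding one record both ways. The sketch below uses Python's standard `json` and `xml.etree.ElementTree` modules; the root element name `record` and the one-element-per-field XML layout are assumptions chosen for this comparison, and other XML designs would give different sizes.

```python
import json
import xml.etree.ElementTree as ET

record = {"Date": "10/10/2018", "DayofTheWeek": "Monday", "RequiredFood": "Lasagna"}

json_text = json.dumps(record)

# Build the equivalent XML: one child element per field. The root
# element name "record" is an arbitrary choice for this illustration.
root = ET.Element("record")
for key, value in record.items():
    ET.SubElement(root, key).text = value
xml_text = ET.tostring(root, encoding="unicode")

# Every XML element repeats its name in an end tag; JSON does not.
json_length = len(json_text)
xml_length = len(xml_text)
```

For this record the JSON text comes out shorter than the XML text, because each field name is written once rather than twice.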

The case for JSON seems pretty compelling. However, its greatest asset (i.e., its flexible nature) also makes it situationally less suitable than XML for transferring data between separate systems or storing data that will be read by third parties. This is because JSON itself lacks some key features, in particular built-in schema support, which can be used to verify each item in a document. This can cause issues when moving data between systems.
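Because plain JSON carries no schema of its own, the receiving side has to decide what a valid record looks like. Below is a minimal Python sketch of such a check. The field names and expected types in `EXPECTED_FIELDS` are hypothetical, and a real deployment might prefer a dedicated schema language over hand-written checks.

```python
import json

# Hypothetical contract for an email log event: field name -> expected type.
EXPECTED_FIELDS = {"SenderAddress": str, "RecipientAddress": str, "SizeBytes": int}


def validation_errors(text):
    """Return a list of problems found in one JSON log record (empty if none)."""
    try:
        record = json.loads(text)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc.msg}"]
    if not isinstance(record, dict):
        return ["top-level value is not an object"]

    errors = []
    for name, expected_type in EXPECTED_FIELDS.items():
        if name not in record:
            errors.append(f"missing field: {name}")
        elif not isinstance(record[name], expected_type):
            errors.append(f"wrong type for field: {name}")
    return errors


good = '{"SenderAddress": "alice@example.com", "RecipientAddress": "bob@example.com", "SizeBytes": 2048}'
# Syntactically valid JSON, but the size is a string and a field is absent.
bad = '{"SenderAddress": "alice@example.com", "SizeBytes": "2048"}'

good_errors = validation_errors(good)
bad_errors = validation_errors(bad)
```

The second record parses without complaint, which is exactly the risk: nothing in the JSON syntax flags the missing field or the wrong type until a check like this one is applied.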

The Need for Simpler Logging

The growing number of devices and application types being implemented in enterprise networks is making it increasingly difficult for log parsers to keep up with the different log formats and data values that can arise, especially from niche or proprietary custom software. Because of this, manual configuration is often needed to enable the parsing and normalizing (consistent value extraction) of logs with uncommon attributes.

When custom configuration is needed, a log’s syntax can have a significant effect on the effort required to work with it. This effort can include:

  • Attempting to understand the meaning of unclear values;
  • Writing countless, complex regex strings;
  • Finding a consistent anchor point — so regex remains valid with different log message types;
  • Managing the extra burden on system resources if regex is not perfectly written.

Putting JSON to Work

JSON dramatically reduces the effort needed to design and parse logs, in terms of both configuration and computational resources. The key-value pair (KVP) format of JSON makes it easier for humans and machines to understand what each value in the logs actually means. The JSON field names can also act as a stable anchor for writing extraction expressions, which reduces the likelihood of expensive or unreliable regex strings.
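On the producing side, being a JSON formatter can be as simple as swapping the formatter on an existing logger. The following is a minimal Python sketch using the standard `logging` module. The class name `JsonFormatter`, the logger name, and every field name other than `SenderAddress` are assumptions made for illustration; an in-memory stream stands in for whatever transport (such as Syslog) carries the logs in practice.

```python
import io
import json
import logging


class JsonFormatter(logging.Formatter):
    """Render each log record as a single-line JSON object."""

    def format(self, record):
        payload = {
            "Time": self.formatTime(record),
            "Level": record.levelname,
            "Message": record.getMessage(),
        }
        # Structured values passed via `extra=` are copied in under their own names.
        for name in ("SenderAddress", "RecipientAddress"):
            if hasattr(record, name):
                payload[name] = getattr(record, name)
        return json.dumps(payload)


# Send the formatted output to an in-memory stream for this demonstration.
stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(JsonFormatter())

logger = logging.getLogger("mailgw-example")
logger.setLevel(logging.INFO)
logger.addHandler(handler)
logger.propagate = False

logger.info(
    "message sent",
    extra={"SenderAddress": "alice@example.com", "RecipientAddress": "bob@example.com"},
)

# The consumer needs no regex: it parses the line and reads fields by name.
line = stream.getvalue().strip()
event = json.loads(line)
sender = event["SenderAddress"]
```

The field names stay the same even if the developer later reorders fields or adds new ones, which is what makes them a stable anchor for extraction.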

IBM QRadar is a security information and event management (SIEM) system that has a built-in collection of log parsers. In IBM QRadar, for example, it’s simple to extract log event properties created with a JSON formatter using JSON expressions.

To capture the value, one simply types: /"<field-name>". The IBM QRadar GUI will highlight what your expression is capturing in the workspace on the right. For example, to capture the sender address, the JSON expression would be: /"SenderAddress".
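For readers without a QRadar workspace to hand, the idea behind such an expression can be imitated in a few lines of Python. This is only a sketch of the concept and not how QRadar implements it. The function name `extract`, the sample event, and the support for chaining several segments to reach nested fields are assumptions made for illustration.

```python
import json
import re

# One path segment: a slash followed by a double-quoted field name.
SEGMENT = re.compile(r'/"([^"]+)"')


def extract(expression, text):
    """Follow a /"field" style expression through a JSON document.

    Returns the value found, or None if any field along the path is missing.
    """
    names = SEGMENT.findall(expression)
    # Reject anything that is not purely a sequence of /"name" segments.
    if not names or "".join(f'/"{n}"' for n in names) != expression:
        raise ValueError(f"malformed expression: {expression!r}")

    value = json.loads(text)
    for name in names:
        if not isinstance(value, dict) or name not in value:
            return None
        value = value[name]
    return value


# Hypothetical event; only the SenderAddress field name comes from the article.
event = '{"SenderAddress": "alice@example.com", "Mail": {"SizeBytes": 2048}}'

sender = extract('/"SenderAddress"', event)
size = extract('/"Mail"/"SizeBytes"', event)
missing = extract('/"Subject"', event)
```

The expression names the field it wants and nothing else, so it keeps working when unrelated fields are added to the event.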

We have only scratched the surface of the potential use cases for JSON and the power of becoming a JSON formatter. There are multiple ways to tackle logging beyond the standard paradigm of arbitrary formats and one-size-fits-all regex extraction. IBM QRadar makes it easy to take this way of working with custom log sources to the next level. Because of this, it’s worth considering this approach the next time you plan an integration.

Alexander M. Paterson

Security Intelligence Consultant, IBM

Alexander M. Paterson is a Security Intelligence Consultant for IBM UK. His deep technical and industry specific...