4 ways to derive actionable value from big data and analytics
Financial firms are not alone in their quest to apply scientific thinking to big data and to assess its value, especially in client-centric applications. They are finding assistance in many forms. For instance, IBM Watson, the computer, is in a pilot program with a leading global bank to help advance customer interactions and to improve and simplify the banking experience. Watson’s ability to consume vast amounts of information, recognize patterns and make informed hypotheses lends itself to helping individuals make informed decisions.
In another application of big data, an online payments leader wrote and implemented programs to root out money laundering by identifying patterns of successive payments, all of which fell just under the reporting limits. The effort came in the wake of a near collapse of the business caused by fraud. Other firms have since followed suit and adopted the technology.
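The detection pattern described above can be sketched in a few lines. This is a minimal illustration, not the payment provider's actual system: the threshold, margin and run length are hypothetical values chosen for the example.

```python
# Illustrative structuring detector: flag a run of successive payments
# that all land just under a reporting threshold.
REPORT_LIMIT = 10_000   # hypothetical reporting threshold
MARGIN = 0.10           # "close to the limit" = within 10% below it
MIN_RUN = 3             # successive near-limit payments needed to flag

def flag_structuring(payments):
    """payments: amounts in chronological order.
    Returns True if MIN_RUN or more successive amounts sit just
    below REPORT_LIMIT, the classic structuring pattern."""
    run = 0
    for amount in payments:
        if REPORT_LIMIT * (1 - MARGIN) <= amount < REPORT_LIMIT:
            run += 1
            if run >= MIN_RUN:
                return True
        else:
            run = 0  # a normal payment breaks the run
    return False
```

In practice the same idea runs over billions of transactions, which is where distributed big data platforms come in; the logic, however, is this simple.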
Leading financial firms are using similar technologies for a variety of purposes, from structuring equity derivatives to reducing loan loss.
Use cases of big data can be categorized to help put it in perspective. Below are just a few.
Fraud and Money Laundering
A recent study by Carnegie Mellon validates that the most impactful fraud techniques use low-and-slow tactics that often go unnoticed by traditional discovery and situational-awareness processes and tools. Clearly, there is growing demand for alternative technology.
In today’s world of continuously available web sites and mobile applications, fraud is eating up financial firms’ resources and revenue. The Association of Certified Fraud Examiners (ACFE) has found that fraud costs an estimated five to six percent of a company’s revenue each year. As the sophistication and use of big data and analytics advance, the ability to convert a cornucopia of raw data into early and actionable responses makes this an appealing weapon against fraud.
Crime costs an estimated five to six percent of a company’s revenue each year
For example, big data analytics can analyze a combination of application data, user data, transaction details, and historical information about users, accounts and their relationships to one another. In the case of insider fraud, expanding data sources to cover intercompany communications and external social media can map people-to-people communication patterns. It also enables analysis of the type and sentiment of those communications, allowing the technology to surface behavioral trends that may indicate an employee is upset with the company or with his or her management.
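As a rough sketch of that idea, the snippet below builds communication links and a crude negativity score from message records. The tiny word list stands in for a real sentiment model and is purely illustrative.

```python
from collections import Counter

# Toy sentiment lexicon -- a stand-in for a real NLP sentiment model.
NEGATIVE = {"unfair", "angry", "quit", "ignored"}

def communication_profile(messages):
    """messages: iterable of (sender, recipient, text).
    Returns (link counts between people, crude per-sender negativity),
    the raw ingredients for spotting communication-pattern anomalies."""
    links = Counter()
    negativity = Counter()
    for sender, recipient, text in messages:
        links[(sender, recipient)] += 1
        negativity[sender] += sum(w in NEGATIVE for w in text.lower().split())
    return links, negativity
```

A production system would replace the lexicon with a trained sentiment model and run the aggregation over a distributed store, but the linkage-plus-sentiment structure is the same.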
Profiling Advanced Targeted Attacks
A second top of mind use case is the ability to defend against the rising occurrence of targeted attacks. The average time to respond to such attacks today is measured in weeks, well past the window where it is useful to apply protections. The subtle qualities of this kind of attack call for new means of intelligence gathering.
There is a defined sequence of events that makes up the anatomy of this threat. The targeted attack is usually initiated with spear-phishing attempts and ends with the exfiltration of data. Most technologies today lack the sophistication to sense the subtleties, reconstruct the anatomy and visualize the attack profile.
New solutions can help in the reconstruction through the dissection of the attack. For instance, a machine or host can be compromised utilizing a series of DNS requests. Once a host is identified, it can be commandeered to serve as an active and malicious botnet in the targeted attack. Big data analytics can correlate large volumes of DNS network traffic, identify anomalous DNS behavior and connect it to suspicious domains. Big data analytic solutions can profile the attack much earlier, and trigger the response process in time.
Continuous Risk Scoring
A third use case category for big data is risk management and the notion of continuous risk scoring. This practice is already in place in the Federal Government and is well documented in NIST standards, and the financial industry looks to be pursuing a similar path. The platform has four basic levels: sensors, database, analysis/scoring and reporting. Continuous risk scoring requires the ability to effectively take in large data volumes, which are then manipulated and massaged to derive an accurate score. New technologies hold promise of improving on traditional means, both in the veracity of the risk score and in the overall calculation time.
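The scoring level of such a platform can be as simple as a weighted roll-up of normalized sensor readings. The weights and sensor names below are hypothetical, not a NIST-defined scheme; the one design point worth noting is that a missing sensor counts as maximum risk, so a blind spot can never lower the score.

```python
# Hypothetical sensor weights -- illustrative only.
WEIGHTS = {"patch_level": 0.4, "config_drift": 0.35, "av_coverage": 0.25}

def risk_score(readings):
    """readings: sensor name -> normalized risk in [0, 1].
    Returns a weighted score in [0, 100]. Sensors that report
    nothing default to 1.0 (worst case) so that gaps in coverage
    raise, never mask, the score."""
    total = 0.0
    for sensor, weight in WEIGHTS.items():
        total += weight * readings.get(sensor, 1.0)
    return round(total * 100, 1)
```

Running this continuously as sensor feeds update, rather than on a quarterly assessment cycle, is what turns periodic risk assessment into continuous risk scoring.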
From a very broad perspective, most business decisions have the potential to benefit from data crunching and analytics. Changes to the security program, from strategies and roadmaps to business operations, controls, and measurements, can be simulated utilizing big data analytics. The impact of these changes to the business can be scored for risk to make timely decisions.
These are just a few examples of useful applications for financial firms. What is clear is the “potential” of big data and all that can be envisioned. However, there are three base elements that challenge the overall viability and value of its realization.
Volume, Velocity and Variety of Data
An organization must embrace, not run from, the power of today’s data, with its soaring volume, velocity and variety. As big data warehouses are architected and constructed, they can be designed to break down business and technology silos, merging data across the traditional boundaries that have limited the perspective of what is happening in an environment.
The problem of sensing a “dried pea under a mountain of mattresses” threatens to grow substantially worse as the world becomes hyper-connected. Two technologies that have been around for some time now offer much improved performance and computing capabilities, making it possible to tame some of the variety of data (structured and unstructured) and its velocity (streams of data flowing over constantly running queries). Big data processing technologies such as Hadoop and stream computing provide deep contextual meaning to data in a much improved and actionable time frame.
Hadoop is an open source data framework that allows applications to work with thousands of independent computers and large amounts of data. Interestingly, Hadoop grew out of work by engineers at Yahoo who were tasked with supporting Internet search. Hadoop provides flexibility because it is schema-less and can store any type of data, structured or not. Data from multiple sources can be glued together in simple ways, allowing for greater interrogation.
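Hadoop's core programming model, MapReduce, is easy to see in miniature. The single-process sketch below imitates the three phases (map, shuffle, reduce) on a classic word count; a real Hadoop job distributes exactly these phases across thousands of nodes.

```python
from collections import defaultdict

# A single-process sketch of the MapReduce model: mappers emit
# (key, value) pairs, the shuffle groups pairs by key, and
# reducers aggregate each group.

def map_phase(record):
    for word in record.lower().split():
        yield (word, 1)

def shuffle(pairs):
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    return {key: sum(values) for key, values in groups.items()}

def word_count(records):
    pairs = (pair for rec in records for pair in map_phase(rec))
    return reduce_phase(shuffle(pairs))
```

The appeal for big data work is that each phase parallelizes naturally: mappers never talk to each other, and each reducer handles its keys independently.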
In order to make use of this eclectic assortment of unstructured text data, technology must be able to extract and contextualize the data. It must be able to selectively pull out “nouns and verbs, subjects and predicates” and determine meaningful (opinions and sentiment) conclusions.
For example: “Joe Sonders came into the bank three times this week. He deposited cash just below the regulated limit each time.”
These conclusions must then compound further into scenarios related to people, locations and businesses.
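As a rough sketch of the extraction step, the snippet below pulls subject-verb-object facts out of free text. The regular expression and the short verb list are toy assumptions; a real system would use a full NLP pipeline, not pattern matching.

```python
import re

# Crude pattern-based fact extractor -- illustrative only.
# Matches: capitalized subject, a known verb, then an object phrase.
FACT = re.compile(
    r"(?P<subject>[A-Z][a-z]+(?: [A-Z][a-z]+)?) "
    r"(?P<verb>came into|deposited|withdrew) "
    r"(?P<object>[^.]+)"
)

def extract_facts(text):
    """Return (subject, verb, object) triples found in `text`."""
    return [(m.group("subject"), m.group("verb"), m.group("object").strip())
            for m in FACT.finditer(text)]
```

Triples like these are the raw material that later steps compound into scenarios about people, locations and businesses.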
“Great,” you say, but there is more. If a scenario is not corroborated by other scenarios with similar facts and figures, it runs the risk of leading to a false conclusion. As a specific scenario about a person evolves, it is important to corroborate it with like events to avoid reaching incorrect judgments about events, judgments that could range from creditworthiness to criminal activity. By comparing scenarios, the identification of similarities improves the veracity of the analytics. See Figure 1 below, where an individual deposit may not appear unusual but, in the context of multiple deposits, may well be an exception.
The next advancement is to merge relevant text data from Hadoop with video scene analysis. Running traditional statistical analyses, such as decision trees and linear regressions, on text, video and structured data is powerful, especially when fused with scenario corroboration.
Further, streaming analytic technologies provide the ability to perform trend analyses on structured and unstructured data in motion. Capturing data in motion greatly reduces the time to respond, because the analysis sees the data as it enters the environment.
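The essence of a constantly running query is that each new observation is evaluated on arrival. The minimal sketch below keeps a sliding window over a stream and fires the moment the window average crosses a threshold; the window size and threshold are illustrative.

```python
from collections import deque

class SlidingWindowMonitor:
    """Streaming sketch: hold the last `size` observations of a metric
    and flag the instant the window average exceeds `threshold`, so the
    alert fires while the data is still in motion."""
    def __init__(self, size, threshold):
        self.window = deque(maxlen=size)  # oldest value drops off automatically
        self.threshold = threshold

    def observe(self, value):
        self.window.append(value)
        avg = sum(self.window) / len(self.window)
        return avg > self.threshold
```

Contrast this with batch analytics: here the trend is detected on the observation that causes it, not hours later when a scheduled job scans the warehouse.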
Getting value from big data and analytics
In summary, there are four capabilities required to derive actionable value from big data and analytics:
- Effectively extract data at rest and in motion
- Derive meaning and knowledge from the acquired data store
- Construct opinions and patterns based on multiple scenarios resulting in situational awareness
- Perform predictive modeling using scientific method reasoning
These use cases and technical capabilities can now be evaluated and applied, but firms must ensure that big data is not just bigger data.
If you are an “influencer” in your financial firm and you have the authority to propel your big data analytics program, here is where you need to start:
- Understand the data you have
- Ensure the data is accurate
- Evaluate your capabilities and technologies to perform the tasks described above
- Demonstrate value by using conclusions to drive actions
- Develop a dedicated data quality team
While the value proposition is clear, FSS organizations must have the “smarts” to make sure they are strategically planning the desired outcomes and creating the organizational structure to support big data. This means garnering the skills, constructing the right technology platforms, following through on the actionable analytics, and then measuring and promoting the business impact.