Nearly 1.8 million years ago, man’s cognitive development reached a pivotal milestone: he could coordinate and shape complex information. For the first time, man demonstrated spatial reasoning and created tools of his own design, such as axes and cleavers. Monkeys couldn’t do this; they could only use found objects as tools, such as sticks to ferret out ants. This leap decisively separated man from the monkey.

The Data Scientist

We are again entering a new era: the dawn of Big Data. Man has invented new tools and technologies that can reshape information in ways that were previously inconceivable. Tools like Hadoop deliver their greatest value in the hands of those who also have the capacity to transform data. Today’s evolved data scientists bring knowledge of statistical and mathematical algorithms, linear algebra, data structure design, system optimization, and architecture. These data scientists are able to apply their skills and technology in the real world of finance to stop fraudulent consumer transactions in real time. They are able to predict with a high degree of certainty which employees are likely to commit crimes against their employers. They are able to reconstruct Advanced Threat attacks that are so complex that most go undetected by traditional means. They are able to look across the ubiquitous banking channels and see a digital diversion at the front door and a data thief at the back door. The possibilities are limited only by the imagination.

At a time when there is already a shortfall of skilled security professionals, organizations have to prioritize their needs. As Big Data transforms security, enabling defenders to contend with a highly advanced threat environment, the role of the data scientist must move to the top of the list.

With the sector doubling the amount of data it analyzes every year [1], giving this role a holistic position in the organization comes none too soon.

The general definition of a data scientist is someone who collates disparate data, discovers commonalities, and presents them to the business stakeholders who have an interest in them. These analytical thinkers have long been part of the business, mostly in business intelligence areas such as investment trading, credit risk assessment, and portfolio management.

One of the questions organizations must answer is where to find these rare skills, and whether to look within the organization or to outsource the role. In-house resources have a striking advantage in their knowledge of the existing business intelligence. The business can leverage this experience, with a significant payoff, in its mission to herd, merge, and massage structured and unstructured data from different silos and venues.

Data Quality

There is evidence that many of today’s organizations do not have quality data, which can lead to inaccurate conclusions. Businesses must ensure they are collecting data from all viable sources, including cloud, web, and mobile, or the analysis risks being skewed. Most organizations must also provision a structured team of data quality analysts with a broad view of the business. For the data scientist to succeed, there must be a data quality program whose mission is to ensure the data is understood and accurate.
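As a rough illustration of the kind of check a data quality program might automate, here is a minimal Python sketch. It assumes event records arrive as dictionaries; the field names (source, timestamp, account_id), the list of expected sources, and the quality_report function are all hypothetical, chosen only to mirror the cloud, web, and mobile examples above. It reports which expected sources are missing entirely and what share of records have empty required fields.

```python
from collections import Counter

# Hypothetical channel names, taken from the article's examples.
EXPECTED_SOURCES = {"cloud", "web", "mobile"}
# Hypothetical required fields for an event record (an assumption).
REQUIRED_FIELDS = ("source", "timestamp", "account_id")


def quality_report(records):
    """Summarize source coverage and field completeness for a list of dicts."""
    # Count records per source, ignoring records with no source value.
    counts = Counter(r.get("source") for r in records if r.get("source"))
    # Expected sources that contributed no records at all.
    missing_sources = sorted(EXPECTED_SOURCES - set(counts))
    # Records where at least one required field is absent or empty.
    incomplete = sum(
        1 for r in records
        if any(r.get(f) in (None, "") for f in REQUIRED_FIELDS)
    )
    return {
        "records": len(records),
        "per_source": dict(counts),
        "missing_sources": missing_sources,
        "incomplete_ratio": incomplete / len(records) if records else 0.0,
    }


# Made-up sample data for illustration only.
sample = [
    {"source": "web", "timestamp": "2012-03-01T10:00:00", "account_id": "a1"},
    {"source": "web", "timestamp": "2012-03-01T10:05:00", "account_id": None},
    {"source": "cloud", "timestamp": "2012-03-01T10:07:00", "account_id": "a2"},
    {"source": "cloud", "timestamp": "", "account_id": "a3"},
]
report = quality_report(sample)
print(report)
```

In this made-up sample no mobile records appear at all, which is exactly the kind of gap that could skew an analysis without anyone noticing. A real program would go well beyond this, but even a simple coverage and completeness summary gives the data quality team something concrete to review.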

Big Data Technology

To further ensure the success of the data scientist, organizations must have supporting tools and technologies that embrace the power of data with its soaring volume, velocity, and variety. Two main technologies have greatly improved performance and computing capabilities: Hadoop and Streams. Working together, they allow the union of disparate data and the interrogation of data both at rest and in motion.

Organizations responding to CEO imperatives to amplify their analytics capabilities can best position themselves for success through the skills of security data scientists, the data quality team, and the right technology platforms.


[1] Gartner, "Information Security Is Becoming a Big Data Analytics Problem," Neil MacDonald, March 2012.


