One of the fundamental requirements for big data users is big data control. Failure to properly store, audit and maintain the chain of custody of data undermines our individual and collective privacy. This failure may also put organizations at odds with federal law and policy.

Without data control, there is no data compliance. Fortunately, some big data analytics models are inherently far more respectful of our privacy than others.

Big Data, Big Privacy Challenges

The fundamental challenge of big data is that predictive analytics tools are most effective when they capture and integrate the widest range of data types, such as voice, video, geolocation, biometric, and structured and unstructured text. The fundamental challenge for big privacy is that this co-mingling and integration of data increases the likelihood that individuals’ personally identifiable information (PII) will be exposed and shared with unauthorized parties.

Threats to privacy increase exponentially when governments and commercial users lose chain-of-custody control of their data or become reliant on closed, proprietary systems that hold data hostage in vendor networks. That’s why it’s important to choose big data analytics architectural models that are open, do not require customers to surrender their data to vendors, allow governments and commercial clients to decide who can see aggregated data and predictive findings, and calibrate the level of anonymization of PII to their needs.
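To make chain-of-custody control concrete, here is a minimal sketch, assuming a hypothetical append-only audit trail in which every custody event is hash-linked to its predecessor. All names are illustrative; the point is that any alteration of the recorded history becomes detectable on verification.

```python
import hashlib
import json
from datetime import datetime, timezone

def record_event(chain, actor, action, dataset_id):
    """Append a custody event, hash-linked to the previous entry."""
    prev_hash = chain[-1]["hash"] if chain else "0" * 64
    event = {
        "actor": actor,
        "action": action,
        "dataset_id": dataset_id,
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "prev_hash": prev_hash,
    }
    event["hash"] = hashlib.sha256(
        json.dumps(event, sort_keys=True).encode()
    ).hexdigest()
    chain.append(event)

def verify_chain(chain):
    """Recompute every hash; any tampered entry breaks the chain."""
    prev_hash = "0" * 64
    for event in chain:
        if event["prev_hash"] != prev_hash:
            return False
        body = {k: v for k, v in event.items() if k != "hash"}
        expected = hashlib.sha256(
            json.dumps(body, sort_keys=True).encode()
        ).hexdigest()
        if event["hash"] != expected:
            return False
        prev_hash = event["hash"]
    return True

chain = []
record_event(chain, "analyst-7", "read", "sensor-feed-2023")
record_event(chain, "etl-job-12", "aggregate", "sensor-feed-2023")
print(verify_chain(chain))  # True until any entry is altered
```

An auditor who can run a verification like this over the full trail can assess custody controls independently, which is exactly what closed vendor systems prevent.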

Spotting Flimsy Frameworks

The models most respectful of data control requirements do not require governments and commercial-sector companies to turn over control of their data to vendors. Under weaker data control models, third-party data scientists use proprietary algorithms to conduct their own analyses of the data before returning it to the original owner’s positive control.

Some closed, proprietary data analytics models also charge by volume of data analyzed. In a world where the amount of data available to aggregate, correlate and predict is increasing exponentially, charging by volume is great for vendor profits, but not so good for clients locked into this expense model.

Four Pillars of Effective Data Control

Good data control frameworks begin with several core precepts. Precepts that strive for maximum compliance with the intent and spirit of privacy and civil rights protections are more sustainable in the long term.

Effective frameworks are built on the following non-negotiables:

1. Open Architecture

No one vendor has all the answers, and the very best capabilities reside across the entire big data analytics enterprise. Innovation is too fluid and too fast to lock into one company’s closed intellectual property. Opening base architectures to new ideas, capabilities and innovation is vital to building the vibrant tools that can thrive within a strong data control framework.
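As a minimal sketch of what an open architecture can look like in practice, the hypothetical interface below (all names illustrative) defines a contract that any vendor’s capability can implement, so modules are swappable and the framework, not any one supplier, does the orchestrating:

```python
from abc import ABC, abstractmethod

class AnalyticsModule(ABC):
    """Hypothetical contract any vendor's capability must satisfy."""

    @abstractmethod
    def name(self) -> str: ...

    @abstractmethod
    def analyze(self, records: list[dict]) -> dict:
        """Run this module over customer-held records; return findings."""

class GeoClusterModule(AnalyticsModule):
    """Illustrative third-party module: counts records per region."""

    def name(self) -> str:
        return "geo-cluster"

    def analyze(self, records: list[dict]) -> dict:
        counts: dict[str, int] = {}
        for r in records:
            region = r.get("region", "unknown")
            counts[region] = counts.get(region, 0) + 1
        return {"regions": counts}

def run_pipeline(modules: list[AnalyticsModule], records: list[dict]) -> dict:
    """The customer's framework orchestrates; modules plug in and out,
    and the data never leaves the customer's environment."""
    return {m.name(): m.analyze(records) for m in modules}

findings = run_pipeline([GeoClusterModule()], [{"region": "east"}, {"region": "east"}])
print(findings)  # {'geo-cluster': {'regions': {'east': 2}}}
```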

2. Total Ownership of Big Data

No model that requires turning over appropriately collected data to a third-party vendor can be strong on data control. Even if the vendor’s model is sound on data control, it is impossible for independent auditors to assess those controls if they cannot fully see the data chain-of-custody process and evaluate the soundness of the vendor’s secret algorithms.

3. Customizable Anonymization and Minimization

Different users have different requirements for the protection of PII. Those responsible for detecting insider threats and running correlations for national security purposes have a special responsibility to protect data because they are required to access the most private data about persons of concern. People like me who hold security clearances waive certain privacy protections, so our anonymization and minimization requirements differ from the expectations of the traveling public.

Data control systems must be customizable. One size of closed, proprietary frameworks certainly does not fit all.
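Here is a minimal sketch of customizable anonymization, assuming hypothetical policy levels and field names: the data owner selects full redaction, keyed pseudonymization or cleartext per deployment, rather than accepting one vendor-fixed setting.

```python
import hashlib
import hmac

# Hypothetical policy levels; the data owner, not the vendor, picks one.
POLICIES = {
    "redact":    lambda v, key: "[REDACTED]",
    "pseudonym": lambda v, key: hmac.new(key, v.encode(), hashlib.sha256).hexdigest()[:12],
    "cleartext": lambda v, key: v,  # e.g., for cleared insider-threat analysts
}

PII_FIELDS = {"name", "ssn", "phone"}  # illustrative field list

def apply_policy(record: dict, level: str, key: bytes) -> dict:
    """Return a copy of the record with PII fields transformed per policy."""
    transform = POLICIES[level]
    return {
        field: transform(value, key) if field in PII_FIELDS else value
        for field, value in record.items()
    }

record = {"name": "Jane Doe", "ssn": "123-45-6789", "region": "east"}
print(apply_policy(record, "pseudonym", key=b"owner-held-secret"))
# PII fields become stable pseudonyms; 'region' stays usable for analytics.
```

Keyed pseudonymization is shown because it keeps records correlatable across datasets without exposing the underlying identity, and the key stays in the owner’s hands.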

4. Sharing Determinations Made by the Data Owner

Strong data control models let the owner determine who can access the data and with whom it should be shared, both in its raw and correlated forms. Models that mandate sharing all data with third-party providers are, by definition, weak data control frameworks.
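As an illustration, here is a hedged sketch of an owner-authored sharing policy (party names hypothetical) that distinguishes raw data from correlated findings:

```python
from dataclasses import dataclass, field

@dataclass
class SharingPolicy:
    """Owner-authored policy: who may see raw data vs. derived findings."""
    raw_access: set[str] = field(default_factory=set)
    findings_access: set[str] = field(default_factory=set)

    def may_access(self, party: str, form: str) -> bool:
        allowed = self.raw_access if form == "raw" else self.findings_access
        return party in allowed

# Illustrative policy: the vendor sees nothing raw; a partner agency
# sees only correlated findings; the owner's own analysts see both.
policy = SharingPolicy(
    raw_access={"owner-analysts"},
    findings_access={"owner-analysts", "partner-agency"},
)

print(policy.may_access("vendor", "raw"))               # False
print(policy.may_access("partner-agency", "findings"))  # True
```

The essential property is that the policy object belongs to the data owner; a framework that hard-wires vendor access into the pipeline fails this test by construction.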

Fantasy World or Bright Security Future?

In a strong data control world, vendors provide exquisite data analytics tools that are auditable, customizable, owned in totality by the customer and agile enough to incorporate innovation from across the technology spectrum. Weak data control models that drive customers to transfer control of their data to proprietary, third-party vendors will struggle, since data owners must always have positive control of their sensitive information.

The strong data control world is not a fantasy. It does, in fact, exist. Adherence to this model is a win-win for government agencies and consumers seeking to leverage strong privacy protections and premier data analytics without ceding control of their data.
