One of the fundamental requirements for big data users is big data control. Failure to properly store, audit and maintain data chain of custody undermines our individual and collective privacy. This failure also may be at odds with federal law and policy.

Without data control, there is no data compliance. Fortunately, there is a host of big data analytics models that are inherently far more respectful of our privacy than others.

Big Data, Big Privacy Challenges

The fundamental challenge of big data and big privacy is that predicative analytics tools are most effective when they capture and integrate maximum types of data, such as voice, video, geolocation, biometric, structured and unstructured text. Of course, the fundamental challenge for big privacy is that co-mingling and integration of data increases the likelihood that individuals’ personally identifiable information (PII) will be exposed and shared with unauthorized parties.

Threats to privacy increase exponentially when governments and commercial users lose chain-of-custody control of their data or become reliant on closed, proprietary systems that hold data hostage in vendor networks. That’s why it’s important to choose big data analytics architectural models that are open, do not require customers to surrender their data to vendors, allow governments and commercial clients to decide who can see aggregated data and predicative findings, and calibrate the level of anonymization of PII to their need.

Spotting Flimsy Frameworks

Models that are most respectful of data control requirements do not require governments and commercial sector companies to turn over control of their data to vendors. Under less successful data control models, third-party data scientists use proprietary algorithms to conduct their own analyses of data before returning it to the original owner for positive control.

Some closed, proprietary data analytics models also charge by volume of data analyzed. In a world where the amount of available data to aggregate, correlate and predict is increasing exponentially, charging by volume is great for vendor profits, but not so good for clients captured in this expense model.

Four Pillars of Effective Data Control

Good data control frameworks begin with several core precepts. Precepts that strive for maximum compliance with the intent and spirit of our privacy and civil rights are more sustainable in the long term.

Effective frameworks are built on the following non-negotiables:

1. Open Architecture

No one vendor has all the answers, and the very best capabilities reside across the entire big data analytics enterprise. Innovation is too fluid and too fast to lock into one company’s closed intellectual property. Opening base architectures to new ideas, capabilities and innovation is vital to the vibrant tools that can exist within a strong data control framework.

2. Total Ownership of Big Data

No model that requires turning over appropriately collected data to a third-party vendor can be strong on data control. Even if the vendor model is sound on data control, it is impossible for independent auditors to assess those controls if they cannot fully see the data chain-of-custody process and evaluate the veracity of the vendor’s secret algorithm.

3. Customizable Anonymization and Minimization

Different users have different requirements for the protection of PII. Those responsible for detecting insider threats and correlations for national security purposes have a special responsibility to protect data because they are required to access the most private data about persons of concern. People like me who hold security clearances waive certain privacy protections, so data anonymization and minimization requirements are different from the expectations of the traveling public.

Data control systems must be customizable. One size of closed, proprietary frameworks certainly does not fit all.

4. Sharing Determinations Made by the Data Owner

Strong data control models let the owner determine who can access the data and with whom it should be shared, both in its raw and correlated forms. Models that mandate sharing all data with third-party providers are, by definition, weak data control frameworks.

Fantasy World or Bright Security Future?

In a strong data control world, vendors provide exquisite data analytics tools that are auditable, customizable, owned in totality by the customer and agile enough to incorporate innovation from across the technology spectrum. Weak data control models that drive customers to transfer control of their data to proprietary, third-party vendors will struggle, since data owners must always have positive control of their sensitive information.

The strong data control world is not a fantasy. It does, in fact, exist. Adherence to this model is a win-win for government agencies and consumers seeking to leverage strong privacy protections and premier data analytics without ceding control of their data.

More from Data Protection

How to craft a comprehensive data cleanliness policy

3 min read - Practicing good data hygiene is critical for today’s businesses. With everything from operational efficiency to cybersecurity readiness relying on the integrity of stored data, having confidence in your organization’s data cleanliness policy is essential.But what does this involve, and how can you ensure your data cleanliness policy checks the right boxes? Luckily, there are practical steps you can follow to ensure data accuracy while mitigating the security and compliance risks that come with poor data hygiene.Understanding the 6 dimensions of…

Third-party access: The overlooked risk to your data protection plan

3 min read - A recent IBM Cost of a Data Breach report reveals a startling statistic: Only 42% of companies discover breaches through their own security teams. This highlights a significant blind spot, especially when it comes to external partners and vendors. The financial stakes are steep. On average, a data breach affecting multiple environments costs a whopping $4.88 million. A major breach at a telecommunications provider in January 2023 served as a stark reminder of the risks associated with third-party relationships. In…

Communication platforms play a major role in data breach risks

4 min read - Every online activity or task brings at least some level of cybersecurity risk, but some have more risk than others. Kiteworks Sensitive Content Communications Report found that this is especially true when it comes to using communication tools.When it comes to cybersecurity, communicating means more than just talking to another person; it includes any activity where you are transferring data from one point online to another. Companies use a wide range of different types of tools to communicate, including email,…

Topic updates

Get email updates and stay ahead of the latest threats to the security landscape, thought leadership and research.
Subscribe today