TDWI | Training & Research | Business Intelligence, Analytics, Big Data, Data Warehousing

TDWI Articles

How to Let Your Data Lake Flow Without Fear in a World of Privacy Regulations

Data lakes pose compliance challenges. Here's how to overcome them.

By Drew Schuil
February 3, 2020

Today's biggest companies are facing a deluge of data breaches, from the social media giants to credit card companies and healthcare organizations. In fact, the first six months of 2019 saw more than 3,800 publicly disclosed breaches and 4.1 billion compromised personal records. These breaches, along with the misuse and abuse of private information, continue to erode consumer trust. In response, companies are developing solutions to implement privacy and security controls that track, block, and restrict access to personal data.

As the public becomes increasingly aware of data breaches and how personal information is being stolen, organizations and their customers are asking how and why personal data is being used. Inquiries are coming in the form of data subject requests (DSRs). Even though data might be king, privacy compliance is ruling the kingdom. It's become more important than ever to understand these questions and how to address the ever-increasing volume of DSRs.

The Surge of Data Privacy Concerns

Regulations such as the General Data Protection Regulation (GDPR) and the California Consumer Privacy Act (CCPA), in effect since January 1, 2020, are forcing companies to respond to DSRs and answer consumer concerns about privacy (and rightfully so). However, achieving compliance with these regulations requires that companies understand what personal information they hold across every system, where it's located, and how it's being used.

Data lakes are useful repositories for gathering massive amounts of data in its original format, with the idea that the data will eventually be subject to analysis, but privacy risks lurk within these systems. These huge storage repositories can pose serious problems when a customer submits a DSR. Data lakes are continuously ingesting disparate pieces of customer data from a variety of sources, so organizations often have no clue which sensitive information they have and how it is being combined.

Individual pieces of data can be safe on their own yet increase compliance risk when combined. For example, gender, ZIP code, and date of birth fields are individually benign, but together they can uniquely identify an estimated 87 percent of the United States population.
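The combination effect described above can be illustrated with a small sketch. The records and field names below are invented for illustration, and `unique_fraction` is a hypothetical helper, not part of any privacy product. It reports the share of rows that a given set of fields singles out.

```python
from collections import Counter

# Hypothetical sample records; field names and values are illustrative only.
records = [
    {"gender": "F", "zip": "98101", "dob": "1980-02-14"},
    {"gender": "F", "zip": "98101", "dob": "1975-07-01"},
    {"gender": "M", "zip": "98101", "dob": "1980-02-14"},
    {"gender": "F", "zip": "98052", "dob": "1980-02-14"},
    {"gender": "M", "zip": "98052", "dob": "1991-11-30"},
    {"gender": "M", "zip": "98052", "dob": "1991-11-30"},
]

def unique_fraction(rows, fields):
    """Return the share of rows whose values for `fields` appear exactly once."""
    if not rows:
        return 0.0
    keys = [tuple(row[f] for f in fields) for row in rows]
    counts = Counter(keys)
    singled_out = sum(1 for key in keys if counts[key] == 1)
    return singled_out / len(rows)

for fields in (["gender"], ["zip"], ["dob"], ["gender", "zip", "dob"]):
    print(fields, round(unique_fraction(records, fields), 2))
```

In this toy data set, gender and ZIP code alone single out nobody, while the three fields together single out four of the six rows. The same kind of check, run against real tables, is one way to flag risky field combinations before they are joined.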

Using Automation to Monitor Data Lakes

To know and understand exactly what information is in their data lakes, enterprises need to inspect their data down to the data-element level and not rely on what's implied by their metadata. When operating at that level, enterprises can also identify highly sensitive combinations of data across their ecosystem to protect against security risks and remain in compliance.
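As a rough illustration of inspecting values rather than trusting metadata, the sketch below scans every value in a set of rows for two simple patterns. The patterns, column names, and the `scan_values` helper are assumptions made for this example; a production scanner would need far broader and more careful detection.

```python
import re

# Illustrative patterns only; real detection needs much wider coverage.
PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "us_ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}

def scan_values(rows):
    """Map each column name to the set of pattern labels found in its values."""
    findings = {}
    for row in rows:
        for column, value in row.items():
            for label, pattern in PATTERNS.items():
                if pattern.search(str(value)):
                    findings.setdefault(column, set()).add(label)
    return findings

# Hypothetical rows: nothing in the column names hints at personal data.
rows = [
    {"order_id": "1001", "notes": "Customer asked to email jane@example.com"},
    {"order_id": "1002", "notes": "SSN on file: 123-45-6789"},
    {"order_id": "1003", "notes": "Gift wrap requested"},
]

print(scan_values(rows))
```

Here a column called "notes" would pass a metadata-only review, yet the element-level scan shows it carries both an email address and a Social Security number pattern.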

To protect themselves from data lake compliance issues, organizations should implement automated data privacy management solutions to quickly identify where personal information is located across their systems. If organizations continue to rely on outdated manual processes, they invite human error, because privacy teams must work long hours to organize each piece of information by hand while new data keeps pouring in.

Enterprises also need to monitor all data that enters and exits their systems -- continuously checking, scanning, and classifying data in motion. An automated data inventory and privacy solution can help in this effort and use de-identification or anonymization to prevent data analysts from connecting individuals to their personal information. In this way, data can still be used to drive business innovation without compromising privacy.
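One common way to de-identify records as they move through a pipeline is keyed hashing of direct identifiers, sketched below with Python's standard `hmac` module. This is pseudonymization rather than full anonymization, and it is only one possible technique, not a description of any specific product. The key, the field names, and both helper functions are assumptions for illustration.

```python
import hashlib
import hmac

# Assumption: in practice the key is managed outside the analytics environment.
SECRET_KEY = b"replace-with-a-managed-secret"
# Hypothetical names of fields that directly identify a person.
DIRECT_IDENTIFIERS = {"name", "email"}

def pseudonymize(record, key=SECRET_KEY):
    """Return a copy of `record` with direct identifiers replaced by keyed hashes."""
    out = {}
    for field, value in record.items():
        if field in DIRECT_IDENTIFIERS:
            digest = hmac.new(key, str(value).encode("utf-8"), hashlib.sha256)
            out[field] = digest.hexdigest()
        else:
            out[field] = value
    return out

def deidentified_stream(records):
    """Yield pseudonymized records one at a time as they arrive."""
    for record in records:
        yield pseudonymize(record)

incoming = [
    {"name": "Jane Doe", "email": "jane@example.com", "purchase_total": 42.5},
    {"name": "Jane Doe", "email": "jane@example.com", "purchase_total": 17.0},
]
for safe_record in deidentified_stream(incoming):
    print(safe_record)
```

Because the same input always maps to the same token, analysts can still count or join on a customer without seeing who that customer is. Quasi-identifiers such as ZIP code or date of birth would need separate treatment, for example generalization.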

Protecting Data Use to Remain in Compliance

Regulations such as the GDPR and CCPA also expect sensitive information to be protected with measures such as encryption to preserve its confidentiality and integrity. With the massive volume of data in a company's data lake and even more live data continuously streaming in, the results from most traditional encryption validation tools quickly fall out of date.

Manual tools can't track data in motion. As soon as you manually track a piece of data, new data is already entering the system. This method only creates a snapshot for that moment in time, not accounting for any new information moving through the system. Data needs to be classified, labeled, and mapped back to the encryption requirements dictated by both these regulations and the organization's internal use policies to remain in compliance.
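A minimal sketch of mapping classification labels back to encryption requirements might look like the following. The labels, the policy table, the catalog entries, and `find_violations` are all hypothetical; actual requirements would come from the applicable regulations and the organization's own policies.

```python
# Hypothetical policy: which classification labels must be encrypted at rest.
ENCRYPTION_REQUIRED = {"personal": True, "sensitive": True, "public": False}

def find_violations(fields):
    """Return names of fields whose label requires encryption but that are stored in the clear."""
    violations = []
    for field in fields:
        # Labels missing from the policy are treated as requiring encryption (fail closed).
        required = ENCRYPTION_REQUIRED.get(field["label"], True)
        if required and not field["encrypted"]:
            violations.append(field["name"])
    return violations

# Hypothetical catalog entries produced by an earlier classification step.
catalog = [
    {"name": "email", "label": "personal", "encrypted": True},
    {"name": "diagnosis", "label": "sensitive", "encrypted": False},
    {"name": "store_region", "label": "public", "encrypted": False},
    {"name": "free_text", "label": "unclassified", "encrypted": False},
]

print(find_violations(catalog))
```

Rerunning a check like this whenever the catalog changes, rather than on a fixed manual schedule, is what keeps the result from being a point-in-time snapshot.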

A Final Word

In an era of increasing data privacy regulations, it's more important than ever for organizations to know what sensitive data is held within their data lakes and repositories as well as what's traveling throughout their systems. Consumers now demand transparency, accuracy, and expediency when asking companies about what data they have and are collecting. It's imperative for companies to have the proper tools in place to accurately and responsibly handle data while responding to DSRs in a timely and efficient manner.

About the Author

Drew Schuil is the VP of global BD & EMEA operations for Integris Software. For the last 19 years, he has held key leadership positions with enterprise software and cybersecurity companies. Prior to Integris Software, he was VP of global product strategy at Imperva, a data-centric audit and protection software firm, meeting with companies and speaking at industry events in 43 countries. Exposure to global privacy sentiment and the GDPR led him to join data privacy innovator Integris Software just as regulations such as the California Consumer Privacy Act (CCPA) began driving heightened privacy awareness in the U.S.
