Data is in many ways the lifeblood of the digital economy. High-quality data often requires a level of detail that may be at odds with the privacy concerns of the human subjects from whom the data is extracted. The tension between the usefulness of a dataset and the data subject’s privacy has been referred to as the “privacy-utility tradeoff.” A novel application of artificial intelligence may make it possible to resolve this tradeoff through the creation of “synthetic data”: artificial, anonymized data generated from authentic raw data by generative adversarial networks. Unlike pseudonymized data, synthetic data retain properties that are statistically equivalent to those of the underlying data gathered from data subjects. As the cost of complying with privacy laws across the world increases, synthetic data may prove to be a viable solution to the tension between protecting individual privacy rights and meeting the demands of the big data market.
This Note argues that BigTech companies should incorporate synthetic data into their business models to protect users’ private, personal data while retaining the large profits derived from their ad-driven business models. Part I provides an overview of the GDPR, the patchwork of U.S. privacy laws, and recent case law illustrating EU regulators’ strict approach to enforcement compared to that of their U.S. counterparts. Part II discusses how the Privacy-Utility Tradeoff and BigTech’s current business model render compliance with data privacy regulations difficult. Part III explains how synthetic data can be used to resolve the Privacy-Utility Tradeoff.