The GitHub of Data
Our Investment in Gretel
Access to real-world data, and the ability to learn from it, is key to rapid innovation, yet the modern enterprise devotes precious little time to it. Data is either completely walled off from broad access or left open for anyone within the enterprise to copy. There are no effective tools for sharing data in a controlled fashion, whether by making it simple to anonymize a dataset or by generating a synthetic dataset that has the distribution characteristics of the original data but contains no sensitive data. While tech companies with lots of engineering resources can spend the time and effort to solve this problem, the average company can’t. As a result, it has become challenging for developers to safely share and collaborate on sensitive data.
Code, which once faced many of the same challenges, now has well-understood processes and tools for protecting, sharing and changing it. There’s been some amazing progress, with companies like GitHub making tools for code collaboration available to all. We believe there’s a real opportunity to bring the same kind of methodology and tooling to data, while protecting privacy and ensuring safety.
That’s why we’re thrilled to announce our investment in Gretel, an open platform that enables developers to build quickly with data and safely share it with others. After participating in the seed round earlier this year, I’m excited to lead the Series A and join the board on behalf of Greylock.
Gretel aspires to be the GitHub of data. The team believes that privacy is an engineering problem and is on a mission to solve it. With Gretel, developers are able to anonymize sensitive data so that it can be shared. They are also able to use the sensitive data to generate synthetic data: artificial data that closely matches the original data while guaranteeing privacy. Developers can use anonymized and synthetic data to build realistic test environments, train machine learning algorithms, and run experiments without manually redacting sensitive information.
Gretel was co-founded by Alex Watson (CEO), John Myers (CTO), Ali Golshan and Laszlo Bock, a team of former Amazon engineers and tech industry leaders with an impressive background in security and data. I already knew Laszlo from our time together at Google, and after meeting the team I knew quickly that it was a group to bet on. The team brings an exciting level of passion to the problem, knows how to move fast, and is articulate in describing their product. Alex is also no stranger to entrepreneurship: he previously founded Harvest.ai and sold the business to Amazon in 2016.
Gretel launched its public beta in August and is working with developers across more than two dozen industries, including healthcare, financial services, gaming, transportation, security and developer productivity tools. Developers use Gretel in all sorts of ways, such as creating synthetic patient medical record sets that can be safely shared between medical organizations, and balancing gender and ethnicity in their datasets to create a fairer and more inclusive gaming experience.
While it is early days for the company, I’ve heard time and time again from software engineers and data scientists about the value Gretel could offer. I look forward to partnering with the team as they help developers share data more safely.