Throughout history, every major technology wave has precipitated a new foundational infrastructure stack to maximize its potential.
The transition to cloud computing brought about new database technologies like MongoDB, developer tooling like GitHub and GitLab, and observability tools like Datadog, while the surge in web development catalyzed collaboration products like Figma. Now, as large language models (LLMs) become a fundamental building block for new applications, we’re witnessing the rise of an infrastructure stack tailored for this AI-driven era.
One of the biggest challenges with LLM applications is that their behavior is non-deterministic. Enterprises cannot reliably predict the quality of a model-generated response, how a small change to a prompt will affect the output, or whether the underlying model has changed. This can lead to inconsistent user-facing behavior, which significantly limits the potential of production AI. As such, AI builders need a new infrastructure toolkit to help evaluate, debug, test, and implement AI models in their products before they’re launched. In prior generations of AI, ML quality was a nice-to-have rather than a requirement; with generative AI, you cannot ship a product to users without confidence in its quality.
Today, we are thrilled to announce our seed investment in Braintrust, an enterprise-grade tool for rapidly and reliably evaluating AI. Braintrust offers developers a toolkit for instrumenting code and running evaluations, enabling teams to assess, log, refine, and enhance their AI-enabled products over time. In doing so, Braintrust becomes the nexus of the product development lifecycle for AI application developers. Crucially, Braintrust facilitates this within a customer’s cloud environment, ensuring it can be used for even their most data-sensitive tasks. In an AI landscape dominated by Twitter hype, we were struck by the quality of customers that have already adopted and standardized on Braintrust: leading applied AI companies like Zapier, Airtable, Coda, Instacart, and many more are building with Braintrust to bring their AI products to market.
We are fortunate to be partnering with Ankur Goyal, the founder and CEO of Braintrust, who has spent his career building products for builders. Ankur initially led the engineering team at SingleStore, then founded Impira (an ML platform for unstructured data), which was acquired by Figma, where he stayed on to lead the AI platform. On a personal note, I’ve known Ankur for nearly 8 years and have deeply admired his approach to company building, incredible customer centricity, and ability to cater to both developers and large enterprises. We at Greylock have wanted to partner with him for many years. We are delighted that we get to be part of Braintrust’s journey alongside outstanding co-investors like Elad Gil, Alana Goyal and Basecase, and a unique set of angel investors who lead some of the most forward-thinking technology companies, including Clem Delangue, CEO of Hugging Face; Greg Brockman, Co-founder and President of OpenAI; Howie Liu, CEO at Airtable; Jack Altman, CEO of Lattice; Guillermo Rauch, CEO of Vercel; Simon Last, CTO at Notion; and Olivier Pomel, CEO at Datadog.
Braintrust joins a distinguished set of companies that Greylock has partnered with from the early stages and that have gone on to create and disrupt large enterprise software markets, such as Abnormal Security, AppDynamics, Figma, Okta, Palo Alto Networks, Workday, and many more. If you’re interested in using Braintrust, you can sign up for free and get started here, and if you’re interested in joining the team, see the open roles here.