SandboxAQ, the AI-first startup spun out of Alphabet and backed by tech powerhouses including Google and NVIDIA, has announced a major breakthrough in computational drug discovery. The company has released a dataset of 5.2 million synthetic, three-dimensional molecules, designed to predict how potential drug candidates might bind to proteins—a critical step in therapeutic development.
The dataset, built using NVIDIA’s advanced processors, was entirely generated through AI simulations rather than traditional lab methods. It is also linked to experimental data, helping ensure biological relevance and accuracy.
“This is a long-standing problem in biology that we’ve all, as an industry, been trying to solve for,” said Nadia Harhen, General Manager of AI Simulation at SandboxAQ.
Unlike conventional lab-based datasets, this release aims to help train AI models that can replicate high-quality lab predictions at speed and scale, drastically reducing both time and cost in early-stage drug research.
Founded in 2022, SandboxAQ has already raised $950 million, with a recent $150 million infusion from Google, NVIDIA, and BNP Paribas. The company’s valuation now stands at $5.75 billion. Its Large Quantitative Models (LQMs) are being applied not only in life sciences but also in finance and cybersecurity—available through platforms like Google Cloud.
While large language models (LLMs) continue to dominate public discourse, SandboxAQ emphasizes that LLMs “lack the capabilities to precisely simulate the physical world.” Instead, the company is focused on delivering a full-stack AI solution purpose-built for end-to-end drug discovery—from identifying molecular targets to predicting toxicity—prioritizing long-term value and real-world outcomes.
This move underscores the growing role of simulation-driven AI in transforming how new medicines are discovered, tested, and ultimately brought to market.