Is AI changing drug discovery? What the evidence says, beyond the hype
AI is not creating a new way of building biotech companies. It is making an existing way faster and cheaper, at least in therapeutics, where the evidence base is more developed. For everything else, it is too early to tell.
Everyone talks about a revolution. Every pitch deck claims AI is transforming drug discovery. Every conference panel promises a new era. But when you set aside the marketing and look strictly at peer-reviewed evidence, the picture is different, and more useful. AI is not creating a new way of building biotech companies. It is making an existing way faster and less expensive, at least in therapeutics, where the evidence base is more developed. For everything else, it is too early to tell.
The Phase I triumph and the Phase II cliff
The most striking finding in the data is a paradox. AI-discovered drug candidates perform dramatically better than traditional compounds in early-stage trials, and then the advantage disappears entirely.
According to Jayatunga et al. (2024, Drug Discovery Today) and Serrano et al. (2024, Pharmaceutics), early analyses report success rates as high as 80-90% for AI-discovered molecules in Phase I trials, substantially higher than the historical range of approximately 40-65% for all drugs. These figures are based on small sample sizes (dozens of assets, not hundreds) and are likely subject to selection bias, since only the most promising AI programmes tend to reach the clinic, often with strong pharma partners behind them. Still, the direction is consistent across multiple independent reviews: AI appears to be effective at optimising the molecular properties that determine safety.
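The small-sample caveat can be made concrete with a confidence interval. The sketch below computes a Wilson score interval for a binomial proportion. The counts (17 of 20 and 170 of 200) are hypothetical, chosen only to sit inside the reported 80-90% range; they are not taken from the cited papers.

```python
import math


def wilson_interval(successes: int, trials: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion (z = 1.96 is roughly 95%)."""
    if trials <= 0:
        raise ValueError("trials must be positive")
    p_hat = successes / trials
    denom = 1 + z**2 / trials
    centre = (p_hat + z**2 / (2 * trials)) / denom
    half_width = (
        z * math.sqrt(p_hat * (1 - p_hat) / trials + z**2 / (4 * trials**2)) / denom
    )
    return centre - half_width, centre + half_width


# Hypothetical counts: same observed rate (85%), very different sample sizes.
for successes, trials in [(17, 20), (170, 200)]:
    low, high = wilson_interval(successes, trials)
    print(
        f"{successes}/{trials}: {successes / trials:.0%} observed, "
        f"95% interval {low:.0%} to {high:.0%}"
    )
```

With 17 successes out of 20, the interval runs from roughly 64% to 95%, so its lower end overlaps the top of the 40-65% historical range. At 170 of 200 it narrows to roughly 79% to 89%. The interval says nothing about selection bias, which is a separate problem.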
But Phase II tells a different story. According to the same analyses, Phase II success rates for AI-discovered drugs are approximately 40%, broadly in line with historical ranges for all drug candidates, though direct comparison is confounded by the heavy concentration of AI pipelines in oncology, where baseline success rates are often lower. The efficacy hurdle, validating that a compound actually works in complex biological systems, remains as difficult as it has always been, though Phase II outcomes may partly be a lagging indicator of improvements in earlier discovery steps.
Chart sources: Jayatunga et al., 2024 (Drug Discovery Today); Serrano et al., 2024 (Pharmaceutics); Malheiro et al., 2025; Wilczok & Zhavoronkov, 2024; Niazi, 2025. Phase I ranges are plotted at their midpoints: 80-90% for AI-discovered candidates (85%) and 40-65% for traditional candidates (about 52%).
AI-discovered vs traditional drug candidates: success rates by clinical phase
| Clinical phase | AI-discovered candidates | Traditional candidates | Source |
|---|---|---|---|
| Phase I | 80-90% (early analyses, small N) | ~40-65% | Jayatunga et al., 2024; Serrano et al., 2024 |
| Phase II | ~40% | ~40% | Jayatunga et al., 2024; Malheiro et al., 2025 |
| Phase III+ | Insufficient data (very few AI candidates have reached this stage) | n/a | Wilczok & Zhavoronkov, 2024 |
| Regulatory approval | 0 de novo approvals as of early 2025 | n/a | Niazi, 2025 |
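The rates in the table can be chained to show where the difference between the two columns arises. This is a minimal sketch, assuming the midpoints of the Phase I ranges (85% and 52.5%) and treating the two phases as independent steps, which is a simplification. The names `RATES` and `survivors_per_100` are illustrative only.

```python
# Phase success rates from the table; Phase I uses range midpoints (an assumption).
RATES = {
    "AI-discovered": {"phase_1": 0.85, "phase_2": 0.40},
    "Traditional": {"phase_1": 0.525, "phase_2": 0.40},
}


def survivors_per_100(rates: dict[str, float]) -> dict[str, float]:
    """Candidates still alive after each phase, per 100 entering Phase I."""
    remaining = 100.0
    after_phase = {}
    for phase, rate in rates.items():
        remaining *= rate
        after_phase[phase] = remaining
    return after_phase


for label, rates in RATES.items():
    after = survivors_per_100(rates)
    print(
        f"{label}: {after['phase_1']:.1f} clear Phase I, "
        f"{after['phase_2']:.1f} clear Phase II"
    )
```

Both rows lose the same 60% at Phase II. Any gap in the number of survivors comes entirely from the Phase I step, which is also where the small-sample and selection-bias caveats apply.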
According to Malheiro et al. (2025, Pharmaceuticals), Wilczok & Zhavoronkov (2024, Clinical Pharmacology and Therapeutics), and Niazi (2025, Pharmaceuticals), no wholly novel small-molecule drugs designed by AI have received regulatory approval as of early 2025, though definitions of "AI-designed" vary and are not consistently tracked by regulators. Some repurposed drugs identified via AI (notably baricitinib for COVID-19) have reached the market through accelerated pathways, but these represent repositioning of existing molecules, not de novo molecular innovation.
Since 2015, at least 75-80 unique AI-discovered or designed drug candidates have entered human clinical trials globally, according to Malheiro et al. (2025), Jayatunga et al. (2024), and Arnold (2023, Nature Medicine), a number that is growing rapidly and counted differently across sources. Oncology dominates, accounting for over 70% of reported cases, according to Dermawan & Alotaiq (2025, Pharmaceuticals). Applications are expanding into neurology, rare diseases, and dermatology, but at much smaller scale.
Best-in-class AI programmes have also compressed preclinical timelines dramatically: some report reductions from several years to under two years from target identification to IND-enabling studies, according to Dermawan & Alotaiq (2025) and Arnold (2023), though these represent top-end outcomes rather than industry averages.
AI-discovered drug candidates by therapeutic area: pipeline and maturity
| Therapeutic area | Pipeline presence | Most advanced stage | Key trend | Source |
|---|---|---|---|---|
| Oncology | Dominant (>70% of reported candidates) | Phase II (multiple ongoing) | Largest field by volume and investment; solid tumours, novel mechanisms, optimised dosing | Dermawan & Alotaiq, 2025; Carini & Seyhan, 2024 |
| Neurology | 6 reported Phase I entries | Phase I/II | Growing focus, notably Alzheimer's disease | Alghamdi, 2025 |
| Rare diseases | 4 reported Phase I entries | Phase I | Emerging; high ROI for computational approaches where traditional R&D is uneconomic | Gangwal & Lavecchia, 2024 |
| Dermatology | 3 reported Phase I entries | Phase I | Early but expanding | Dermawan & Alotaiq, 2025 |
| Infectious disease | Not systematically tracked in these sources | Market (repurposing only) | Baricitinib for COVID-19 (BenevolentAI); repurposing, not de novo | Mirakhori & Niazi, 2025 |
| Metabolic / other | Variable | Preclinical-Phase I | Emerging as datasets diversify beyond oncology | Malheiro et al., 2025 |
Note: Phase I entry counts for neurology, rare diseases, and dermatology are absolute numbers from the research gap matrix in the Consensus analysis, not comprehensive pipeline counts. They indicate relative activity levels across therapeutic areas.
Not a revolution: a faster, cheaper way of doing the same thing
This is the finding that matters most for venture building, and it is the opposite of what most pitch decks claim.
There is no evidence yet that AI produces more effective drugs. What the data shows is that AI produces drug candidates that arrive at the same clinical decision point faster and with less capital. The reported Phase I success rates do not mean AI-discovered molecules are therapeutically superior. They mean AI is good at designing molecules that clear safety screening. The Phase II data suggests that once the question shifts from "is it safe?" to "does it work?", AI's advantage has not yet materialised. AI has not yet solved the biology bottleneck, though it may increasingly influence it as datasets and models mature.
But the capital efficiency gain is real and significant. If the best AI programmes compress preclinical timelines from four to five years to under two, and if Phase I clearance rates are as high as early analyses suggest, the cost of reaching a Phase II decision drops substantially. This does not reduce the risk of failure at Phase II. It reduces the price of finding out.
For a venture, that distinction changes everything. You can test more candidates with the same capital, or reach the same decision point with less dilution. The drug is not necessarily better. The economics of getting to the decision are better. In practice, many companies reinvest these efficiency gains into running more programmes rather than reducing total spend, lowering the cost per shot on goal, not the total R&D budget.
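The "price of finding out" can be sketched as simple arithmetic. The example below is an illustration under stated assumptions, not a costing model: all cost inputs are hypothetical and in arbitrary units, preclinical cost is assumed to be lower for the AI programme, and every candidate is assumed to pay full preclinical and Phase I costs whether or not it succeeds. The function names are illustrative.

```python
def cost_per_phase2_entrant(
    preclinical_cost: float, phase1_cost: float, phase1_success: float
) -> float:
    """Expected spend to get one candidate through Phase I.

    Each attempt pays preclinical plus Phase I costs, and only a fraction
    `phase1_success` of attempts get through, so the expected number of
    attempts per success is 1 / phase1_success.
    """
    if not 0 < phase1_success <= 1:
        raise ValueError("phase1_success must be in (0, 1]")
    return (preclinical_cost + phase1_cost) / phase1_success


def shots_on_goal(
    budget: float, preclinical_cost: float, phase1_cost: float, phase1_success: float
) -> float:
    """Expected number of candidates reaching a Phase II decision for a fixed budget."""
    return budget / cost_per_phase2_entrant(preclinical_cost, phase1_cost, phase1_success)


# Hypothetical inputs in arbitrary cost units; Phase I rates are range midpoints.
scenarios = {
    "Traditional": dict(preclinical_cost=10.0, phase1_cost=4.0, phase1_success=0.525),
    "AI-assisted": dict(preclinical_cost=5.0, phase1_cost=4.0, phase1_success=0.85),
}

for label, inputs in scenarios.items():
    unit_cost = cost_per_phase2_entrant(**inputs)
    shots = shots_on_goal(budget=100.0, **inputs)
    print(f"{label}: {unit_cost:.1f} units per Phase II entrant, {shots:.1f} shots per 100 units")
```

The Phase II success rate does not appear anywhere in these functions. Under this framing the gain is in the cost of each shot on goal, not in the odds that a given shot succeeds.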
And crucially: no dominant new business model has emerged yet. The successful first-generation AI drug discovery companies do not operate in ways that are structurally new. Atomwise generates candidates and licenses them to Merck. Insilico Medicine does the same with Taisho. BenevolentAI identified baricitinib for COVID-19 and partnered with AstraZeneca. These are molecule factories with a cost and speed advantage, not technology platforms with network effects. The deliverable is a compound at a specific stage, not access to an algorithm. The business models (licensing, partnership, dual-track pipeline) all existed well before AI entered drug discovery. Whether new structural models will emerge as the field matures remains to be seen.
The founder who says "our AI discovers revolutionary drugs" is making a promise that the current evidence does not support. The founder who says "our AI gets us to IND faster than traditional approaches, with reported Phase I clearance rates well above industry averages" is making a specific, evidence-backed claim that honestly describes what AI appears to deliver today: speed and capital efficiency in the early stages of a process whose later stages remain as difficult and expensive as ever.
Beyond drug discovery: where the evidence is thinner
In diagnostic imaging and digital pathology, AI algorithms have achieved sensitivity and specificity levels that match or exceed human performance in specific tasks. According to Mennella et al. (2024, Heliyon) and Tiwari et al. (2025, Molecular Cancer), the evidence is strongest in oncological, cardiac, and neurological imaging. But the qualifier "specific tasks" is important: AI outperforms humans in narrow, well-defined challenges, not as a general replacement for clinical judgement. A major bottleneck remains the lack of prospective, randomised clinical trials to validate these tools in real-world settings.
In personalised medicine, the integration of multi-omic and genomic datasets with AI enables treatment approaches that would be impossible to derive manually, according to Acosta et al. (2022, Nature Medicine) and Zahra et al. (2024, Drug Metabolism and Personalized Therapy). In clinical trial optimisation, evidence of impact is early-stage, according to Ocana et al. (2025, Biomarker Research).
Drawing venture building conclusions from these domains would be premature. The evidence base is not yet strong enough to distinguish between genuine clinical impact and promising proof-of-concept.
One important caveat applies across all domains: this analysis largely reflects current evidence from AI applied to molecular design and early discovery optimisation. Other layers of the stack, particularly target identification, biomarker strategy, and patient stratification, may have a more direct impact on efficacy outcomes over time, but the evidence in these areas remains less mature. If AI meaningfully improves target selection, Phase II outcomes could change. That is where most of the value in drug discovery sits, and it is too early to know whether AI will move that needle.
| Domain | Evidence to date | Gap |
|---|---|---|
| Drug discovery | 80-90% Phase I success; timeline compression to under 2 years | Efficacy validation (Phase II) unchanged |
| Diagnostic imaging and digital pathology | Matches or exceeds human performance in specific tasks | Lack of prospective, randomised clinical trials |
| Personalised medicine | Enables targeted therapies via multi-omic integration | Limited large-scale clinical validation |
| Clinical trial optimisation | Efficiency gains in recruitment and design | No prospective validation of AI-driven trial designs |
Based on systematic analysis of 50 peer-reviewed papers via Consensus (Semantic Scholar, PubMed). Evidence strength reflects consistency of findings, sample sizes, and validation level.
The transparency paradox: where it applies and where it does not
Regulators increasingly demand algorithmic transparency in AI-driven healthcare tools (Warraich et al., 2024; Mennella et al., 2024; Goktas & Grzybowski, 2025). The paradox is that the more transparent a model must be, the less it can serve as a proprietary advantage. But this tension does not apply equally across domains, and conflating them is a common mistake.
In drug discovery, the paradox does not exist, at least not as a regulatory problem. Regulatory agencies evaluate the molecule, not the algorithm that designed it. An AI-discovered compound goes through the same IND filing, the same clinical trials, and the same regulatory review as a traditionally discovered one. How the molecule was identified is essentially irrelevant to the approval process. The competitive moat sits in the molecule's IP (composition-of-matter patents, formulation, method-of-use), not in the opacity of the algorithm. Pharma partners do care about reproducibility and model interpretability when evaluating licensing deals, but this is a commercial consideration, not a regulatory barrier. A separate and still open question is whether regulators will develop specific validation standards for computationally derived molecules or endpoints, an uncertainty flagged in the literature (Warraich et al., 2024) that could evolve as more AI-discovered compounds advance through late-stage trials.
In diagnostics and clinical decision support, the paradox is real. When the AI algorithm is the product (software that reads a mammogram, analyses a pathology slide, or recommends a treatment based on multi-omic data), the regulatory framework directly scrutinises the model itself. The EU AI Act, the FDA's AI/ML action plan, and the IVDR for Software as a Medical Device all push towards algorithms that are explainable, auditable, and tested for bias. If the model must be transparent, the algorithm alone cannot be the competitive advantage. Where the defensibility sits for these companies is an open question the industry is still answering.
What the successful companies have in common
The peer-reviewed literature highlights two recurring factors in successful AI biotech companies: multidisciplinary team integration (computational biology, data science, and clinical expertise) and early collaboration with big pharma or academic centres of excellence (Mirakhori & Niazi, 2025, Pharmaceuticals; Bhushan & Misra, 2025, NPJ Digital Medicine). Extending this observation to the specific companies identified in our analysis, four operational patterns emerge:
Problem specificity
Solving one well-defined problem rather than building a broad platform. Atomwise built a model for molecular binding prediction on specific targets; PathAI focuses on specific cancer types. Specificity is what makes the pharma partnership possible.
Early pharma partnership
Industrial validation secured before or alongside fundraising: BenevolentAI + AstraZeneca, Insilico Medicine + Taisho, Atomwise + Merck. The partnership is not just a commercial transaction; it is a validation event.
Multidisciplinary team from day one
Computational biology, data science, and clinical expertise integrated within the same organisation from inception, not sequenced. Companies that led with pure AI talent and added biology later consistently underperformed (Bhushan & Misra, 2025).
Regulatory awareness
Regulatory requirements (FDA, EU AI Act) factored into the development plan early, with attention to algorithmic transparency, bias, and data representativeness. This pattern is observed across successful companies, with implementation varying by domain (Warraich et al., 2024, JAMA; Mennella et al., 2024).
None of these characteristics are AI-specific. Problem specificity, early industrial partnership, multidisciplinary teams, and regulatory discipline are venture building fundamentals that predate AI by decades. AI makes them more important, not less. And the absence of any genuinely new structural pattern in the successful companies reinforces the central finding: AI is not creating a new kind of biotech company. It is making the existing kind faster and cheaper to build.
The bottom line
The peer-reviewed evidence does not support the claim that AI is "revolutionising" biomedicine. Not yet, and perhaps not in the way the hype suggests. What it supports is a more specific and more honest conclusion.
In drug discovery, AI currently functions as a powerful accelerator. It designs molecules faster, optimises safety properties more effectively, and compresses preclinical timelines. The result, based on current evidence, is not necessarily better drugs but better economics: less capital and less time to reach the same clinical decision point. No dominant new business model has emerged yet. What has improved is the cost structure.
In diagnostics, personalised medicine, and clinical trial optimisation, the evidence is promising but too early for venture building conclusions.
One important caveat on the central thesis: this analysis largely reflects current evidence from AI applied to molecular design and safety optimisation. Other layers of the stack, particularly target identification and patient stratification, may have a more direct impact on efficacy outcomes over time. If AI meaningfully improves target selection, Phase II results could eventually shift. The data to assess this does not yet exist in sufficient volume, but it is worth watching closely.