In November 2020, a team at Google DeepMind entered a biennial competition called CASP, the Critical Assessment of Structure Prediction, and effectively ended the contest. Their system, AlphaFold 2, scored above 90 out of 100 on the benchmark's accuracy measure for roughly two-thirds of the proteins it analyzed. Decades of academic research had not come close. Scientists following along described it as seeing a 50-year-old problem solved in an afternoon.
That moment is now 5 years old. In the time since, AlphaFold has become as standard a tool in molecular biology labs as a microscope. More than three million researchers across 190 countries have used it. Its database contains over 214 million predicted protein structures — covering nearly every known protein on Earth. In 2024, Demis Hassabis and John Jumper, the architects behind AlphaFold 2, shared the Nobel Prize in Chemistry. At 39, Jumper became one of the youngest chemistry laureates in 75 years.
And now, DeepMind is moving beyond proteins entirely.
The same lab that cracked the protein folding problem has turned its attention to the broader, messier challenge of interpreting the full human genome — not just the 2 percent that codes for proteins, but the other 98 percent that scientists once dismissed as "junk DNA." The result is AlphaGenome, published in Nature in January 2026, and it represents the most ambitious step yet in DeepMind's campaign to use AI as a general-purpose engine for biological discovery.
What AlphaFold Actually Solved

Proteins are the molecular machines that run every process in a living cell. They carry oxygen, fight infections, regulate gene expression, and break down food. A protein's function is almost entirely determined by its three-dimensional shape — the specific way a long chain of amino acids folds in on itself the moment it's produced. Nobel Prize-winning chemist Christian Anfinsen postulated back in 1972 that this shape should be fully predictable from a protein's genetic sequence alone. The problem is that figuring out how to make that prediction proved extraordinarily difficult.
For decades, determining a protein's structure required physical experiments: X-ray crystallography, cryo-electron microscopy, or nuclear magnetic resonance (NMR) spectroscopy. These methods are expensive and slow, and they require specialized expertise. A PhD student might spend a year or more determining the structure of a single protein, at a cost of roughly $100,000. With hundreds of millions of known proteins, the math on experimentally determining them all was bleak.
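To make "bleak" concrete, here is a back-of-envelope sketch in Python. The per-structure cost and time come from the figures above. The protein count of 200 million is an assumed round number standing in for "hundreds of millions", not a figure from any source.

```python
# Back-of-envelope estimate of solving every known structure experimentally.
# Per-structure figures are the article's rough numbers; the protein count
# is an assumed round figure chosen only for illustration.
KNOWN_PROTEINS = 200_000_000       # assumption: "hundreds of millions" taken as 200 million
COST_PER_STRUCTURE_USD = 100_000   # roughly $100,000 per structure
YEARS_PER_STRUCTURE = 1            # about one researcher-year per structure

total_cost_usd = KNOWN_PROTEINS * COST_PER_STRUCTURE_USD
total_person_years = KNOWN_PROTEINS * YEARS_PER_STRUCTURE

print(f"Total cost: ${total_cost_usd / 1e12:.0f} trillion")
print(f"Total effort: {total_person_years / 1e6:.0f} million person-years")
```

Under those assumptions the bill comes to about $20 trillion and 200 million person-years of lab work, which is why an experimental-only approach was never going to cover the whole protein universe.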
AlphaFold changed that equation. Using it to predict protein structures is now a standard part of graduate-level biology training around the world. "It is just a part of training to be a molecular biologist," Jumper told Fortune. What once took a year in the lab now takes minutes on a laptop.
The system works by treating protein folding as a pattern recognition problem. Google DeepMind solved the problem by using a Transformer, the same kind of AI architecture that is the engine of popular chatbots like ChatGPT. AlphaFold 2's Evoformer module learns the relationships between amino acid pairs across the entire protein chain, building a probabilistic map of how each part of the structure likely interacts with every other. AlphaFold 3, released in 2024, extended this to include DNA, RNA, and small-molecule ligands — opening up the possibility of modeling how a drug interacts with its protein target before it ever enters a lab.
The Real-World Impact, Five Years In
The headline numbers are impressive, but the actual science happening on top of AlphaFold tells a more interesting story.

Scientists in Europe used AlphaFold to understand vitellogenin, a key immunity protein in honeybees. These structural insights are now being applied to conservation efforts for endangered bee populations and are guiding the development of AI-assisted breeding programs for healthier, more resilient pollinators. Conservation biology was not an application the AlphaFold team had in mind when it built the tool.
Drug discovery has been the most commercially significant application. The most prominent example is Isomorphic Labs, an AI drug discovery company spun out of DeepMind in 2021, once AlphaFold had proved powerful enough to be applied to rational drug design. Isomorphic Labs has developed what it calls a unified drug design engine built on AlphaFold 3's molecular modeling capabilities, and it has active partnerships with pharmaceutical companies pursuing treatments in oncology and metabolic disease.
Protein design, the inverse of the problem AlphaFold solves (designing a protein to have a target shape rather than predicting the shape of a known one), has exploded as a field. David Baker and others have built extensively on AlphaFold technology. Baker, who shared the Nobel with Hassabis and Jumper, pioneered computational protein design at the University of Washington. His lab and others have now produced synthetic proteins designed to break down plastics, neutralize pathogens, and deliver drugs more precisely than anything found in nature.
None of this was guaranteed. The AlphaFold team built a structure prediction tool. The community ran with it in directions nobody anticipated.
The Limits That Still Exist

AlphaFold is known to be less accurate at making predictions about multiple proteins or their interaction over time. Most drug targets involve not a single protein but a complex of several, or a protein interacting with a small molecule in a specific cellular environment. AlphaFold 3 has improved on this substantially, but the model still struggles with highly dynamic proteins that shift shape depending on context — something static structure prediction wasn't designed to capture.
"There are many cases where you get a prediction and you have to kind of scratch your head. Is this real or is this not? It's not entirely clear; it's sort of borderline." That observation, from Kliment Verba, a researcher who uses AlphaFold regularly, echoes a concern familiar from LLM research: the model can present its uncertain answers as fluently as its certain ones. AlphaFold's confidence scores (called pLDDT values) help, but interpreting them well requires experience.
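As a rough illustration of how pLDDT values are usually read, the sketch below bins per-residue scores (reported on a 0 to 100 scale) into the four bands shown in the AlphaFold Protein Structure Database: above 90, 70 to 90, 50 to 70, and below 50. The function names, the handling of scores that fall exactly on a cutoff, and the example scores are assumptions made for this sketch, not part of any AlphaFold API.

```python
from collections import Counter

# Band labels in descending order of confidence.
BANDS = ("very high", "high", "low", "very low")


def plddt_band(score: float) -> str:
    """Map one per-residue pLDDT score (0-100) to a confidence band.

    Cutoffs follow the bands displayed in the AlphaFold database.
    Treating a score exactly on a cutoff as the lower band is an assumption.
    """
    if not 0.0 <= score <= 100.0:
        raise ValueError(f"pLDDT must be between 0 and 100, got {score}")
    if score > 90:
        return "very high"
    if score > 70:
        return "high"
    if score > 50:
        return "low"
    return "very low"


def summarize(scores: list[float]) -> dict[str, float]:
    """Return the fraction of residues that fall in each band."""
    if not scores:
        raise ValueError("need at least one score")
    counts = Counter(plddt_band(s) for s in scores)
    return {band: counts.get(band, 0) / len(scores) for band in BANDS}


# Hypothetical per-residue scores for a four-residue toy example.
example_scores = [95.0, 80.0, 60.0, 40.0]
summary = summarize(example_scores)
print(summary)
```

A summary like this is only a first filter. A structure that is mostly "very high" can still have a low-confidence loop in exactly the region a researcher cares about, which is where the experience mentioned above comes in.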
There is also the open-source question that has shadowed AlphaFold since AlphaFold 3's announcement. The company caused controversy by describing AlphaFold 3 in an academic paper without releasing any of the code at publication time. The weights and model source were eventually made available for non-commercial research in November 2024, but the commercial restriction remains, an important limitation for startups trying to build on the platform. Open-source clones have emerged, including ByteDance's Protenix and the AlQuraishi Laboratory's OpenFold-3, both carrying permissive licenses that allow commercial applications.
AlphaGenome: The Next Frontier
While the scientific community was still digesting what AlphaFold 3 meant for drug discovery, DeepMind released something more ambitious: AlphaGenome.
The vast majority of the human genome, more than 98 percent, consists of DNA that does not build proteins. Once disregarded as "junk DNA," this molecular dark matter is now known to be crucial for determining gene activity in ways that keep us healthy or cause disease. Exactly how that regulatory DNA works has been one of biology's deepest unsolved questions. Small mutations in these non-coding regions can turn tumor suppressor genes off, activate developmental pathways at the wrong time, or cause rare diseases that have puzzled clinicians for decades.
AlphaGenome takes a DNA sequence up to one million base pairs long and predicts how mutations in that stretch affect gene regulation. Its outputs include gene expression, DNA accessibility, histone modifications, transcription factor binding, and even the folding structure of the genome, all with a high level of accuracy. That breadth is what makes AlphaGenome unusual. Previous tools generally required a tradeoff: process a shorter sequence at high resolution, or a longer one at lower resolution. AlphaGenome's architecture resolves that tension, delivering long context and high resolution at the same time.
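The general idea behind sequence-based variant scoring can be sketched as follows: run a predictor on the reference sequence, run it again on the sequence carrying the mutation, and compare the two outputs. The code below is a minimal illustration of that pattern only. `toy_expression_model` is a hypothetical stand-in that counts a single promoter-like motif. It is not AlphaGenome, and none of these function names come from DeepMind's API.

```python
from typing import Callable


def apply_snv(sequence: str, position: int, alt_base: str) -> str:
    """Return a copy of the sequence with one base substituted (0-based position)."""
    if len(alt_base) != 1 or alt_base not in "ACGT":
        raise ValueError(f"alt_base must be one of A, C, G, T, got {alt_base!r}")
    if not 0 <= position < len(sequence):
        raise IndexError(f"position {position} is outside the sequence")
    return sequence[:position] + alt_base + sequence[position + 1:]


def toy_expression_model(sequence: str) -> float:
    """Hypothetical stand-in predictor: counts occurrences of a TATAAA motif."""
    motif = "TATAAA"
    last_start = len(sequence) - len(motif)
    return float(sum(
        1 for i in range(last_start + 1) if sequence[i:i + len(motif)] == motif
    ))


def variant_effect(
    sequence: str,
    position: int,
    alt_base: str,
    predict: Callable[[str], float],
) -> float:
    """Score a variant as prediction(alternate) minus prediction(reference)."""
    return predict(apply_snv(sequence, position, alt_base)) - predict(sequence)


# Toy reference sequence with one TATAAA motif at positions 4-9.
reference = "GGCCTATAAAGGCGCGC"

disrupting = variant_effect(reference, 6, "G", toy_expression_model)  # breaks the motif
neutral = variant_effect(reference, 12, "T", toy_expression_model)    # outside the motif
print(disrupting, neutral)
```

A real model replaces the motif counter with learned predictions across many output types and cell types, but the comparison between reference and alternate sequence has the same shape.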
Whereas other AI models can do some of this analysis for the estimated 2 percent of the genome in protein-coding genes, AlphaGenome is the first to manage the same feat for the full genome. "For the first time, an AI model can predict exactly where and how an RNA variant is expressed directly from a sequence of DNA," said Hani Goodarzi, a genomics AI model builder at the University of California, San Francisco.
The benchmarks are strong. AlphaGenome matches or exceeds the strongest available external models in 25 of 26 evaluations of variant effect prediction. It was published in Nature in January 2026 and is currently available via a free API for non-commercial research use.
How AlphaGenome Works

The architecture builds on lessons from AlphaFold while adapting to the different nature of the problem.
AlphaGenome uses convolutional layers to initially detect short patterns in the genome sequence, transformers to communicate information across all positions in the sequence, and a final series of layers to turn the detected patterns into predictions for different modalities. Training data came from massive public biological databases including ENCODE, GTEx, 4D Nucleome, and FANTOM5, which collectively cover hundreds of human and mouse cell types and tissues.
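The first of those stages, detecting short patterns with a convolution, can be illustrated in a few lines. The sketch below one-hot encodes a DNA string and slides a small filter along it. It is a toy under stated assumptions: the filter is set by hand to match "TATA", whereas a trained model learns many filters from data, and the transformer and output stages are omitted entirely. All names here are illustrative.

```python
BASES = "ACGT"


def one_hot(sequence: str) -> list[list[float]]:
    """Encode each base as a 4-element vector; unknown bases become all zeros."""
    return [[1.0 if base == b else 0.0 for b in BASES] for base in sequence]


def conv1d(encoded: list[list[float]], kernel: list[list[float]]) -> list[float]:
    """Slide a (width x 4) kernel along the sequence with no padding.

    Each output value is the sum of elementwise products between the
    kernel and the window of the sequence it currently covers.
    """
    width = len(kernel)
    activations = []
    for start in range(len(encoded) - width + 1):
        total = 0.0
        for offset in range(width):
            for channel in range(len(BASES)):
                total += encoded[start + offset][channel] * kernel[offset][channel]
        activations.append(total)
    return activations


# Hand-set filter that responds most strongly to "TATA" (assumption for
# illustration; real filters are learned during training).
tata_kernel = one_hot("TATA")

activations = conv1d(one_hot("GGTATACC"), tata_kernel)
best_position = activations.index(max(activations))
print(activations, best_position)
```

The activation peaks where the window matches the filter exactly. In the full architecture described above, the transformer layers would then relate such local detections to one another across the whole sequence.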
A notable technical achievement is the model's ability to handle splice junctions — the points where genes get cut and reassembled into different forms of RNA. Splicing errors underlie a significant share of genetic diseases, and modeling them accurately at scale has been difficult for previous tools. AlphaGenome addresses this directly, making it particularly useful for studying rare Mendelian disorders where a single splicing mutation causes disease.
Training efficiency was also improved: a full AlphaGenome model was trained in just four hours on TPUs, using half the compute budget of DeepMind's earlier Enformer model, thanks to optimized architecture and data pipelines. For an AI model operating at genome scale, that is a meaningful efficiency gain.
What This Means for Drug Development and Medicine

In drug discovery, one of the most persistent problems is understanding why a genetic variant causes disease. A patient with a rare disorder may have thousands of DNA mutations compared to a reference genome. Identifying which mutation is actually causative requires expensive functional experiments, years of follow-up, and often still yields ambiguous results. AlphaGenome ranks the variants most likely to be consequential, allowing researchers to focus their follow-up studies.
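The prioritization step itself is simple once each variant has a predicted effect: sort by the size of the effect and keep a shortlist for follow-up. The sketch below assumes that each variant already carries a single signed score. The variant IDs, the scores, and the `prioritize` helper are all hypothetical.

```python
from typing import NamedTuple


class ScoredVariant(NamedTuple):
    variant_id: str
    effect: float  # signed predicted change; the magnitude is what gets ranked


def prioritize(variants: list[ScoredVariant], top_k: int) -> list[ScoredVariant]:
    """Return the top_k variants with the largest absolute predicted effect."""
    if top_k < 0:
        raise ValueError("top_k must be non-negative")
    ranked = sorted(variants, key=lambda v: abs(v.effect), reverse=True)
    return ranked[:top_k]


# Hypothetical scores for four candidate variants from one patient.
candidates = [
    ScoredVariant("var_A", 0.02),
    ScoredVariant("var_B", -1.40),
    ScoredVariant("var_C", 0.65),
    ScoredVariant("var_D", -0.10),
]

shortlist = prioritize(candidates, top_k=2)
print([v.variant_id for v in shortlist])
```

Ranking by absolute value keeps both strong increases and strong decreases in view. A low rank is not evidence that a variant is harmless, so the shortlist is a starting point for experiments rather than a verdict.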
For cancer research specifically, researchers have used AlphaGenome to pinpoint the mutations in cancer genomes that drive the proliferation of cancer. That kind of prioritization — narrowing thousands of variants to a tractable set for wet-lab validation — is where AI can create immediate, practical value without needing to fully solve the underlying biology.
Beyond rare disease and cancer, the genomics applications extend to cardiovascular medicine, neurological conditions, and aging biology. AlphaMissense and AlphaGenome use AI to assess the genetic mutations that underpin disease. The AlphaProteo model can design novel, high-strength protein binders that target diverse molecules, including those associated with cancer and diabetes. Together these tools are forming something closer to a coherent platform for computational biology than a collection of one-off models.
For pharmaceutical companies, the investment thesis is straightforward: if AI can reliably narrow the search space for therapeutic targets from millions of possibilities to thousands, and if drug design tools can model how a candidate molecule binds to its target before synthesis, the cost and timeline of drug development could compress dramatically. That is a multibillion-dollar value proposition, which is why Isomorphic Labs has attracted serious pharmaceutical partnership interest despite being only a few years old.
Risks, Limitations, and Open Questions

AlphaGenome is powerful, but its developers are explicit about what it cannot do.
The tool might predict that a given DNA variant has no effect on gene expression when in fact it does. Predicting how a disease manifests from the genome is an extremely hard problem, and this model is not able to magically predict that. False negatives — variants the model considers benign that are actually pathogenic — remain a real risk if clinicians treat model outputs as ground truth rather than as a filter for follow-up investigation.
The model was also trained on human and mouse genomes only. AlphaGenome is not applicable to other species yet. For researchers studying model organisms like zebrafish or fruit flies, or for agricultural applications, the tool has limited direct utility in its current form.
There are broader questions about data access and commercialization. The same tension that surrounded AlphaFold 3's licensing is present here. DeepMind provides AlphaGenome free for non-commercial research, but the path to therapeutic applications — the ones where it would have the greatest impact — runs through licensing arrangements that haven't been fully disclosed. The scientific community learned with AlphaFold that DeepMind's definition of "open" can shift depending on commercial considerations.
There are also legitimate concerns about privacy and misuse in a world where genomic data is becoming cheaper and more accessible. A model that can predict gene expression consequences from raw DNA sequence is a powerful tool in the right hands. Questions about who gets access, under what conditions, and with what safeguards are not merely regulatory formalities.
What Jumper Is Thinking About Next
John Jumper, now a director at Google DeepMind, is careful about what he predicts publicly. But the direction of his thinking is telling.
He says he will be shocked if there is not more and more LLM impact on science. "I think that's the exciting open question that I'll say almost nothing about," he adds.
DeepMind has already built AlphaEvolve, a system that uses a large language model to generate candidate solutions to scientific problems and a secondary model to evaluate them. Researchers have used it to make advances in mathematics and computer science. An AI system that can propose and evaluate scientific hypotheses, rather than only run predictions, is where the frontier appears to be heading.
Jumper said what excites him is the idea of using the power of LLMs to develop new hypotheses and design novel experiments to test them. DeepMind has created a prototype AI scientist based on Gemini that can do some of this. But Jumper thinks the concept has much more potential. "The really exciting dataset and the really big dataset is the entirety of the scientific literature," he said.
That vision of an AI that reads everything, generates hypotheses, designs experiments, and interprets results is still mostly aspirational. But before 2020, so was accurate protein structure prediction. What the past five years showed is that when the right architecture meets the right data and the right problem, progress can be faster than almost anyone expected.
Future Outlook
DeepMind views AlphaFold as the template for how AI can accelerate all of science to digital speed. From fusion and Earth sciences to scientific discovery as a whole, the company is pursuing the next AlphaFold-like breakthroughs.
That ambition is credible in a way it might not have been before 2020. AlphaFold demonstrated that a hard, well-defined scientific problem with a clear benchmark and sufficient training data is vulnerable to deep learning in a way that decades of traditional research were not. The question for the next five years is which problems have that structure — and whether DeepMind, or someone else, can identify them before the competition does.
The bets being placed right now are on genomic interpretation, RNA biology, cell-scale simulation, and what DeepMind calls "digital biology" — a world where the full molecular state of a cell can be modeled computationally. None of these are solved problems. All of them are getting significantly harder to ignore.
AlphaFold at five is not a story with a clean ending. It is the beginning of a longer story about what happens when AI gets genuinely good at science. And if the first chapter is any guide, the next one is going to move faster than anyone predicted.
Frequently Asked Questions
What is AlphaFold and why is it important?
AlphaFold is an AI system developed by Google DeepMind that predicts the three-dimensional structure of proteins from their amino acid sequences. It is important because protein structure determines protein function, and understanding that structure is critical for drug discovery, disease research, and biotechnology. Before AlphaFold, determining a protein's structure experimentally could take a year or more and cost roughly $100,000. AlphaFold can produce a prediction in minutes.
Why did AlphaFold win a Nobel Prize?
In 2024, Google DeepMind researchers Demis Hassabis and John Jumper shared one half of the Nobel Prize in Chemistry for developing AlphaFold, which solved the protein structure prediction problem, a challenge that had stumped scientists for 50 years. The prize committee recognized the system's transformative impact on biology and medicine. David Baker of the University of Washington received the other half of the prize for his work on computational protein design.
What is AlphaFold 3 and how is it different from AlphaFold 2?
AlphaFold 2, released in 2020, focused on predicting the structures of individual proteins. AlphaFold 3, released in 2024, extended this capability to model all of life's major molecules together, including DNA, RNA, and small-molecule ligands. This makes it substantially more useful for drug discovery, where the relevant question is often how a drug molecule interacts with a protein target rather than what the protein looks like in isolation.
What is AlphaGenome and what does it do?
AlphaGenome is a new AI model from Google DeepMind, published in Nature in January 2026, that predicts how genetic mutations affect gene expression across the full human genome. Unlike earlier tools that focused only on protein-coding DNA, AlphaGenome can analyze up to one million DNA base pairs at once and predict effects across the entire genome, including the 98 percent that does not code for proteins. It is currently available free for non-commercial research via an API.
How does AlphaFold compare to competing tools like ESMFold and RoseTTAFold?
AlphaFold 2 remains the most widely cited and validated protein structure prediction tool. ESMFold from Meta is faster at inference and uses a fully open license, making it preferable for some commercial applications. RoseTTAFold, from David Baker's lab, has its own strengths and a broader license than AlphaFold 3. For AlphaFold 3 specifically, open-source alternatives like ByteDance's Protenix have emerged with more permissive licensing for commercial use.
How is AlphaFold being used in drug discovery?
AlphaFold is being used to identify potential drug targets by revealing the structures of proteins associated with disease, to model how drug candidates bind to those proteins, and to design novel proteins with therapeutic properties. Isomorphic Labs, a spinout from Google DeepMind, has built a drug design platform on top of AlphaFold 3 and has active partnerships with major pharmaceutical companies. Academic researchers are also using it to accelerate cancer research, rare disease investigation, and vaccine development.
What are the limitations of AlphaFold?
AlphaFold is less reliable when predicting the structures of protein complexes (multiple proteins interacting together) compared to single proteins. It also struggles with highly dynamic proteins that shift shape depending on cellular context. Like any AI model, it can produce confident-looking predictions that are incorrect, and its confidence scores require scientific experience to interpret properly. AlphaFold 3's commercial licensing restrictions also limit its accessibility for startups and commercial applications.
What comes after AlphaFold in AI-driven biology?
The current focus at Google DeepMind extends to full genomic interpretation (AlphaGenome), protein binder design (AlphaProteo), and AI systems capable of generating and evaluating scientific hypotheses. The longer-term ambition is something closer to a computational cell model — an AI that can simulate the molecular state of a cell accurately enough to predict what happens when you intervene in it. Whether that proves achievable within the next five years is the central question in computational biology right now.