AlphaFold: a gargantuan leap


Cells are considered the basic unit of life. Inside every cell in the body, billions of tiny molecular machines are hard at work: proteins. A protein is made up of a sequence of amino acids, and an average protein has about 300 amino acid residues. Proteins form the structural and motor elements of the cell and underlie every biochemical reaction that occurs in living things.

If we consider that there are twenty different amino acids, the number of possible protein sequences is astronomically high. Of these, by the most conservative estimates, the human body synthesizes at least 30,000 different kinds of proteins.
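To get a feel for the size of that number, here is a minimal sketch that counts the possible sequences for a chain of 300 residues (the average length mentioned earlier) built from 20 amino acids. The variable names are illustrative, and the count ignores whether a given sequence would actually fold or occur in nature.

```python
# Count the possible sequences for a protein of a given length,
# assuming each position can hold any of the 20 amino acids.
NUM_AMINO_ACIDS = 20
AVERAGE_LENGTH = 300  # approximate average protein length cited in the text

# Python integers have arbitrary precision, so this is computed exactly.
sequence_count = NUM_AMINO_ACIDS ** AVERAGE_LENGTH
digit_count = len(str(sequence_count))

print(f"20^{AVERAGE_LENGTH} has {digit_count} decimal digits")
```

The result is a number with 391 digits, so the 30,000 or so proteins the human body makes are a vanishingly small sample of what is combinatorially possible.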

Currently, about 200 million proteins are known, and millions of new ones are found each year. Each one of them expertly performs a specific task. Some are structural, lending stiffness and rigidity to muscle cells or long thin neurons. Others bind to specific molecules and shuttle them to new locations, and still others catalyse reactions that allow cells to divide and grow. This wealth of diversity and specificity in function is made possible by a seemingly simple property of proteins: they fold.

The protein-folding problem has, broadly, three parts:

1) How is the 3D native structure of a protein determined by the physicochemical properties encoded in its 1D sequence of amino acids?

2) How can proteins fold so fast despite having an unfathomable number of possible conformations?

3) Can we devise a computer algorithm to predict a protein’s native structure from a given amino acid sequence? Such an algorithm might get around the time-consuming process of experimental protein structure determination.

The second part of the problem was stated in 1968 by Cyrus Levinthal: how can a protein molecule fold so quickly to its one precisely defined, low-free-energy native state despite the huge number of conformations accessible to it? How does the protein know which conformations not to search? This is the famous Levinthal’s paradox.
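A back-of-the-envelope calculation shows why this is a paradox. The numbers below are illustrative assumptions of the kind commonly used for this estimate, not measurements: a 100-residue chain, three possible conformations per residue, and a sampling rate of 10^13 conformations per second.

```python
# Back-of-the-envelope Levinthal estimate. All inputs are illustrative assumptions.
RESIDUES = 100                 # assumed chain length
STATES_PER_RESIDUE = 3         # assumed conformations available to each residue
SAMPLES_PER_SECOND = 1e13      # assumed rate at which conformations are tried
SECONDS_PER_YEAR = 365.25 * 24 * 3600

# Total conformations if every residue varies independently.
conformations = STATES_PER_RESIDUE ** RESIDUES

# Time needed to visit each conformation once by exhaustive search.
years_to_search = conformations / SAMPLES_PER_SECOND / SECONDS_PER_YEAR

print(f"conformations: {conformations:.2e}")
print(f"exhaustive search: {years_to_search:.2e} years")
```

Under these assumptions an exhaustive random search would take on the order of 10^27 years, vastly longer than the age of the universe, so a folding protein cannot be sampling its conformations at random.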

The general solution to this needle-in-a-haystack conundrum comes from polymer statistical thermodynamics. Studies of chain entropies in models of foldable polymers showed that more compact, low-energy conformational ensembles contain fewer conformations, indicating that protein-folding energy landscapes are funnel-shaped.

Funnel-shaped energy landscape

The third part of the problem, computer-based prediction, gathered pace after Moult and colleagues founded the Critical Assessment of protein Structure Prediction (CASP) in 1994. Held every second summer, CASP is a community-wide blind competition in which typically more than 100 different “target sequences” (of proteins whose structures are known but not yet publicly available) are made available to a community of more than 150 research groups around the world.

This challenge inspired DeepMind, a London-based subsidiary of Alphabet Inc., which has been applying an artificial intelligence system known as AlphaFold to predict the 3D structure of proteins. DeepMind has been working on the problem since 2016 and has improved the model from one CASP round to the next, as measured by the CASP evaluation metric known as the Global Distance Test (GDT) score.
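To make the GDT score more concrete, here is a simplified sketch of its most common variant, GDT_TS: the average, over distance cutoffs of 1, 2, 4 and 8 Å, of the percentage of residues whose predicted position lies within that cutoff of the experimental position. This sketch assumes the two structures have already been superimposed, whereas the real metric searches over many superpositions; the function name and the toy coordinates are hypothetical.

```python
import math

# GDT_TS distance cutoffs in angstroms.
CUTOFFS = (1.0, 2.0, 4.0, 8.0)

def gdt_ts_simplified(predicted, experimental):
    """Simplified GDT_TS for two already-superimposed C-alpha traces.

    Each argument is a list of (x, y, z) tuples, one per residue, in the same order.
    Returns a score between 0 and 100.
    """
    if len(predicted) != len(experimental) or not predicted:
        raise ValueError("need two non-empty coordinate lists of equal length")
    distances = [math.dist(p, e) for p, e in zip(predicted, experimental)]
    # Fraction of residues within each cutoff, then averaged over the cutoffs.
    fractions = [sum(d <= c for d in distances) / len(distances) for c in CUTOFFS]
    return 100.0 * sum(fractions) / len(CUTOFFS)

# Hypothetical toy data: four residues displaced by 0.5, 1.5, 3.0 and 10.0 angstroms.
experimental = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]
predicted = [(0.5, 0.0, 0.0), (3.8, 1.5, 0.0), (7.6, 0.0, 3.0), (21.4, 0.0, 0.0)]

score = gdt_ts_simplified(predicted, experimental)
print(f"simplified GDT_TS: {score}")
```

A perfect prediction scores 100, and a prediction in which no residue falls within 8 Å of its true position scores 0.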

Among the teams that participated in CASP13 (2018), AlphaFold placed first in the protein structure prediction challenge. At CASP14 in 2020, DeepMind presented a new version of AlphaFold that reached a level of accuracy widely considered to solve the protein structure prediction problem.

Is it a big deal? Yes, of course. The AlphaFold AI system is transforming molecular biology and will continue to do so. “This is a big deal,” said John Moult, a co-founder of CASP who has himself worked on this problem.

Is AlphaFold accelerating research in complementary fields?

The ability to accurately predict protein structures from their amino-acid sequence would be a huge boon to life sciences and medicine. It would vastly accelerate efforts to understand the building blocks of cells and enable quicker and more advanced drug discovery.

In the case of neurodegenerative disorders, the discovery two decades ago of what drives them changed the field: all of them, including Alzheimer’s, Parkinson’s, Huntington’s and amyotrophic lateral sclerosis (ALS, or Lou Gehrig’s disease), involve the accumulation of misfolded proteins in brain cells. AlphaFold could help researchers study what drives this misfolding, and many more fields stand to be transformed by this AI system.

I hope that this blog has inspired you to explore this field and the AlphaFold AI system further. Please feel free to leave your thoughts, feedback, or suggestions in the comments below. Thank you!



