Structural biology has long been a cornerstone of biological research, providing insights into the intricate three-dimensional (3D) structures of biological macromolecules, such as proteins and nucleic acids. This understanding is crucial for unraveling the fundamental mechanisms that govern life. Traditional experimental techniques like X-ray crystallography, NMR spectroscopy, and cryo-electron microscopy have been invaluable in this quest. However, they often demand extensive resources in terms of labor, time, and finances, limiting their applicability to specific molecules or complexes. In recent years, the integration of artificial intelligence (AI), particularly deep learning (DL) methods, has breathed new life into structural biology by offering innovative computational solutions to these limitations.

Protein Folding: A Complex Conundrum

Protein folding, the process by which a linear sequence of amino acids spontaneously adopts its 3D structure, is a central puzzle in structural biology. It holds the key to understanding how proteins carry out their diverse functions in living organisms. Although proteins are built from only 20 standard amino acids, the vast number of ways these can be ordered and arranged gives rise to an astonishing array of protein structures. Predicting how a protein will fold based solely on its amino acid sequence is an ambitious goal that remains elusive in the general case.

The challenges in predicting protein structures stem from the complexity of amino acid interactions. Even a minor change in the sequence can drastically alter the protein’s final structure or abolish its function. Conversely, substituting one amino acid for a chemically similar one often produces only subtle structural differences. Moreover, the sheer number of conformations made possible by bond rotations along the chain makes an exhaustive search impractical. In grappling with these challenges, scientists have recognized the need to move beyond traditional thermodynamic hypotheses and to account for the non-equilibrium, active nature of proteins in biological contexts.
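To give a sense of scale for the number of possible conformations, the following back-of-the-envelope sketch counts them under deliberately crude assumptions that are not taken from this article: each residue independently adopts one of three backbone states, the chain has 100 residues, and 10^13 conformations can be sampled per second. The function names are hypothetical and the numbers are illustrative only.

```python
# Back-of-the-envelope estimate of the conformational search space.
# All parameters are illustrative assumptions, not measured values.

SECONDS_PER_YEAR = 365.25 * 24 * 3600


def conformation_count(n_residues: int, states_per_residue: int = 3) -> int:
    """Conformations if every residue independently picks one of a few states."""
    return states_per_residue ** n_residues


def exhaustive_search_years(
    n_residues: int,
    states_per_residue: int = 3,
    samples_per_second: float = 1e13,
) -> float:
    """Years needed to visit every conformation once at a fixed sampling rate."""
    total = conformation_count(n_residues, states_per_residue)
    return total / samples_per_second / SECONDS_PER_YEAR


if __name__ == "__main__":
    n = 100  # assumed chain length
    print(f"{conformation_count(n):.2e} conformations")
    print(f"{exhaustive_search_years(n):.2e} years to enumerate them all")
```

Under these assumptions the count is 3^100, about 5 x 10^47 conformations, and enumerating them would take on the order of 10^27 years. Real proteins fold far faster than that, which is why exhaustive enumeration cannot be how folding, or its prediction, works.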

Homology Modeling: Bridging the Gap

Homology modeling, also known as comparative modeling, is a well-established strategy for predicting the structure of a target protein using experimentally determined structures of closely related proteins as templates. The principle behind this technique is that proteins with similar sequences tend to adopt similar structures. The process involves template selection, sequence alignment, model generation, and subsequent optimization and validation. The accuracy of a homology model depends chiefly on the sequence similarity between target and template, but strategies such as using multiple templates, energy minimization, and loop modeling can improve precision.
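The template selection step described above can be sketched as ranking candidate templates by percent sequence identity to the target. This is a minimal illustration, not the procedure of any particular tool: it assumes the pairwise alignments have already been computed elsewhere, the template names and sequences are made up, and identity is calculated over aligned, non-gap positions only (other definitions of the denominator exist).

```python
# Rank candidate templates by percent identity to the target.
# Assumes pre-computed pairwise alignments; '-' marks a gap.


def percent_identity(aligned_target: str, aligned_template: str) -> float:
    """Percent of identical residues over positions where neither side is a gap."""
    if len(aligned_target) != len(aligned_template):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_target, aligned_template):
        if a == "-" or b == "-":
            continue  # skip gapped columns
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


def rank_templates(alignments: dict[str, tuple[str, str]]) -> list[tuple[str, float]]:
    """Return (template_id, identity) pairs, best template first."""
    scored = [
        (template_id, percent_identity(target, template))
        for template_id, (target, template) in alignments.items()
    ]
    return sorted(scored, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical toy alignments: (aligned target, aligned template).
    toy_alignments = {
        "tmplA": ("MKT-AYIAK", "MKTQAYLAK"),
        "tmplB": ("MKTAYIAK", "MRSAYVGK"),
    }
    for template_id, identity in rank_templates(toy_alignments):
        print(f"{template_id}: {identity:.1f}% identity")
```

In practice, template choice also weighs alignment coverage, structure resolution, and bound ligands, so identity alone should be read as a first filter rather than a complete criterion.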

Biomolecular Structure Prediction Meets AI

The volume of protein structural data has grown rapidly in recent years, thanks in part to repositories like the Protein Data Bank (PDB), which held roughly 200,000 experimentally determined 3D structures of biological macromolecules by the end of 2022. The UniProt Knowledgebase (UniProtKB) has further enriched our understanding by providing comprehensive protein sequences and functional annotations. This surge in data has paved the way for AI-powered approaches to take center stage.

MODELLER, first described in 1993, is a pioneering program that uses known structures to generate homology models. SWISS-MODEL, another widely used tool, employs QMEANDisCo for model quality estimation. I-TASSER, which combines threading, ab initio modeling, and structure assembly, has garnered attention by ranking highly in international structure prediction competitions.

Collaborative efforts like the Critical Assessment of Protein Structure Prediction (CASP) have provided a valuable platform for evaluating AI-based biomolecular structure prediction research. DeepMind’s AlphaFold, for instance, introduced deep learning methods to enhance traditional fragment assembly techniques, resulting in remarkable improvements in structure prediction accuracy. AlphaFold2 raised the bar further with the Evoformer, a novel neural network block that jointly updates representations of the multiple sequence alignment (MSA) and of residue pairs, allowing it to capture intricate spatial relationships. trRosetta combined convolutional neural networks with the Rosetta modeling framework, predicting inter-residue distances and orientations from protein sequences and MSAs and using them as restraints to generate protein structure models.
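Inter-residue distances, mentioned above as a prediction target, can be made concrete with a small sketch that computes a distance matrix and a binary contact map from known coordinates. This illustrates the quantity such networks are trained to predict, not the networks themselves. The toy coordinates, the 8 Å cutoff, and the minimum sequence separation are assumptions chosen for illustration.

```python
# Distance matrix and contact map from per-residue 3D coordinates.
# Coordinates, cutoff and separation below are illustrative assumptions.
import math

Coord = tuple[float, float, float]


def distance_matrix(coords: list[Coord]) -> list[list[float]]:
    """Pairwise Euclidean distances (in the same units as the coordinates)."""
    n = len(coords)
    return [[math.dist(coords[i], coords[j]) for j in range(n)] for i in range(n)]


def contact_map(
    distances: list[list[float]],
    threshold: float = 8.0,
    min_separation: int = 2,
) -> list[list[bool]]:
    """True where two residues are close in space but not adjacent in sequence."""
    n = len(distances)
    return [
        [
            abs(i - j) >= min_separation and distances[i][j] <= threshold
            for j in range(n)
        ]
        for i in range(n)
    ]


if __name__ == "__main__":
    # Toy chain: four residues on a straight line, 3.8 Å apart.
    toy_coords = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]
    dist = distance_matrix(toy_coords)
    contacts = contact_map(dist)
    for row in contacts:
        print("".join("#" if c else "." for c in row))
```

A predicted distance or contact map of this shape is what downstream steps turn back into 3D coordinates, for example by treating the predicted distances as restraints during model building.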

Advancing Structural Biology with AI

Beyond individual protein structures, AI has extended its reach to predict the structures of protein complexes. DeepMind’s AlphaFold-Multimer tackles this challenge, although its accuracy can suffer when coevolution patterns between chains are weak, notably for some heterodimeric complexes. Approaches such as ESMFold, which is built on a protein language model and does not require an MSA, are also being explored to enhance the accuracy of complex predictions.

The epigenetic dimension of protein structure (EDPS) represents another frontier for AI in structural biology. Membrane proteins, in particular, pose challenges, with template-based modeling often outperforming neural network-based approaches due to the influence of lipid bilayers on structure formation. Addressing these challenges requires further research to improve EDPS prediction, especially for membrane proteins lacking suitable templates.

In conclusion, the integration of AI into structural biology research has ushered in a new era of possibilities. While challenges persist, AI’s adaptability and evolving methodologies hold promise in overcoming these obstacles. Tools like AlphaFold and ESMFold represent innovative solutions. As AI technologies continue to advance, they are poised to play a pivotal role in decoding the mysteries of protein structures, revolutionizing structure-based drug discovery, and driving biological research into uncharted territory.

Engr. Dex Marco Tiu Guibelondo, B.Sc. Pharm, R.Ph., B.Sc. CpE

Editor-in-Chief, PharmaFEATURES
