De novo molecular design, also known as generative chemistry, sits at the intersection of artificial intelligence and chemistry. It is an automated process that proposes novel chemical structures intended to elicit a specific biological response while satisfying pharmacokinetic constraints. The growing popularity of generative models in AI has renewed interest in the approach and is changing how early-stage drug discovery is carried out.

Molecular Design

Measuring how well automated structure generation performs requires standardized benchmark suites. Simple metrics such as drug-likeness do not capture the complexity of real-world drug discovery. Experimental validation, exemplified by the synthesis and testing of novel inhibitors of cyclin-dependent kinase 2 (CDK2), provides tangible but anecdotal evidence. Benchmark suites such as MOSES and GuacaMol offer a broader evaluation that covers distribution-learning tasks and goal-directed tasks. They are invaluable, but the open challenge is to evolve these benchmarks so that they mirror the complexity of real-world use cases.
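To make the idea of a distribution-learning task concrete, the sketch below computes three metrics commonly reported for generated molecules: validity, uniqueness, and novelty. It is a minimal illustration rather than the MOSES or GuacaMol implementation. The function names are hypothetical, the validity check is a toy placeholder (balanced parentheses only), and the code assumes all strings have already been canonicalized so that string equality stands in for molecular identity.

```python
from typing import Callable, Dict, Iterable


def distribution_metrics(
    generated: Iterable[str],
    training_set: Iterable[str],
    is_valid: Callable[[str], bool],
) -> Dict[str, float]:
    """Validity, uniqueness and novelty for a batch of generated strings.

    Assumption: inputs are canonical, so equal molecules have equal strings.
    """
    generated = list(generated)
    valid = [s for s in generated if is_valid(s)]
    unique = set(valid)
    novel = unique - set(training_set)
    return {
        # Fraction of generated strings that pass the validity check.
        "validity": len(valid) / len(generated) if generated else 0.0,
        # Fraction of valid strings that are distinct.
        "uniqueness": len(unique) / len(valid) if valid else 0.0,
        # Fraction of distinct valid strings not seen in the training set.
        "novelty": len(novel) / len(unique) if unique else 0.0,
    }


def toy_is_valid(smiles: str) -> bool:
    """Toy placeholder: non-empty with balanced parentheses.

    A real pipeline would parse the string with a cheminformatics toolkit.
    """
    depth = 0
    for ch in smiles:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:
                return False
    return bool(smiles) and depth == 0


if __name__ == "__main__":
    sample = ["CCO", "CCO", "CC(C)O", "CC(O", "CCN"]
    print(distribution_metrics(sample, ["CCO"], toy_is_valid))
```

In this toy batch, four of the five strings pass the check, three of those four are distinct, and two of those three are absent from the training set. Goal-directed tasks differ in that they score each molecule against a target property instead of comparing whole sets.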

As computational methods evolve, the choice of molecular representation becomes central to how chemical structures are generated and evaluated. Text-based representations, typified by SMILES, and graph-based representations each have their own strengths. SMILES is the most common choice, but a single molecule can be written as many different SMILES strings, and small changes to a string can make it invalid. Adaptations such as DeepSMILES and SELFIES were developed to make string representations better suited to machine learning. The granularity of the representation also matters: it can range from atom-level encoding to coarser representations built from functional groups or reactions, and choosing a level means balancing precision against practicality.
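The non-uniqueness of SMILES can be seen even in a deliberately tiny case. The sketch below handles only unbranched chains of the single-letter atoms C, N, O and S, where the same molecule can be written from either end. The function `canonical_chain` is hypothetical and is not a general canonicalization algorithm; real toolkits canonicalize arbitrary molecular graphs, including rings, branches and charges.

```python
ALLOWED_ATOMS = set("CNOS")  # toy alphabet: single-letter atoms only


def canonical_chain(smiles: str) -> str:
    """Pick one spelling for an unbranched chain written from either end.

    Assumption: the input has no rings, branches, bond symbols or brackets.
    """
    if not smiles or any(ch not in ALLOWED_ATOMS for ch in smiles):
        raise ValueError("only unbranched chains of C, N, O, S are supported")
    # Reading the chain backwards denotes the same molecule, so keep
    # whichever spelling sorts first.
    return min(smiles, smiles[::-1])


if __name__ == "__main__":
    # "CCO" and "OCC" both describe ethanol.
    print(canonical_chain("CCO"), canonical_chain("OCC"))
```

Without a step like this, a benchmark that compares strings directly would count "CCO" and "OCC" as two different molecules.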

Gradient-Free Molecular Optimization

The graph-based genetic algorithm (GB-GA) and ChemGE are representative atom-based methods for de novo design. GB-GA uses reaction SMARTS for mutation and crossover, whereas ChemGE applies grammatical evolution to SMILES. Molecular Swarm Optimization (MSO) takes a particle swarm approach and can locate desirable regions in a continuous embedding space. Maintaining structural diversity within the population remains a challenge, which methods such as MolFinder and graph-based elite patch illumination (GB-EPI) are designed to address.
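The loop shared by genetic-algorithm methods (select, cross over, mutate, repeat) can be sketched with the standard library alone. This is a toy under stated assumptions, not GB-GA or ChemGE: individuals are fixed-length strings over a three-letter alphabet instead of molecular graphs or grammars, and `toy_score` is a hypothetical stand-in for a property predictor or docking score.

```python
import random
from typing import Callable, List, Tuple

ALPHABET = "CNO"  # toy symbols, not a real chemical grammar


def toy_score(individual: str) -> float:
    """Hypothetical objective: reward the number of 'O' symbols."""
    return float(individual.count("O"))


def crossover(a: str, b: str, rng: random.Random) -> str:
    """Single-point crossover between two equal-length parents."""
    cut = rng.randrange(1, len(a))
    return a[:cut] + b[cut:]


def mutate(s: str, rng: random.Random) -> str:
    """Replace one randomly chosen position with a random symbol."""
    i = rng.randrange(len(s))
    return s[:i] + rng.choice(ALPHABET) + s[i + 1:]


def evolve(
    score: Callable[[str], float],
    length: int = 12,
    pop_size: int = 20,
    generations: int = 30,
    seed: int = 0,
) -> Tuple[str, List[float]]:
    """Run the loop; return the best individual and the best score per generation."""
    rng = random.Random(seed)
    population = [
        "".join(rng.choice(ALPHABET) for _ in range(length))
        for _ in range(pop_size)
    ]
    history: List[float] = []
    for _ in range(generations):
        population.sort(key=score, reverse=True)
        history.append(score(population[0]))
        # Truncation selection: the top half survives unchanged (elitism).
        parents = population[: pop_size // 2]
        children: List[str] = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            children.append(mutate(crossover(a, b, rng), rng))
        population = parents + children
    return max(population, key=score), history


if __name__ == "__main__":
    best, history = evolve(toy_score)
    print(best, history[0], history[-1])
```

Because the top half of each generation is carried over unchanged, the best score never decreases. The same pressure tends to fill the population with near-copies of the current best, which is the diversity problem that the methods mentioned above set out to address.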

Fragment-based approaches offer a more constrained but effective route to de novo design. MOARF uses retrosynthetic disconnection rules, and the CReM framework applies chemically reasonable mutations. These fragment-based methods perform comparably to their atom-based counterparts, which illustrates the range of workable strategies within de novo design.

Reaction-based strategies are the most directly practical, because they build molecules by applying forward reactions in silico. The approach was pioneered by SYNOPSIS in 2003, and more recent developments such as AutoGrow4 and the reaction class recommender by Ghiandoni et al. have refined it. By working through known chemical reactions, these methods show both the potential and the difficulty of designing molecules through iterative virtual synthesis.

Gradient-Based Molecular Optimization

With the rise of deep learning, gradient-based molecular optimization has become a prominent approach. Variational autoencoders (VAEs), generative adversarial networks (GANs), and recurrent neural networks (RNNs) are all used to learn how to generate molecular structures.

Atom-based generative models predominantly operate on SMILES and use deep learning architectures such as RNNs. This lets a model learn the grammar and syntax of valid SMILES, and more recent work adds reinforcement learning to steer generation toward molecules with desired properties. Graph-based models such as GraphVAE and MolGAN offer an alternative to SMILES-based methods by working directly with the topology of the molecular graph.

Atom-based models are flexible, but fragment-based approaches that use reduced graph representations have advantages of their own. JT-VAE, for example, first constructs a junction tree and then decodes it into the final molecular structure. DeepFMPO takes fragment similarity into account when modifying molecules.

Reaction-based generative models such as DINGOS and Molecule Chef combine machine learning with rule-based methods. Reinforcement learning has also been applied to reaction-based design, as in REACTOR and policy gradient for forward synthesis (PGFS). Combining learned reaction schemas with general-purpose optimizers is a promising direction for these models.

Unveiling New Frontiers

Across every level of coarseness in molecular representation, the convergence of AI methods and chemistry is reshaping de novo molecular design. Progress depends not only on assessing individual methods but also on building benchmarks that align with real-world drug discovery needs. Molecular representation, which is central to this effort, continues to evolve to meet the demands of generative models. Each advance brings the field closer to realizing the potential of de novo molecular design to deliver novel compounds and change how drugs are discovered.

Study DOI: 10.1016/j.drudis.2021.05.019

Engr. Dex Marco Tiu Guibelondo, B.Sc. Pharm, R.Ph., B.Sc. CpE

Editor-in-Chief, PharmaFEATURES
