Significant recent investment in computational technology has produced a number of innovations in drug discovery, most notably machine learning (ML). By 2022, AI technology is expected to contribute $2.199 billion to pharma’s revenue, and its popularity is growing across the pharmaceutical industry. Target identification, target validation and drug discovery are among the areas of drug development in which ML has shown its potential.

Introduction

The pipeline from drug discovery to development to approval is a complex and lengthy process. However, ML is beginning to drive innovation at every stage of drug development. Target validation, identification of prognostic biomarkers and analysis of digital pathology data in clinical trials are some of the areas in which ML can be implemented.

There are two main approaches to ML: supervised and unsupervised learning. Unsupervised learning algorithms learn patterns from unlabeled (untagged) data. Supervised learning algorithms, on the other hand, learn from labeled training data, which consists of a set of training examples that each pair an input with its known output.

Supervised learning methods are used to predict categories (classification) or continuous values (regression). Unsupervised learning is primarily used for exploratory purposes, building models that cluster data in ways not specified by the user. It helps to identify hidden patterns within input data, whereas supervised learning predicts outputs for new inputs using a model trained on known input and output data. According to a 2020 review, supervised learning techniques such as support vector machines, deep learning and regression methods have been applied to biomedical challenges over the last decade.
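To make the distinction concrete, the following is a minimal Python sketch, not a method taken from any of the studies discussed here. It assumes a toy set of one-dimensional "assay readings" with invented values. The supervised part (a nearest-centroid classifier) uses the labels; the unsupervised part (a two-cluster k-means) sees only the readings. All function and variable names are hypothetical.

```python
# Toy 1-D "assay readings"; all values and labels are invented for illustration.
readings = [0.9, 1.1, 1.0, 4.8, 5.2, 5.0]
labels = ["inactive", "inactive", "inactive", "active", "active", "active"]


def fit_centroids(xs, ys):
    """Supervised: use the known labels to compute one mean (centroid) per class."""
    groups = {}
    for x, y in zip(xs, ys):
        groups.setdefault(y, []).append(x)
    return {y: sum(v) / len(v) for y, v in groups.items()}


def predict(centroids, x):
    """Assign a new reading to the class whose centroid is nearest."""
    return min(centroids, key=lambda y: abs(centroids[y] - x))


def two_means(xs, iterations=10):
    """Unsupervised: split unlabeled readings into two clusters (1-D k-means, k=2)."""
    centres = [min(xs), max(xs)]  # simple deterministic initialisation
    clusters = [[], []]
    for _ in range(iterations):
        clusters = [[], []]
        for x in xs:
            nearest = 0 if abs(x - centres[0]) <= abs(x - centres[1]) else 1
            clusters[nearest].append(x)
        # Move each centre to the mean of its cluster (keep it if the cluster is empty).
        centres = [sum(c) / len(c) if c else centres[i] for i, c in enumerate(clusters)]
    return clusters


centroids = fit_centroids(readings, labels)   # learns from labeled examples
prediction = predict(centroids, 4.5)          # predicts a label for a new reading
clusters = two_means(readings)                # finds groups without any labels
print(prediction, clusters)
```

The supervised model can name the class of a new reading because it was shown labeled examples, while the clustering step only reports that two groups exist and leaves their interpretation to the user.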

Applications in drug development

Target identification and validation

The identification and validation of a therapeutic target requires the analysis of vast datasets. Genetic screening and high-content imaging are examples of techniques that produce large datasets that can be exploited for early target identification and validation. However, analysing such data requires appropriate mathematical methods to construct valid statistical models, and this is where ML can be exploited.

As early as 2010, ML was applied to target validation in the form of a “decision tree-based meta classifier”. In this study, the ML platform was proposed as a computational approach to predicting morbid and druggable genes. Morbid genes are genes in which mutations are associated with hereditary human disease. The tree-based meta-classifier was used to predict targets on a genome-wide scale. It managed to correctly recover “65% of known morbid genes with a precision of 66% and correctly recovered 78% of known druggable genes with a precision of 75%”. Here, the recovery rate (recall) is the share of known genes that the model found, while precision is the share of the model's predictions that were correct. The ability of ML to predict specific genes on a genome-wide scale is a major step forward in optimising target identification. Predicting therapeutic targets computationally saves pharma companies time and resources, and the mathematical approach could potentially yield more reliable targets.
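As a worked illustration of how recovery (recall) and precision are calculated, here is a short Python sketch. The gene counts are hypothetical, chosen only so that the results land near the 65% and 66% figures quoted above; they are not taken from the study.

```python
def precision(true_positives, false_positives):
    """Share of predicted genes that really belong to the class."""
    return true_positives / (true_positives + false_positives)


def recall(true_positives, false_negatives):
    """Share of known genes in the class that the model recovered."""
    return true_positives / (true_positives + false_negatives)


# Hypothetical counts, NOT from the 2010 study.
known_morbid_genes = 1000                            # genes known to be morbid
true_positives = 650                                 # known morbid genes the model flagged
false_negatives = known_morbid_genes - true_positives  # known morbid genes it missed
false_positives = 335                                # genes flagged that are not known morbid

morbid_recall = recall(true_positives, false_negatives)
morbid_precision = precision(true_positives, false_positives)
print(f"recall = {morbid_recall:.0%}, precision = {morbid_precision:.0%}")
```

The two measures answer different questions: recall asks how many of the known targets were found, and precision asks how many of the flagged targets can be trusted.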

Drug discovery

The Generative Adversarial Network (GAN) is an example of a recent innovation in deep learning for drug discovery. Deep learning is a specialised area of ML that attempts to model abstraction from large-scale data using multi-layered deep neural networks (DNNs). Abstraction is a computer science term that refers to the process of filtering out irrelevant data in order to focus on the desired information.

As an unsupervised ML method, the GAN has been shown to address some of the challenges of supervised ML, primarily training on large data sets, which is often expensive and time-consuming. In a 2017 study, GAN-based frameworks were used with chemical and biological datasets to develop and identify novel compounds for anticancer therapy.

This study emphasised how the productivity of pharmaceutical research is limited by inefficient early lead discovery processes. It also highlighted how in silico-based approaches like deep learning models can generate reliable data at a reduced cost and time scale relative to current screening methods.

Computational pathology

In research, a pathologist interprets the presentation of tissue and cells on a glass slide. The spatial relationships between cells, their size and their general structure can indicate changes caused by drug interaction. Computational pathology is becoming an important part of drug development. It has been suggested that this method could allow pharmaceutical companies to discover novel biomarkers and generate them in a more precise, reproducible and high-throughput manner.

ML allows for high-throughput generation of features for thousands of cells, which is an impossible task for pathologists. Immuno-oncology is one therapeutic area that has benefitted from computational pathology. A 2017 study found that computational analysis of tumour-adjacent benign tissue in prostate cancer revealed information that is typically ignored by pathologists but has been associated with progression-free survival.

Ongoing challenges in adopting AI/ML

One of the main concerns with ML predictions is overfitting or underfitting. Overfitting is described as a model which consists of “lower quality information/technique but generates higher quality performance. In contrast, underfitting models fail to recognize the data sets’ underlying trend and generalize the new data inputted”. Put simply, an overfitted model fits its training data too closely and performs poorly on new data, whereas an underfitted model is too simple to capture the underlying trend. Both errors produce inaccurate results, which compromises the reliability of predicted drug targets. Increasing the sample size and cross-validation are often used to address these problems. Cross-validation is a technique that estimates the accuracy of an ML model by repeatedly training it on part of the data and testing it on the held-out remainder.
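The cross-validation procedure can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the data are synthetic, the model is a simple straight-line fit standing in for any ML model, and the function names are hypothetical. Each fold is held out once for testing while the model is trained on the rest.

```python
def k_fold_indices(n, k):
    """Yield (train, test) index lists so that each sample is tested exactly once."""
    folds = [list(range(i, n, k)) for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [j for fold in folds[:i] + folds[i + 1:] for j in fold]
        yield train, test


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def mean_squared_error(model, xs, ys):
    slope, intercept = model
    return sum((slope * x + intercept - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def cross_validate(xs, ys, k):
    """Average held-out error over k folds."""
    scores = []
    for train, test in k_fold_indices(len(xs), k):
        model = fit_line([xs[i] for i in train], [ys[i] for i in train])
        scores.append(mean_squared_error(model, [xs[i] for i in test], [ys[i] for i in test]))
    return sum(scores) / len(scores)


# Synthetic data: a linear trend plus a small alternating disturbance.
xs = list(range(10))
ys = [2 * x + 1 + (0.5 if x % 2 else -0.5) for x in xs]

train_error = mean_squared_error(fit_line(xs, ys), xs, ys)  # error on data the model saw
cv_error = cross_validate(xs, ys, k=5)                      # error on held-out data
print(train_error, cv_error)
```

Comparing the two numbers is the point of the exercise: a held-out error much larger than the training error suggests overfitting, while a high error on both suggests underfitting.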

Another challenge for the pharmaceutical industry is the lack of personnel qualified to operate AI/ML-based platforms. Furthermore, there is often scepticism about the quality of data generated by AI. Small organisations often have limited budgets and cannot afford to invest in AI/ML technology.

Despite the improvements needed to refine ML applications, the potential they bring to drug development is significant. In addition to reducing human error, automated ML software can analyse data from many sources more accurately and in less time. The advancement of AI and ML will continue to reduce the challenges faced by the pharmaceutical industry.

Charlotte Di Salvo, Lead Medical Writer
PharmaFeatures
