Computational modelling is becoming increasingly popular for data analysis in life sciences. Vast areas of therapeutic research are taking advantage of machine learning (ML) approaches for disease predictions and pathology. Cancer image analysis and diabetes case prediction are a few of the latest innovations.
ML is a branch of artificial intelligence with exciting applications across drug development. With each exposure to new data, an ML algorithm becomes better at recognising patterns. There are two main techniques used to apply ML: supervised and unsupervised learning. Supervised learning algorithms are trained on labelled data, which consists of a set of training examples. In other words, supervised learning relies on human intervention to label data so that the model can be trained to search for a specific component, as in cancer image analysis. Unsupervised learning, on the other hand, learns patterns from data without tags (annotations): it analyses vast amounts of unlabelled data in order to identify associations or trends.
Supervised learning methods have been used to predict future values of data categories or continuous variables. Unsupervised learning is primarily used for exploratory purposes in the development of models to enable data clustering in a format not specified by the user. This particular technique helps to identify hidden patterns within input data, whereas supervised learning methods predict future outputs based on a trained model of known input and output data.
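The difference between the two techniques can be illustrated with a minimal sketch. The measurements, labels and function names below are hypothetical and chosen purely for illustration; the supervised part is a simple nearest-centroid classifier and the unsupervised part is a two-cluster k-means, neither of which is taken from the studies discussed in this article.

```python
# Toy 1-D measurements and labels (hypothetical values, for illustration only).
values = [1.0, 1.2, 0.8, 5.0, 5.3, 4.9]
labels = ["healthy", "healthy", "healthy", "disease", "disease", "disease"]


def fit_centroids(xs, ys):
    """Supervised: use the labels to compute one mean (centroid) per class."""
    groups = {}
    for x, y in zip(xs, ys):
        groups.setdefault(y, []).append(x)
    return {label: sum(g) / len(g) for label, g in groups.items()}


def predict(centroids, x):
    """Assign a new value to the class whose centroid is nearest."""
    return min(centroids, key=lambda label: abs(centroids[label] - x))


def two_means(xs, iterations=10):
    """Unsupervised: split unlabelled values into two clusters (k-means, k=2)."""
    centres = [min(xs), max(xs)]
    assignment = [0] * len(xs)
    for _ in range(iterations):
        # Step 1: assign each value to its nearest cluster centre.
        assignment = [
            0 if abs(x - centres[0]) <= abs(x - centres[1]) else 1 for x in xs
        ]
        # Step 2: move each centre to the mean of its members.
        for k in (0, 1):
            members = [x for x, a in zip(xs, assignment) if a == k]
            if members:
                centres[k] = sum(members) / len(members)
    return assignment


centroids = fit_centroids(values, labels)
print(predict(centroids, 4.5))  # predicted label for a new, unseen value
print(two_means(values))        # cluster ids found without using any labels
```

The supervised model returns a named label because it was shown labels during training, whereas the unsupervised model can only return anonymous cluster ids that a researcher must then interpret.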
Within this study, an ML-based prediction model was used to identify signatures of diabetes mellitus (DM) prior to onset. Signatures for DM could be biomarkers, for example, or blood-based factors like serum proteins.
These signatures would be identified by analysing nationwide health records of patients from 2008 to 2018 with the ML prediction model. The model used a type of ML known as gradient-boosting decision trees (GBDT). A GBDT model builds an ensemble of decision trees in sequence, each new tree correcting the errors of those before it, and combines their outputs to calculate the likelihood of an outcome.
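The idea behind gradient boosting can be sketched in a few lines. This is a simplified illustration and not the study's model: it assumes a single hypothetical feature (invented glucose-like values), uses one-split trees ("stumps"), and fits them with squared-error loss, for which each new tree is simply trained on the residual errors of the current ensemble. A production classifier would normally use a library implementation with a log-loss objective.

```python
def fit_stump(xs, residuals):
    """Find the single threshold split that best fits the residuals."""
    best = None
    for threshold in sorted(set(xs))[:-1]:  # keeps both sides non-empty
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        error = sum((r - left_mean) ** 2 for r in left)
        error += sum((r - right_mean) ** 2 for r in right)
        if best is None or error < best[0]:
            best = (error, threshold, left_mean, right_mean)
    return best[1:]


def fit_gbdt(xs, ys, n_trees=50, learning_rate=0.3):
    """Build stumps in sequence, each one fitted to the current residuals."""
    base = sum(ys) / len(ys)  # start from the mean outcome
    predictions = [base] * len(xs)
    stumps = []
    for _ in range(n_trees):
        residuals = [y - p for y, p in zip(ys, predictions)]
        threshold, left_value, right_value = fit_stump(xs, residuals)
        stumps.append((threshold, left_value, right_value))
        predictions = [
            p + learning_rate * (left_value if x <= threshold else right_value)
            for x, p in zip(xs, predictions)
        ]
    return base, learning_rate, stumps


def predict_gbdt(model, x):
    """Sum the base value and the scaled contribution of every stump."""
    base, learning_rate, stumps = model
    return base + sum(
        learning_rate * (left if x <= threshold else right)
        for threshold, left, right in stumps
    )


# Hypothetical feature values and outcomes (1 = later developed the disease).
feature = [85, 90, 95, 100, 110, 120, 130, 140]
outcome = [0, 0, 0, 0, 1, 1, 1, 1]

model = fit_gbdt(feature, outcome)
print(round(predict_gbdt(model, 88), 3))   # score for a low feature value
print(round(predict_gbdt(model, 135), 3))  # score for a high feature value
```

The learning rate shrinks each tree's contribution, so the ensemble approaches the target gradually rather than letting a single tree dominate.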
The study identified a total of 4,696 new diabetes patients (7.2%) in the datasets. The ML model predicted the future incidence of diabetes with an overall accuracy of 94.9%.
Diabetes mellitus is a chronic disease that increases the risk of developing conditions such as cancer and atrial fibrillation, which can be fatal. Predicting diabetes in the population could therefore allow potential cases to be prevented through medication or diet control. In the long term, this would reduce the likelihood of these patients developing serious diseases as a result of diabetes, which would in theory reduce the pressure on healthcare systems around the globe.
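An accuracy figure is easier to interpret alongside the incidence. The short calculation below is a sketch that assumes the 7.2% incidence also applies to the data on which the 94.9% accuracy was measured, which the figures quoted here do not confirm. Under that assumption, it shows what a trivial model that always predicts "no diabetes" would score.

```python
incidence = 0.072          # share of new diabetes patients quoted above
reported_accuracy = 0.949  # overall accuracy quoted above

# A trivial model that labels everyone "no diabetes" is correct for every
# person who does not develop the disease, so its accuracy is 1 - incidence.
baseline_accuracy = 1 - incidence
improvement = reported_accuracy - baseline_accuracy

print(f"baseline accuracy: {baseline_accuracy:.1%}")
print(f"improvement over baseline: {improvement:.1%}")
```

For imbalanced outcomes like this one, measures such as sensitivity and specificity are usually read alongside accuracy.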
The study demonstrated how this ML approach could learn to predict the patterns of chromatin opening across 81 stem and differentiated cells across the immune system, solely from the DNA sequence of regulatory regions.
Chromatin is the material, composed of DNA and protein, that makes up a chromosome.
Open chromatin regions closely reflect gene expression in the corresponding cells, which is why these regions are a target for cell identification in the immune system.
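A model that learns from DNA sequence alone needs that sequence as numbers. A common way to do this is one-hot encoding, sketched below; the function name and the handling of unknown bases are assumptions made for illustration, not details taken from the study.

```python
BASES = "ACGT"


def one_hot_encode(sequence):
    """Return one row per position, with a 1 in the column of the observed base.

    Bases outside A, C, G and T (for example N) become an all-zero row.
    """
    rows = []
    for base in sequence.upper():
        rows.append([1 if base == candidate else 0 for candidate in BASES])
    return rows


encoded = one_hot_encode("ACGTN")
for row in encoded:
    print(row)
```

Each sequence becomes a grid of zeros and ones with four columns, which a neural network can scan in much the same way as it scans an image.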
This deep learning approach has proven to be an important tool for immunology researchers, revealing modalities and complex patterns of immune transcriptional regulators that arise directly from the DNA sequence. Immune transcriptional regulators play a critical role in the maintenance of the immune system. These factors primarily control gene expression in various immune cells and have therefore been implicated in autoimmune disorders, in which the immune system malfunctions.
In addition to immunology, deep learning approaches in oncology are becoming increasingly popular across basic and clinical cancer research.
Deep learning approaches have brought significant advancements to cancer image analysis. Early-stage cancer is often difficult to detect, particularly when detection relies on conventional technology and is subject to human error. ML approaches such as convolutional neural networks (CNNs) could potentially analyse images with greater speed and accuracy.
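The core operation of a convolutional neural network is sliding a small grid of weights (a kernel) across an image. The sketch below shows that single operation on a tiny hypothetical image with a hand-written kernel; in a trained network the kernel values are learned from data, and many such kernels are stacked in layers. As in most deep learning libraries, the kernel is applied without flipping.

```python
def convolve2d(image, kernel):
    """Slide the kernel over the image (no padding, stride 1) and sum products."""
    kernel_height, kernel_width = len(kernel), len(kernel[0])
    out_height = len(image) - kernel_height + 1
    out_width = len(image[0]) - kernel_width + 1
    output = []
    for i in range(out_height):
        row = []
        for j in range(out_width):
            total = 0
            for di in range(kernel_height):
                for dj in range(kernel_width):
                    total += image[i + di][j + dj] * kernel[di][dj]
            row.append(total)
        output.append(row)
    return output


# Hypothetical greyscale image: dark on the left, bright on the right.
image = [
    [0, 0, 9, 9],
    [0, 0, 9, 9],
    [0, 0, 9, 9],
]
# Hand-written kernel that responds where brightness rises from left to right.
kernel = [[-1, 1]]

print(convolve2d(image, kernel))
```

The output is non-zero only at the boundary between the dark and bright regions, which is how a kernel highlights one kind of visual feature.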
There are a number of challenges, however, with the deep learning approach to image analysis. One is that colour tone on pathology slides may differ between institutions because of the type of staining and the sample preparation protocols. For example, if one research lab stains its samples with colour X to highlight cancerous regions but the ML model has been trained on images of samples stained with colour Y, the model may struggle to detect cancer accurately because the staining differs. Therefore, it is necessary to “standardize color tones in digital slides for the development of accurate AI algorithms”.
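One simple way to bring slides from different labs closer together is to match the mean and spread of each colour channel to those of a reference slide. The sketch below does this directly on RGB values; it is a deliberately basic illustration with hypothetical pixel values and function names, not the standardisation method used in any study mentioned here.

```python
from statistics import mean, pstdev


def channel_stats(pixels):
    """Per-channel (mean, standard deviation) for a list of (R, G, B) pixels."""
    channels = list(zip(*pixels))
    return [(mean(channel), pstdev(channel)) for channel in channels]


def match_colour(pixels, reference_stats):
    """Rescale each channel so its mean and spread match the reference."""
    source_stats = channel_stats(pixels)
    normalised = []
    for pixel in pixels:
        new_pixel = []
        for value, (src_mean, src_std), (ref_mean, ref_std) in zip(
            pixel, source_stats, reference_stats
        ):
            scale = ref_std / src_std if src_std else 1.0
            shifted = (value - src_mean) * scale + ref_mean
            # Round and clip to the valid 0-255 range of an 8-bit image.
            new_pixel.append(min(255, max(0, round(shifted))))
        normalised.append(tuple(new_pixel))
    return normalised


# Hypothetical pixels from a reference slide and from another lab's slide.
reference = [(200, 100, 180), (220, 120, 200), (180, 80, 160)]
other_lab = [(150, 60, 140), (170, 90, 170), (130, 30, 110)]

print(match_colour(other_lab, channel_stats(reference)))
```

After matching, the second lab's pixels share the colour statistics of the reference slide, so a model trained on reference-style images sees more familiar input.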
A 2017 study successfully trained a CNN to classify skin cancer with a level of competence comparable to dermatologists. Using only pixels and disease labels as inputs, the researchers classified skin lesions via a single CNN. The 1.4 million pre-training and training images in this study overcame photographic variability like zoom and lighting. This is a huge step forward for cancer imaging. The development of an accurate ML model for image analysis could help medical practitioners and patients to “proactively track skin lesions and detect cancer earlier”. Early detection significantly impacts cancer prognosis for many patients, and ML approaches like this could save many lives.
Charlotte Di Salvo, Former Editor & Chief Medical Writer PharmaFEATURES