Recently published in BMC Bioinformatics, the research by Ranjan Kumar Barman, Anirban Mukhopadhyay, Ujjwal Maulik, and Santasabuj Das addresses the persistent challenge of infectious diseases. Despite significant progress in healthcare, these ailments, caused by a variety of pathogens, continue to burden public health and economies, particularly in low- and middle-income countries. Understanding the interaction between pathogens and host cells is pivotal to unraveling the mechanisms underlying infectious diseases.

Current genetic studies have predominantly revolved around single nucleotide polymorphisms (SNPs) and the identification of disease-associated genes available in public databases. However, the bulk of these efforts has concentrated on Mendelian diseases or complex conditions like asthma, diabetes, and cancer, leaving a notable gap in predicting host genes specifically associated with infectious diseases.

Scatter plot of significantly enriched gene ontology (GO) biological process terms, visualized with REVIGO, a tool that summarizes and visualizes long lists of gene ontology terms. Retrieved from Barman, R., Mukhopadhyay, A., Maulik, U. & Das, S. (2019). Identification of infectious disease-associated host genes using machine learning techniques. BMC Bioinformatics 20, 736. doi: 10.1186/s12859-019-3317-0.

Addressing this gap, the study applies machine learning techniques, notably Deep Neural Networks (DNNs), to predict infectious disease-associated host genes. The DNN models were compared against traditional classifiers such as Support Vector Machine, Naïve Bayes, and Random Forest. The researchers validated the model's performance on independent datasets and then applied it to proteins that had not been used to build the model.
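Comparing classifiers on an independent dataset comes down to scoring each model's predictions against the known labels. The following is a minimal sketch of such a comparison, not the authors' evaluation code: the labels, the predictions, and the choice of metrics (accuracy, sensitivity, specificity, and Matthews correlation coefficient) are illustrative assumptions.

```python
from math import sqrt


def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for binary labels (1 = disease-associated)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn


def evaluate(y_true, y_pred):
    """Return common binary classification metrics as a dictionary."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    total = tp + tn + fp + fn
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn) if (tp + fn) else 0.0,
        "specificity": tn / (tn + fp) if (tn + fp) else 0.0,
        # Matthews correlation coefficient; defined as 0 when the denominator is 0.
        "mcc": (tp * tn - fp * fn) / denom if denom else 0.0,
    }


# Hypothetical held-out labels and predictions from two hypothetical models.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
predictions = {
    "dnn": [1, 1, 1, 0, 0, 0, 0, 1],
    "svm": [1, 1, 0, 0, 0, 0, 1, 1],
}

if __name__ == "__main__":
    for name, y_pred in predictions.items():
        print(name, evaluate(y_true, y_pred))
```

Reporting several metrics side by side helps when the positive and negative classes are unbalanced, since accuracy alone can hide poor sensitivity.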

The architecture of a simple deep neural network. Retrieved from Barman, R., Mukhopadhyay, A., Maulik, U. & Das, S. (2019). Identification of infectious disease-associated host genes using machine learning techniques. BMC Bioinformatics 20, 736. doi: 10.1186/s12859-019-3317-0.

The complexity of pathogen-host interactions makes the mechanisms behind infectious diseases difficult to study. Computational approaches, however, offer an efficient and cost-effective way to identify disease-associated genes by drawing on the wealth of information available in public repositories.

Despite their diverse clinical features, infectious diseases share commonalities such as acute onset, transmissibility, distinctive immune responses, and varied reactions to antimicrobial agents. Host responses to infections differ significantly from responses to non-infectious diseases, involving specific molecular patterns that trigger innate immune receptors. The study integrated sequence features and protein-protein interaction (PPI) network properties, and found that the network properties contributed more to predicting infectious disease-related host genes.
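To make the idea of combining sequence and PPI network properties concrete, the sketch below builds one feature vector per protein from a toy interaction network. It is only an illustration under stated assumptions: the edge list, the protein names, the sequence, and the specific features chosen (degree, local clustering coefficient, sequence length, hydrophobic fraction) are hypothetical and are not taken from the study.

```python
from collections import defaultdict

# Residues treated as hydrophobic for this illustration (an assumption).
HYDROPHOBIC = set("AVILMFWY")


def build_graph(edges):
    """Build an undirected adjacency map from (protein_a, protein_b) pairs."""
    graph = defaultdict(set)
    for a, b in edges:
        if a != b:  # ignore self-interactions
            graph[a].add(b)
            graph[b].add(a)
    return dict(graph)


def degree(graph, node):
    """Number of direct interaction partners of a protein."""
    return len(graph.get(node, ()))


def clustering_coefficient(graph, node):
    """Fraction of a protein's partner pairs that also interact with each other."""
    neighbors = sorted(graph.get(node, ()))
    k = len(neighbors)
    if k < 2:
        return 0.0
    links = sum(
        1
        for i in range(k)
        for j in range(i + 1, k)
        if neighbors[j] in graph[neighbors[i]]
    )
    return 2 * links / (k * (k - 1))


def feature_vector(graph, node, sequence):
    """Concatenate network features with simple sequence features."""
    hydrophobic_fraction = (
        sum(1 for residue in sequence if residue in HYDROPHOBIC) / len(sequence)
        if sequence
        else 0.0
    )
    return [
        degree(graph, node),
        clustering_coefficient(graph, node),
        len(sequence),
        hydrophobic_fraction,
    ]


# Hypothetical toy network: proteins A, B, C form a triangle; D hangs off C.
edges = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D")]
graph = build_graph(edges)

if __name__ == "__main__":
    print(feature_vector(graph, "C", "MKVLAA"))
```

Vectors like this one, computed for every protein, would then be passed to whichever classifier is being trained.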

Host-Pathogen Interaction in Sepsis. Goh, C. & Knight, J. (2017). Enhanced understanding of the host–pathogen interaction in sepsis: new opportunities for omic approaches. The Lancet: Respiratory Medicine, 5(3):212-223. doi: 10.1016/S2213-2600(17)30045-0.

Deep Neural Networks, implemented with TensorFlow, showed promising results and achieved high accuracy during training. Their performance decreased slightly during testing, which suggests that DNNs are better suited to larger datasets and feature sets. Models built with ensemble feature selection performed on par with models that used all features.
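Ensemble feature selection generally means combining the output of several feature selectors rather than trusting any single one. A minimal sketch of one common variant, aggregation by mean rank, is shown below. The article does not say which selectors or aggregation rule the study used, so the selector scores, feature names, and the mean-rank rule here are all assumptions made for illustration.

```python
def rank_features(scores):
    """Convert importance scores to ranks (1 = most important); ties broken by name."""
    ordered = sorted(scores, key=lambda name: (-scores[name], name))
    return {name: position for position, name in enumerate(ordered, start=1)}


def ensemble_select(score_tables, top_k):
    """Keep the top_k features with the best mean rank across all selectors."""
    ranks = [rank_features(scores) for scores in score_tables]
    features = sorted(ranks[0])
    mean_rank = {
        feature: sum(rank[feature] for rank in ranks) / len(ranks)
        for feature in features
    }
    return sorted(features, key=lambda feature: (mean_rank[feature], feature))[:top_k]


# Hypothetical importance scores from three hypothetical selectors.
selector_a = {"degree": 0.9, "betweenness": 0.7, "seq_length": 0.2, "hydrophobic_fraction": 0.4}
selector_b = {"degree": 0.8, "betweenness": 0.9, "seq_length": 0.1, "hydrophobic_fraction": 0.3}
selector_c = {"degree": 0.6, "betweenness": 0.5, "seq_length": 0.4, "hydrophobic_fraction": 0.2}

if __name__ == "__main__":
    print(ensemble_select([selector_a, selector_b, selector_c], top_k=2))
```

Averaging ranks instead of raw scores keeps selectors with different score scales from dominating the result.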

Introducing CNNs with Google’s TensorFlow. Marchildon, R. (2023). Building Neural Networks in TensorFlow. Retrieved from rpmarchildon.com/ai-cnn-digits/.

The model’s validity was established by comparing it with existing methods based on machine learning techniques (MLTs) that were designed for other diseases, such as cancer and Alzheimer’s. The infectious disease-associated gene prediction model outperformed these disease-focused models, supporting its reliability in identifying disease-associated host genes.

Fitting of different distributions for different infectious diseases. Yadav, S. & Akhter, Y. (2021). Statistical Modeling for the Prediction of Infectious Disease Dissemination With Special Reference to COVID-19 Spread. Frontiers in Public Health, 9 (2021). doi: 10.3389/fpubh.2021.645405.

This study identified numerous infectious disease-associated host genes, laying the groundwork to expand our comprehension of disease pathogenesis. The discovered genes correlated with critical biological processes and disease ontology, suggesting potential common pathways among various diseases. This opens avenues for repurposing existing treatments to develop new host-targeted therapies for infectious diseases.

The proposed computational model, which integrates sequence and PPI network properties, is effective at predicting infectious disease-associated host genes. Its high accuracy points to potential uses in identifying disease risks and therapeutic targets, and in guiding new therapeutic strategies.

Study DOI: 10.1186/s12859-019-3317-0

Engr. Dex Marco Tiu Guibelondo, B.Sc. Pharm, R.Ph., B.Sc. CpE

Editor-in-Chief, PharmaFEATURES
