Leveraging Machine Learning for the Analysis and Prediction of Influenza A Virus

Influenza A virus (IAV) remains a persistent global health challenge because of its high mutation rate, zoonotic potential, and antigenic variation, which makes early prediction and classification crucial for disease surveillance and vaccine development. Traditional methods, such as hemagglutination inhibition assays and phylogenetic analyses, are often time-consuming and resource-intensive, which limits their real-time applicability. Machine learning has emerged as a powerful tool for analyzing viral sequences and improving predictive accuracy.
This research introduces novel AI-driven approaches to influenza prediction by developing multi-channel neural networks (MC-NNs) and exploring both alignment-based (position-specific scoring matrix, PSSM) and alignment-free (word embedding) sequence representations. While traditional bioinformatics relies heavily on sequence alignment, this work demonstrates that alignment-free approaches can achieve comparable performance, offering a faster and more scalable alternative.
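To make the alignment-free idea concrete, the following is a minimal sketch of the preprocessing step that word-embedding approaches typically start from: splitting a sequence into overlapping k-mers ("words") and mapping them to integer indices that an embedding layer could consume. The function names, the choice of k = 3, and the toy sequences are assumptions for illustration only; the tokenization and embedding settings actually used in the thesis are not stated here.

```python
from collections import Counter


def kmer_tokens(sequence, k=3):
    """Split a sequence into overlapping k-mers ("words"). No alignment is needed."""
    seq = sequence.upper()
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def build_vocab(sequences, k=3):
    """Map each k-mer seen in the corpus to an integer index.

    Index 0 is reserved for k-mers not seen when the vocabulary was built.
    More frequent k-mers receive smaller indices.
    """
    counts = Counter(tok for s in sequences for tok in kmer_tokens(s, k))
    return {tok: idx for idx, (tok, _) in enumerate(counts.most_common(), start=1)}


def encode(sequence, vocab, k=3):
    """Turn a sequence into a list of vocabulary indices (0 = unknown k-mer)."""
    return [vocab.get(tok, 0) for tok in kmer_tokens(sequence, k)]


# Toy strings, not real influenza sequences.
corpus = ["MKTII", "MKTLL"]
vocab = build_vocab(corpus, k=3)
encoded = encode("MKTWW", vocab, k=3)
```

Because each sequence is processed independently, this step scales linearly with sequence length and needs no multiple sequence alignment, which is the practical advantage the alignment-free route relies on. A PSSM, by contrast, is computed from an alignment against a reference set.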
To further enhance antigenic prediction, this research investigates semi-supervised learning techniques, addressing the challenge of limited labeled data in biomedical research. Additionally, the project examines the impact of reference database selection on model generalization, ensuring robustness across different influenza subtypes and host species.
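As an illustration of how unlabeled data can be used, the sketch below shows self-training (pseudo-labeling), one common semi-supervised technique. It is not necessarily the method used in this research, which is not specified here. A simple nearest-centroid classifier stands in for the neural network, and the 2-D feature vectors, the labels, and the `min_margin` threshold are all hypothetical.

```python
import math


def centroid(points):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(points)
    return [sum(p[d] for p in points) / n for d in range(len(points[0]))]


def fit_centroids(X, y):
    """Fit a nearest-centroid classifier: one mean vector per label."""
    by_label = {}
    for x, label in zip(X, y):
        by_label.setdefault(label, []).append(x)
    return {label: centroid(pts) for label, pts in by_label.items()}


def predict_with_margin(centroids, x):
    """Return (label, margin), where margin is the distance gap between
    the nearest and second-nearest centroid (a crude confidence score)."""
    dists = sorted((math.dist(x, c), label) for label, c in centroids.items())
    margin = dists[1][0] - dists[0][0] if len(dists) > 1 else float("inf")
    return dists[0][1], margin


def self_train(X_lab, y_lab, X_unlab, min_margin=1.0, max_rounds=5):
    """Repeatedly add confidently predicted unlabeled points to the training set."""
    X, y = list(X_lab), list(y_lab)
    pool = list(X_unlab)
    for _ in range(max_rounds):
        centroids = fit_centroids(X, y)
        confident, rest = [], []
        for x in pool:
            label, margin = predict_with_margin(centroids, x)
            (confident if margin >= min_margin else rest).append((x, label))
        if not confident:
            break  # nothing left that the model is sure about
        for x, label in confident:
            X.append(x)
            y.append(label)
        pool = [x for x, _ in rest]
    return fit_centroids(X, y), pool


# Hypothetical feature vectors; "A" and "B" are placeholder class labels.
X_lab = [[0.0, 0.0], [10.0, 10.0]]
y_lab = ["A", "B"]
X_unlab = [[1.0, 1.0], [9.0, 9.0], [5.0, 5.0]]
centroids, still_unlabeled = self_train(X_lab, y_lab, X_unlab)
```

In this toy run, the two unlabeled points near a labeled example are absorbed into the training set, while the ambiguous midpoint stays unlabeled. The confidence threshold matters: if it is too loose, wrong pseudo-labels are added and reinforce themselves in later rounds.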
By integrating these advances, this research aims to improve viral surveillance, early warning systems, and vaccine development strategies, demonstrating the potential of AI to transform computational virology.
For more details, please refer to my PhD thesis.