AI/ML Approach for Predictive Modeling of RNA Structures and Chemical Reactivity Profiles

Jatin Sarabu, IntelliScience Training Institute
Sohail Zaidi, San Jose State University

Abstract

RNA, predominantly a single-stranded molecule, is integral to gene expression, and its folding into secondary structures plays a vital role in processes such as transcription, translation, and splicing. These secondary structures are essential for RNA's biological functions and are closely linked to various genetic diseases and malignancies. Understanding them is critical for identifying therapeutic targets and advancing disease treatment, with potentially significant societal impact.

Traditional bioinformatics methods for predicting RNA secondary structures often face limitations in accuracy and scalability. In contrast, machine learning, particularly neural network models, can identify complex patterns in large datasets with improved precision.

This study uses a convolutional neural network (CNN) to predict RNA secondary structures and chemical reactivity profiles. Using a dataset of 1,118,5113 RNA sequences obtained from Kaggle, the CNN model was trained to detect key structural features. The model achieved a mean absolute error of 0.28, demonstrating superior predictive performance compared to traditional methods.

The results underscore the potential of machine learning in RNA research. By improving our ability to predict RNA secondary structures, this work provides insights into RNA functionality and its implications for disease treatment. The study highlights how machine learning can accelerate biomedical discovery and support new therapeutic strategies targeting RNA-related diseases.
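To make the quantities named in the abstract concrete, the following is a minimal sketch of how an RNA sequence might be encoded for a CNN, how a single convolutional layer slides over it, and how a mean absolute error is computed against per-nucleotide reactivity values. It is an illustration only, not the model used in this study: the one-hot encoding, the function names, the kernel width, the random weights, and the treatment of NaN as a missing measurement are all assumptions introduced here.

```python
import numpy as np

NUCLEOTIDES = "ACGU"  # assumed alphabet; any other symbol is encoded as all zeros


def one_hot_encode(sequence: str) -> np.ndarray:
    """Return a (length, 4) one-hot matrix for an RNA sequence."""
    encoded = np.zeros((len(sequence), len(NUCLEOTIDES)), dtype=np.float32)
    for position, base in enumerate(sequence.upper()):
        index = NUCLEOTIDES.find(base)
        if index >= 0:
            encoded[position, index] = 1.0
    return encoded


def conv1d_same(features: np.ndarray, kernel: np.ndarray, bias: np.ndarray) -> np.ndarray:
    """Slide a kernel along the sequence axis, zero-padded to keep the length.

    features: (length, in_channels)
    kernel:   (width, in_channels, out_channels)
    bias:     (out_channels,)
    As in most deep learning libraries, this is a cross-correlation.
    """
    width = kernel.shape[0]
    pad_left = (width - 1) // 2
    pad_right = width - 1 - pad_left
    padded = np.pad(features, ((pad_left, pad_right), (0, 0)))
    output = np.empty((features.shape[0], kernel.shape[2]), dtype=np.float32)
    for position in range(features.shape[0]):
        window = padded[position:position + width]  # (width, in_channels)
        # Contract over the window and channel axes, leaving out_channels.
        output[position] = np.tensordot(window, kernel, axes=([0, 1], [0, 1])) + bias
    return output


def mean_absolute_error(predicted: np.ndarray, target: np.ndarray) -> float:
    """Average |predicted - target| over positions with a measured target.

    Treating NaN targets as missing is an assumption made for this sketch.
    """
    mask = ~np.isnan(target)
    return float(np.mean(np.abs(predicted[mask] - target[mask])))


# Hypothetical usage with untrained random weights and made-up targets.
rng = np.random.default_rng(0)
features = one_hot_encode("GGAUCCAUGC")
kernel = rng.normal(scale=0.1, size=(5, 4, 1)).astype(np.float32)
bias = np.zeros(1, dtype=np.float32)
predicted_reactivity = conv1d_same(features, kernel, bias)[:, 0]
placeholder_targets = np.full(features.shape[0], 0.5)
placeholder_targets[3] = np.nan  # one position without a measurement
print(mean_absolute_error(predicted_reactivity, placeholder_targets))
```

Under this metric, a mean absolute error of 0.28 would mean that predicted reactivity values differ from the measured values by 0.28 on average, in whatever units the reactivity profile uses.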