Abstract: Accurately identifying protein-RNA binding residues is crucial for deciphering molecular recognition mechanisms and advancing drug design. Protein language models (PLMs) have shown promise for residue-level feature extraction, but existing methods often overlook the complementary benefits of integrating multiple feature modalities, leaving room for improved predictive performance. In this study, we present MFEPre, a multi-feature fusion framework that combines sequence-based PLM embeddings, graph-based structural representations, and conventional handcrafted features to improve the prediction of protein-RNA binding residues. Specifically, MFEPre uses ProtBert embeddings to capture evolutionary and contextual sequence patterns, employs Graph Attention Networks (GATs) to model residue-level topological interactions in protein structures, and integrates handcrafted features. The three feature sets are processed by a three-channel convolutional neural network and fused in a fully connected layer to predict binding sites. On the test datasets, MFEPre reached an area under the ROC curve (AUC) of 0.827, higher than that of other existing models. Ablation studies confirm that the three categories of features are complementary, highlighting the importance of multi-feature fusion. By unifying sequence, structure, and biochemical information, our work offers a new perspective on protein-RNA binding site prediction and a robust tool for biological research and drug design.
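The abstract reports performance as an area under the ROC curve. For readers unfamiliar with the metric, the following is a minimal sketch of how it can be computed for per-residue predictions, using the standard rank-sum (Mann-Whitney U) identity. The function name `roc_auc` and the toy labels and scores are illustrative assumptions, not code or data from the study, and the authors' own evaluation pipeline may differ.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via the rank-sum (Mann-Whitney U) identity.

    labels: 1 for binding residues, 0 for non-binding residues.
    scores: predicted binding probabilities, one per residue.
    """
    if len(labels) != len(scores):
        raise ValueError("labels and scores must have the same length")
    n_pos = sum(1 for y in labels if y == 1)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUC needs at least one positive and one negative")

    # Sort residue indices by score, then assign 1-based ranks,
    # giving tied scores the average of the ranks they span.
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1

    # U counts the (positive, negative) pairs ranked correctly,
    # with ties counted as half a pair.
    pos_rank_sum = sum(r for r, y in zip(ranks, labels) if y == 1)
    u = pos_rank_sum - n_pos * (n_pos + 1) / 2
    return u / (n_pos * n_neg)


# Toy example (hypothetical values, not data from the study).
labels = [0, 0, 1, 0, 1, 1, 0, 1]
scores = [0.10, 0.35, 0.40, 0.20, 0.80, 0.30, 0.50, 0.90]
auc = roc_auc(labels, scores)
```

Read this way, an AUC of 0.827 means that a randomly chosen binding residue receives a higher score than a randomly chosen non-binding residue about 83% of the time. Because the metric depends only on how residues are ranked, it is not tied to any particular decision threshold.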