Turbocharging protein binding site prediction with geometric attention, inter-resolution transfer learning, and homology-based augmentation

  • Additional information
    • Publication details:
      Springer Science and Business Media LLC, 2024.
    • Subject:
      2024
    • Abstract:
      Background: Locating small-molecule binding sites in target proteins, at either pocket or residue resolution, is critical in many drug-discovery scenarios. Because such binding sites are not always easy to find with conventional methods, several deep learning methods that predict binding sites from protein structures have been developed in recent years. Existing deep-learning-based methods have several limitations, including (1) the inefficiency of CNN-only architectures, (2) loss of information due to excessive post-processing, and (3) under-utilization of available data sources.
      Methods: We present a new model architecture and training method that address these problems. First, by layering geometric self-attention units on top of residue-level 3D CNN outputs, our model overcomes the problems of CNN-only architectures. Second, by configuring the fundamental units of computation as residues and pockets instead of voxels, our method reduces the information loss from post-processing. Lastly, by employing inter-resolution transfer learning and homology-based augmentation, our method makes substantially fuller use of available data sources.
      Results: The proposed method significantly outperformed all state-of-the-art baselines at both resolutions, pocket and residue. An ablation study demonstrated that the proposed architecture, transfer learning, and homology-based augmentation are all indispensable for achieving optimal performance. A case study on human serum albumin further showed that our model identifies the protein's multiple binding sites better than existing methods.
      Conclusions: We believe that our contribution to the literature is twofold. First, we introduce a novel computational method for binding site prediction with practical applications, substantiated by its strong performance across diverse benchmarks and case studies. Second, the innovative aspects of our method (specifically, the design of the model architecture, inter-resolution transfer learning, and homology-based augmentation) could serve as useful components for future work.
    • ISSN:
      1471-2105
    • Identifier:
      10.1186/s12859-024-05923-2
    • Rights:
      CC BY
      URL: http://creativecommons.org/licenses/by/4.0/
      Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
    • Identifier:
      edsair.doi.dedup.....56e0f5fa0c0566d1f728a81ed0f13d42