GASP: A Pan-Specific Predictor of Family 1 Glycosyltransferase Acceptor Specificity Enabled by a Pipeline for Substrate Feature Generation and Large-Scale Experimental Screening

Authors: Harding-Larsen, D; Madsen, CD; Teze, D; Kittila, T; Langhorn, MR; Gharabli, H; Hobusch, M; Otalvaro, FM; Kirtel, O; Bidart, GN; Mazurenko, S; Travnik, E; Welner, DH
Record type:
article in journal/newspaper
Language:
English

Additional information
- Publication details:
  AMER CHEMICAL SOC
- Subject:
  2024
- Collection:
  The University of Melbourne: Digital Repository
- Abstract:
  Glycosylation represents a major chemical challenge; while it is one of the most common reactions in Nature, conventional chemistry struggles with stereochemistry, regioselectivity, and solubility issues. In contrast, family 1 glycosyltransferase (GT1) enzymes can glycosylate virtually any given nucleophilic group with perfect control over stereochemistry and regioselectivity. However, the appropriate catalyst for a given reaction needs to be identified among the tens of thousands of available sequences. Here, we present the glycosyltransferase acceptor specificity predictor (GASP) model, a data-driven approach to the identification of reactive GT1:acceptor pairs. We trained a random forest-based acceptor predictor on literature data and validated it on independent in-house generated data on 1001 GT1:acceptor pairs, obtaining an AUROC of 0.79 and a balanced accuracy of 72%. The performance was stable even in the case of completely new GT1s and acceptors not present in the training data set, highlighting the pan-specificity of GASP. Moreover, the model is capable of parsing all known GT1 sequences, as well as all chemicals, the latter through a pipeline for the generation of 153 chemical features for a given molecule taking the CID or SMILES as input (freely available at https://github.com/degnbol/GASP). To investigate the power of GASP, the model prediction probability scores were compared to GT1 substrate conversion yields from a newly published data set, with the top 50% of GASP predictions corresponding to reactions with >50% synthetic yields. The model was also tested in two comparative case studies: glycosylation of the antihelminth drug niclosamide and the plant defensive compound DIBOA. In the first study, the model achieved an 83% hit rate, outperforming a hit rate of 53% from a random selection assay. In the second case study, the hit rate of GASP was 50%, and while being lower than the hit rate of 83% using expert-selected enzymes, it provides a reasonable performance for the cases when an expert ...
- ISSN:
  2470-1343
- Relation:
  http://hdl.handle.net/11343/352821
- Electronic access:
  http://hdl.handle.net/11343/352821
- Rights:
  https://creativecommons.org/licenses/by-nc-nd/4.0 ; CC BY-NC-ND
- Accession number:
  edsbas.8CF790BB
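
The abstract reports three kinds of figures: AUROC, balanced accuracy, and hit rate. The sketch below shows how such metrics are commonly computed from binary labels and prediction scores. It is not taken from the GASP repository; the function names, the 0.5 decision threshold, and the toy data are all assumptions made for illustration only.

```python
# Illustrative metric definitions; the data below are hypothetical, not GASP results.

def auroc(y_true, scores):
    """Probability that a random active pair scores above a random inactive pair."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    # Count pairwise wins; ties count as half a win.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def balanced_accuracy(y_true, scores, threshold=0.5):
    """Mean of sensitivity and specificity at an assumed score threshold."""
    y_pred = [1 if s >= threshold else 0 for s in scores]
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    n_pos = sum(y_true)
    n_neg = len(y_true) - n_pos
    return 0.5 * (tp / n_pos + tn / n_neg)

def hit_rate(y_true, scores, k):
    """Fraction of truly active pairs among the k highest-scoring predictions."""
    ranked = sorted(zip(scores, y_true), key=lambda pair: pair[0], reverse=True)
    return sum(label for _, label in ranked[:k]) / k

# Hypothetical enzyme:acceptor pairs (1 = reactive, 0 = not reactive).
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2, 0.6, 0.1]

print(auroc(y_true, scores))
print(balanced_accuracy(y_true, scores))
print(hit_rate(y_true, scores, k=3))
```

Balanced accuracy is the natural companion to AUROC when reactive and non-reactive pairs are unevenly represented, since it weights both classes equally regardless of their counts.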
