Probabilistic Models and Natural Language Processing in Health

  • Additional Information
    • Contributors:
      Artés Rodríguez, Antonio; Martínez Olmos, Pablo; UC3M. Departamento de Teoría de la Señal y Comunicaciones
    • Publication Year:
      2022
    • Collection:
      Universidad Carlos III de Madrid: e-Archivo
    • Abstract:
      The treatment of mental disorders still involves a wide variety of unsolved challenges, such as misdiagnosis and delayed diagnosis. In this doctoral thesis we study and develop different models that can serve as potential tools for the clinician's work. Among our proposals, we outline two main lines of research: Natural Language Processing and probabilistic methods. In Chapter 2, we open the thesis with a regularization mechanism for language models that is especially effective in Transformer-based architectures, which we call NoRBERT, from Noisy Regularized Bidirectional Encoder Representations from Transformers [9], [15]. A review of the literature shows that regularization in NLP is an underexplored area, largely limited to general mechanisms such as dropout [57] or early stopping [58]. In this landscape, we propose a novel approach that combines any language model with Variational Auto-Encoders [23]. VAEs are deep generative models that construct a regular latent space from which the input samples can be reconstructed through encoder and decoder networks. Our VAE is based on a mixture-of-Gaussians prior (GMVAE), which gives the model the ability to capture multimodal information. By combining Transformers and GMVAEs, we build an architecture capable of imputing missing words in text corpora across a diverse topic space, as well as improving the BLEU score on the reconstruction of the dataset. Both results depend on the depth of the regularized layer in the Transformer encoder. In essence, the regularization consists of the GMVAE reconstructing the Transformer embeddings at some point in the architecture, adding structured noise that helps the model generalize better (a minimal sketch of this mechanism is given after this record). We show improvements in BERT [15], RoBERTa [16] and XLM-R [17] models, verified on different datasets, and we also provide explicit examples of sentences reconstructed by Top NoRBERT. In addition, we validate our model's abilities in data augmentation, improving classification accuracy and F1 ...
    • Relation:
      https://doi.org/10.48550/arXiv.2006.02734; https://doi.org/10.2196/17116; https://doi.org/10.1016/j.jad.2021.02.059; https://doi.org/10.1192/bjo.2021.43; https://doi.org/10.48550/arXiv.2108.10764; http://hdl.handle.net/10016/36480
    • Electronic Access:
      http://hdl.handle.net/10016/36480
    • Rights:
      Atribución-NoComercial-SinDerivadas 3.0 España ; http://creativecommons.org/licenses/by-nc-nd/3.0/es/ ; open access
    • Accession Number:
      edsbas.F40CEC9C
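
The abstract describes the core NoRBERT mechanism: a VAE that reconstructs the hidden states of one Transformer layer, so the reconstruction (with its structured noise) can act as a regularizer. The following is a minimal illustrative sketch of that idea, not the thesis implementation: it assumes bert-base-uncased, an arbitrary choice of intermediate layer, assumed hyperparameters, and a simplified single-Gaussian prior in place of the thesis's mixture-of-Gaussians (GMVAE) prior.

```python
# Sketch of VAE-based regularization of Transformer embeddings (NoRBERT-style).
# Assumptions: bert-base-uncased, layer 6, latent size 64, KL weight 0.1.
import torch
import torch.nn as nn
from transformers import BertModel, BertTokenizerFast

class EmbeddingVAE(nn.Module):
    """VAE over per-token hidden states (dim 768 for bert-base)."""
    def __init__(self, dim=768, latent=64):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(dim, 256), nn.ReLU())
        self.mu = nn.Linear(256, latent)
        self.logvar = nn.Linear(256, latent)
        self.dec = nn.Sequential(nn.Linear(latent, 256), nn.ReLU(),
                                 nn.Linear(256, dim))

    def forward(self, h):
        e = self.enc(h)
        mu, logvar = self.mu(e), self.logvar(e)
        # Reparameterization trick: sample z, keeping gradients through mu/logvar.
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)
        recon = self.dec(z)
        # KL divergence to a standard Gaussian prior; the thesis instead uses a
        # mixture-of-Gaussians prior (GMVAE) to capture multimodal structure.
        kl = -0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp())
        return recon, kl

tok = BertTokenizerFast.from_pretrained("bert-base-uncased")
bert = BertModel.from_pretrained("bert-base-uncased")
vae = EmbeddingVAE()

batch = tok(["the patient reported [MASK] symptoms"], return_tensors="pt")
out = bert(**batch, output_hidden_states=True)
h = out.hidden_states[6]                 # intermediate layer (choice assumed)
recon, kl = vae(h)
recon_loss = nn.functional.mse_loss(recon, h)
loss = recon_loss + 0.1 * kl             # KL weight is an assumed hyperparameter
print(loss.item())
```

At inference time, `recon` would replace `h` before the remaining encoder layers, injecting structured noise into the representation; the depth of the regularized layer is the key design choice the abstract highlights.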