
Length Independent PAC-Bayes Bounds for Simple RNNs

  • Additional Information
    • Contributors:
      Apprentissage automatique avec intégration des connaissances en ingénierie de surface : théorie et algorithmes (MALICE); Laboratoire Hubert Curien (LHC); Institut d'Optique Graduate School (IOGS)-Université Jean Monnet - Saint-Étienne (UJM)-Centre National de la Recherche Scientifique (CNRS)-Institut d'Optique Graduate School (IOGS)-Université Jean Monnet - Saint-Étienne (UJM)-Centre National de la Recherche Scientifique (CNRS)-Inria Lyon; Institut National de Recherche en Informatique et en Automatique (Inria)-Institut National de Recherche en Informatique et en Automatique (Inria); McGill University = Université McGill Montréal, Canada; Institut québécois d’intelligence artificielle (Mila); Département d'Informatique et de Recherche Opérationnelle Montreal (DIRO); Université de Montréal (UdeM); ANR-20-CE23-0020,TAUDoS,Théorie et algorithmes pour la compréhension du deep sur des données séquentielles(2020)
    • Publication Information:
      HAL CCSD
    • Publication Year:
      2024
    • Collection:
      Institut d'Optique Graduate School, ParisTech: HAL
    • Subject:
      Valence, Spain
    • Abstract:
      While the practical interest of Recurrent Neural Networks (RNNs) is well attested, much remains to be done to develop a thorough theoretical understanding of their abilities, particularly regarding their learning capacities. A powerful framework for tackling this question is PAC-Bayes theory, which allows one to derive bounds providing guarantees on the expected performance of learning models on unseen data. In this paper, we provide an extensive study of the conditions leading to PAC-Bayes bounds for non-linear RNNs that are independent of the length of the data. The derivation of our results relies on a perturbation analysis of the weights of the network. We prove bounds that hold for β-saturated and DS β-saturated Simple Recurrent Networks (SRNs), classes of RNNs we introduce to formalize saturation regimes of RNNs. The first regime corresponds to the case where the values of the hidden state of the SRN are always close to the boundaries of the activation functions. The second, closely related to practical observations, only requires that saturation occurs at least once in each component of the hidden state within a sliding window of a given size (see the illustrative sketches after this record).
    • Relation:
      hal-04488664; https://hal.science/hal-04488664; https://hal.science/hal-04488664/document; https://hal.science/hal-04488664/file/PAC_Bayes_bound_for_SRNs__Camera_ready_.pdf
    • Electronic Access:
      https://hal.science/hal-04488664
      https://hal.science/hal-04488664/document
      https://hal.science/hal-04488664/file/PAC_Bayes_bound_for_SRNs__Camera_ready_.pdf
    • Rights:
      info:eu-repo/semantics/OpenAccess
    • Accession Number:
      edsbas.E1470E5D
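
For orientation only, the abstract refers to PAC-Bayes bounds in general; the classical McAllester-style bound below is the standard starting point of that framework, not the length-independent bound derived in the paper itself.

```latex
% Classical PAC-Bayes bound (McAllester form), shown for orientation only;
% this is NOT the paper's length-independent bound for SRNs.
% P: prior over hypotheses, Q: posterior, m: sample size, delta: confidence level,
% R(h): true risk, \hat{R}_S(h): empirical risk on the sample S.
\[
\mathbb{E}_{h \sim Q}\!\big[R(h)\big]
\;\le\;
\mathbb{E}_{h \sim Q}\!\big[\hat{R}_S(h)\big]
+ \sqrt{\frac{\mathrm{KL}(Q \,\|\, P) + \ln\!\frac{2\sqrt{m}}{\delta}}{2m}}
\qquad \text{with probability at least } 1-\delta .
\]
```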
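The sketch below is a minimal, hedged illustration of the two saturation regimes described informally in the abstract, assuming tanh activations; the names `beta` and `window`, the helper functions, and the thresholds are hypothetical paraphrases, not the paper's formal definitions.

```python
# Illustrative sketch (not the paper's formal definitions): checks whether the
# hidden-state trajectory of a tanh SRN is "beta-saturated" (every component of
# every hidden state lies within beta of the activation boundaries +-1) or
# "DS beta-saturated" (each component saturates at least once in every sliding
# window of length `window`). `beta` and `window` are assumed names here.
import numpy as np

def run_srn(W, U, b, xs, h0=None):
    """Run a simple recurrent network h_t = tanh(W h_{t-1} + U x_t + b)."""
    d = W.shape[0]
    h = np.zeros(d) if h0 is None else h0
    hs = []
    for x in xs:
        h = np.tanh(W @ h + U @ x + b)
        hs.append(h)
    return np.stack(hs)  # shape (T, d)

def is_beta_saturated(hs, beta):
    """Every component of every hidden state is within beta of the tanh boundaries."""
    return bool(np.all(np.abs(hs) >= 1.0 - beta))

def is_ds_beta_saturated(hs, beta, window):
    """Each component saturates at least once in every sliding window of the given size."""
    sat = np.abs(hs) >= 1.0 - beta  # (T, d) boolean saturation mask
    T = sat.shape[0]
    if T < window:
        return bool(np.all(sat.any(axis=0)))
    for t in range(T - window + 1):
        if not np.all(sat[t:t + window].any(axis=0)):
            return False
    return True

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    d, p, T = 4, 3, 50
    W = 2.0 * rng.standard_normal((d, d))  # larger weights push tanh toward saturation
    U = rng.standard_normal((d, p))
    b = rng.standard_normal(d)
    xs = rng.standard_normal((T, p))
    hs = run_srn(W, U, b, xs)
    print("beta-saturated:   ", is_beta_saturated(hs, beta=0.1))
    print("DS beta-saturated:", is_ds_beta_saturated(hs, beta=0.1, window=5))
```

As a design note, the DS (sliding-window) check is the weaker of the two conditions: any trajectory passing the strict per-step check also passes the windowed one, which mirrors the abstract's remark that the second regime is the one closer to practical observations.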