
Secure Protocols for Best Arm Identification in Federated Stochastic Multi-Armed Bandits

  • Additional Information
    • Contributors:
      Laboratoire d'Informatique de Grenoble (LIG); Centre National de la Recherche Scientifique (CNRS)-Université Grenoble Alpes (UGA)-Institut polytechnique de Grenoble - Grenoble Institute of Technology (Grenoble INP); Université Grenoble Alpes (UGA); Université de Bordeaux (UB); Laboratoire d'Informatique, de Modélisation et d'Optimisation des Systèmes (LIMOS); Ecole Nationale Supérieure des Mines de St Etienne (ENSM ST-ETIENNE)-Centre National de la Recherche Scientifique (CNRS)-Université Clermont Auvergne (UCA)-Institut national polytechnique Clermont Auvergne (INP Clermont Auvergne); Université Clermont Auvergne (UCA)-Université Clermont Auvergne (UCA); ANR-19-P3IA-0003, MIAI, MIAI @ Grenoble Alpes (2019); European Project: H2020, INODE; European Project: 952215, TAILOR (2020)
    • Publication Information:
      HAL CCSD
      Institute of Electrical and Electronics Engineers
    • Subject:
      2023
    • Collection:
      Université Grenoble Alpes: HAL
    • Abstract:
      The stochastic multi-armed bandit is a classical reinforcement learning model in which a learning agent sequentially chooses an action (pulls a bandit arm) and the environment responds with a stochastic reward drawn from an unknown distribution associated with the chosen action. A popular objective for the agent is to identify the arm with the maximum expected reward, known as the best arm identification problem. We address the security concerns that arise in a cross-silo federated learning setting, where multiple data owners collaborate under the orchestration of a server to execute a best arm identification algorithm. We propose three secure protocols, which guarantee desirable security properties for the input data (i.e., reward values), the intermediate data (i.e., sums of rewards), and the output data (i.e., the ranking of arms and in particular the identified best arm). Each protocol has a different architecture, uses different techniques, and offers a different trade-off with respect to several criteria that we thoroughly analyze: number of participants, generality of the supported reward functions, cryptographic overhead, and communication cost. To build our protocols, we rely on secure multi-party computation, AES-CBC, and the additive homomorphic property of the Paillier cryptosystem.
    • Relation:
      info:eu-repo/grantAgreement//H2020/EU/863410/INODE; info:eu-repo/grantAgreement//952215/EU/Foundations of Trustworthy AI - Integrating Reasoning, Learning and Optimization/TAILOR; hal-03595189; https://inria.hal.science/hal-03595189
    • Identifier:
      10.1109/TDSC.2022.3154585
    • Online Access:
      https://inria.hal.science/hal-03595189
      https://doi.org/10.1109/TDSC.2022.3154585
    • Rights:
      http://creativecommons.org/licenses/by/
    • Identifier:
      edsbas.D4C6C88B