Proactive Detection of Voice Cloning with Localized Watermarking

Authors: Roman, Robin San; Fernandez, Pierre; Elsahar, Hady; Défossez, Alexandre; Furon, Teddy; Tran, Tuan
Source:
Proceedings of the 41st International Conference on Machine Learning (ICML 2024), PMLR, Jul 2024, Vienna, Austria, pp. 1-17 ; https://hal.science/hal-04610152
Subject:
voice cloning; watermarking; [INFO]Computer Science [cs]
Record type:
conference object
Language:
English

Additional information
- Contributors:
  Facebook AI Research Paris (FAIR); Facebook; Speech Modeling for Facilitating Oral-Based Communication (MULTISPEECH); Inria Nancy - Grand Est; Institut National de Recherche en Informatique et en Automatique (Inria); Department of Natural Language Processing & Knowledge Discovery (LORIA - NLPKD); Laboratoire Lorrain de Recherche en Informatique et ses Applications (LORIA); Université de Lorraine (UL); Centre National de la Recherche Scientifique (CNRS); Creating and exploiting explicit links between multimedia fragments (LinkMedia); Inria Rennes – Bretagne Atlantique; SIGNAL, IMAGE ET LANGAGE (IRISA-D6); Institut de Recherche en Informatique et Systèmes Aléatoires (IRISA); Université de Rennes (UR); Institut National des Sciences Appliquées - Rennes (INSA Rennes); Institut National des Sciences Appliquées (INSA); Université de Bretagne Sud (UBS); École normale supérieure - Rennes (ENS Rennes); CentraleSupélec; IMT Atlantique (IMT Atlantique); Institut Mines-Télécom Paris (IMT); Kyutai; PMLR; ANR-20-CHIA-0011, SAIDA, Sécurité de l'Intelligence Artificielle pour des Applications Défense (2020)
- Publisher information:
  HAL CCSD
- Subject:
  2024
- Collection:
  Université de Rennes 1: Publications scientifiques (HAL)
- Subject:
  Vienna; Austria
- Abstract:
  In the rapidly evolving field of speech generative models, there is a pressing need to ensure audio authenticity against the risks of voice cloning. We present AudioSeal, the first audio watermarking technique designed specifically for localized detection of AI-generated speech. AudioSeal employs a generator/detector architecture trained jointly with a localization loss to enable localized watermark detection up to the sample level, and a novel perceptual loss inspired by auditory masking that enables AudioSeal to achieve better imperceptibility. AudioSeal achieves state-of-the-art performance in terms of robustness to real-life audio manipulations and imperceptibility based on automatic and human evaluation metrics. Additionally, AudioSeal is designed with a fast, single-pass detector that significantly surpasses existing models in speed, achieving detection up to two orders of magnitude faster, making it ideal for large-scale and real-time applications. Code is available at https://github.com/facebookresearch/audioseal.
- Relation:
  info:eu-repo/semantics/altIdentifier/arxiv/2401.17264; ARXIV: 2401.17264
- Online access:
  https://hal.science/hal-04610152
  https://hal.science/hal-04610152v1/document
  https://hal.science/hal-04610152v1/file/AudioWatermarking-2.pdf
- Rights:
  http://creativecommons.org/licenses/by/ ; info:eu-repo/semantics/OpenAccess
- Accession number:
  edsbas.7366E9C0
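
Illustrative note: the abstract describes watermark detection that is localized "up to the sample level". As a rough sketch of what that localization makes possible, the Python below turns a sequence of per-sample detection scores into time spans flagged as watermarked. It does not use the AudioSeal API; the function name `watermarked_segments`, the 0.5 threshold, and the toy scores are all assumptions made for illustration, and a real detector's output format may differ.

```python
from typing import List, Tuple


def watermarked_segments(
    scores: List[float],
    sample_rate: int,
    threshold: float = 0.5,  # assumed cut-off, not taken from the paper
) -> List[Tuple[float, float]]:
    """Return (start_s, end_s) spans whose per-sample score exceeds threshold."""
    segments: List[Tuple[float, float]] = []
    start = None  # index where the current flagged run began, if any
    for i, score in enumerate(scores):
        flagged = score > threshold
        if flagged and start is None:
            start = i  # a flagged run begins here
        elif not flagged and start is not None:
            # the run ended just before sample i
            segments.append((start / sample_rate, i / sample_rate))
            start = None
    if start is not None:
        # the run extends to the end of the signal
        segments.append((start / sample_rate, len(scores) / sample_rate))
    return segments


# Toy per-sample scores at a made-up 4 Hz sample rate (hypothetical data).
toy_scores = [0.1, 0.2, 0.9, 0.95, 0.8, 0.1, 0.7, 0.9]
print(watermarked_segments(toy_scores, sample_rate=4))
# prints [(0.5, 1.25), (1.5, 2.0)]
```

With sample-level scores, a spliced-in synthetic phrase inside otherwise genuine speech would show up as a short flagged span rather than a single yes/no verdict for the whole file.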