1000 African Voices: Advancing inclusive multi-speaker multi-accent speech synthesis

Authors: Ogun, Sewade; Owodunni, Abraham T.; Olatunji, Tobi; Alese, Eniola; Oladimeji, Babatunde; Afonja, Tejumade; Olaleye, Kayode; Etori, Naome A.; Adewumi, Tosin
Source:
Interspeech 2024 ; https://hal.science/hal-04663033 ; Interspeech 2024, Sep 2024, Kos Island, Greece
Subject terms:
text-to-speech; African-accented TTS; accented speech; multi-accent TTS; multi-speaker TTS; [INFO.INFO-SD]Computer Science [cs]/Sound [cs.SD]; [INFO]Computer Science [cs]; [INFO.INFO-AI]Computer Science [cs]/Artificial Intelligence [cs.AI]
Record type:
conference object
Language:
English

Additional information
- Contributors:
  Speech Modeling for Facilitating Oral-Based Communication (MULTISPEECH); Inria Nancy - Grand Est; Institut National de Recherche en Informatique et en Automatique (Inria)-Institut National de Recherche en Informatique et en Automatique (Inria)-Department of Natural Language Processing & Knowledge Discovery (LORIA - NLPKD); Laboratoire Lorrain de Recherche en Informatique et ses Applications (LORIA); Institut National de Recherche en Informatique et en Automatique (Inria)-Université de Lorraine (UL)-Centre National de la Recherche Scientifique (CNRS)-Institut National de Recherche en Informatique et en Automatique (Inria)-Université de Lorraine (UL)-Centre National de la Recherche Scientifique (CNRS)-Laboratoire Lorrain de Recherche en Informatique et ses Applications (LORIA); Institut National de Recherche en Informatique et en Automatique (Inria)-Université de Lorraine (UL)-Centre National de la Recherche Scientifique (CNRS)-Université de Lorraine (UL)-Centre National de la Recherche Scientifique (CNRS); Luleå University of Technology = Luleå Tekniska Universitet (LUT)
- Publisher information:
  HAL CCSD
- Publication year:
  2024
- Collection:
  Université de Lorraine: HAL
- Subject (geographic):
  Kos Island; Greece
- Abstract:
  Accepted at Interspeech 2024 ; International audience ; Recent advances in speech synthesis have enabled many useful applications like audio directions in Google Maps, screen readers, and automated content generation on platforms like TikTok. However, these systems are mostly dominated by voices sourced from data-rich geographies with personas representative of their source data. Although 3000 of the world's languages are domiciled in Africa, African voices and personas are under-represented in these systems. As speech synthesis becomes increasingly democratized, it is desirable to increase the representation of African English accents. We present Afro-TTS, the first pan-African accented English speech synthesis system able to generate speech in 86 African accents, with 1000 personas representing the rich phonological diversity across the continent for downstream application in Education, Public Health, and Automated Content Creation. Speaker interpolation retains naturalness and accentedness, enabling the creation of new voices.
- Relation:
  info:eu-repo/semantics/altIdentifier/arxiv/2406.11727; hal-04663033; https://hal.science/hal-04663033; https://hal.science/hal-04663033/document; https://hal.science/hal-04663033/file/1000_African_Voices___Interspeech_2024.pdf; ARXIV: 2406.11727
- Rights:
  http://creativecommons.org/licenses/by/ ; info:eu-repo/semantics/OpenAccess
- Accession number:
  edsbas.1DAB836F
