Continual Learning for On-Device Speech Recognition using Disentangled Conformers


Authors: Diwan, Anuj; Yeh, Ching-Feng; Hsu, Wei-Ning; Tomasello, Paden; Choi, Eunsol; Harwath, David; Mohamed, Abdelrahman
Subjects:
Electrical Engineering and Systems Science - Audio and Speech Processing; Computer Science - Computation and Language; Computer Science - Machine Learning; Computer Science - Sound
Record type:
Working Paper
Online access:
http://arxiv.org/abs/2212.01393

Additional information
- Publication year:
  2022
- Collection:
  Computer Science
- Abstract:
  Automatic speech recognition research focuses on training and evaluating on static datasets. Yet, as speech models are increasingly deployed on personal devices, such models encounter user-specific distributional shifts. To simulate this real-world scenario, we introduce LibriContinual, a continual learning benchmark for speaker-specific domain adaptation derived from LibriVox audiobooks, with data corresponding to 118 individual speakers and six training splits of different sizes per speaker. Additionally, current speech recognition models and continual learning algorithms are not optimized to be compute-efficient. We adapt NetAug, a general-purpose training algorithm, for ASR and create a novel Conformer variant called the DisConformer (Disentangled Conformer). This algorithm produces ASR models consisting of a frozen 'core' network for general-purpose use and several tunable 'augment' networks for speaker-specific tuning. Using such models, we propose a novel compute-efficient continual learning algorithm called DisentangledCL. Our experiments show that the DisConformer models significantly outperform baselines on general ASR, i.e., LibriSpeech (by 15.58% relative WER on test-other). On speaker-specific LibriContinual, they significantly outperform trainable-parameter-matched baselines (by 20.65% relative WER on test) and even match fully fine-tuned baselines in some settings.
  Comment: 8 pages, 2 figures. Submitted to ICASSP 2023
- DOI:
  10.1109/ICASSP49357.2023.10095484
- Accession number:
  edsarx.2212.01393
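
Note on the metric: the abstract reports its results as relative WER figures. The sketch below shows one common way such figures are computed, assuming "rel. WER" means the relative reduction in word error rate against a baseline; the record itself does not define the term. The function names `word_error_rate` and `relative_wer_reduction` are hypothetical, and the numbers in the usage note are made-up illustrations, not results from the paper.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = word-level edit distance / number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] is the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


def relative_wer_reduction(baseline_wer: float, model_wer: float) -> float:
    """Percentage by which model_wer improves on baseline_wer (assumed definition)."""
    if baseline_wer <= 0:
        raise ValueError("baseline_wer must be positive")
    return 100.0 * (baseline_wer - model_wer) / baseline_wer
```

For example, with a hypothetical baseline WER of 10.0 and a hypothetical model WER of 8.0, `relative_wer_reduction(10.0, 8.0)` returns `20.0`, that is, a 20% relative reduction even though the absolute difference is only 2 points.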
