CTC-DRO: Robust Optimization for Reducing Language Disparities in Speech Recognition
- Additional Information
- Publisher Information:
2025-02-03 2025-03-05
- Abstract:
Modern deep learning models often achieve high overall performance, but consistently fail on specific subgroups. Group distributionally robust optimization (group DRO) addresses this problem by minimizing the worst-group loss, but it fails when group losses misrepresent performance differences between groups. This is common in domains like speech, where the widely used connectionist temporal classification (CTC) loss scales with input length and varies with linguistic and acoustic properties, leading to spurious differences between group losses. We present CTC-DRO, which addresses the shortcomings of the group DRO objective by smoothing the group weight update to prevent overemphasis on consistently high-loss groups, while using input length-matched batching to mitigate CTC's scaling issues. We evaluate CTC-DRO on the task of multilingual automatic speech recognition (ASR) across five language sets from the ML-SUPERB 2.0 benchmark. CTC-DRO consistently outperforms group DRO and CTC-based baseline models, reducing the worst-language error by up to 47.1% and the average error by up to 32.9%. CTC-DRO can be applied to ASR with minimal computational costs, and offers the potential for reducing group disparities in other domains with similar challenges.
- Subject:
- Availability:
Open access content.
- Other Numbers:
COO oai:arXiv.org:2502.01777
1504922156
- Contributing Source:
CORNELL UNIV
From OAIster®, provided by the OCLC Cooperative.
- Accession Number:
edsoai.on1504922156
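
The abstract notes that group DRO minimizes the worst-group loss and can overemphasize groups whose losses are consistently high. The following is a minimal sketch of that behavior, assuming the commonly used exponentiated-gradient form of the group weight update. The abstract does not spell out this update, and the function names, step size, and loss values are hypothetical. It does not reproduce CTC-DRO's smoothed update or its length-matched batching, neither of which is specified in this record.

```python
import math

def group_dro_update(weights, group_losses, step_size=0.1):
    """One exponentiated-gradient step on the group weights, then renormalize.

    Assumed form of the standard group DRO update, not the CTC-DRO update.
    """
    scaled = [w * math.exp(step_size * loss)
              for w, loss in zip(weights, group_losses)]
    total = sum(scaled)
    return [s / total for s in scaled]

def weighted_loss(weights, group_losses):
    """Training objective: group losses combined with the current weights."""
    return sum(w * loss for w, loss in zip(weights, group_losses))

# Hypothetical per-language losses. The third is inflated, for example by
# longer inputs, rather than by genuinely worse recognition.
losses = [1.0, 1.2, 3.0]
weights = [1 / 3, 1 / 3, 1 / 3]

for _ in range(50):
    weights = group_dro_update(weights, losses)

objective = weighted_loss(weights, losses)
```

With these made-up numbers, nearly all of the weight ends up on the third group, so the objective is dominated by a loss gap that may not reflect a real performance gap. This is the failure mode the abstract attributes to plain group DRO when it is used with CTC losses.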