Comparing phoneme recognition systems on the detection and diagnosis of reading mistakes for young children's oral reading evaluation

  • Additional information
    • Contributors:
      Équipe Structuration, Analyse et MOdélisation de documents Vidéo et Audio (IRIT-SAMoVA); Institut de recherche en informatique de Toulouse (IRIT); Université Toulouse Capitole (UT Capitole); Université de Toulouse (UT)-Université de Toulouse (UT)-Université Toulouse - Jean Jaurès (UT2J); Université de Toulouse (UT)-Université Toulouse III - Paul Sabatier (UT3); Université de Toulouse (UT)-Centre National de la Recherche Scientifique (CNRS)-Institut National Polytechnique (Toulouse) (Toulouse INP); Université de Toulouse (UT)-Toulouse Mind & Brain Institut (TMBI); Université Toulouse - Jean Jaurès (UT2J); Université de Toulouse (UT)-Université de Toulouse (UT)-Université Toulouse III - Paul Sabatier (UT3); Université de Toulouse (UT)-Université Toulouse Capitole (UT Capitole); Université de Toulouse (UT); Lalilo, Paris
    • Publication information:
      HAL CCSD
      International Speech Communication Association
    • Subject:
      2023
    • Collection:
      Archive ouverte HAL (Hyper Article en Ligne, CCSD - Centre pour la Communication Scientifique Directe)
    • Subject:
    • Abstract:
      International audience ; In the scope of our oral reading exercise for 5-8-year-old children, models need to be able to precisely detect and diagnose reading mistakes, which remains a considerable challenge even for state-of-the-art ASR systems. In this paper, we compare hybrid and end-to-end acoustic models trained for phoneme recognition on young learners' speech. We evaluate them not only with phoneme error rates but through detailed phoneme-level misread detection and diagnostic metrics. We show that a traditional TDNNF-HMM model, despite a high PER, is the best at detecting reading mistakes (F1-score 72.6%), but at the cost of low precision (73.8%) and specificity (74.7%), which is pedagogically critical. A recent Transformer+CTC model, to which we applied our synthetic reading mistakes augmentation method, obtains the highest precision (81.8%) and specificity (86.3%), as well as the highest correct diagnosis rate (70.7%), showing it is the best fit for our application.
    • Relation:
      hal-04190328; https://hal.science/hal-04190328; https://hal.science/hal-04190328/document; https://hal.science/hal-04190328/file/Comparing%20phoneme%20recognition%20systems%20on%20the%20detection%20and%20diagnosis%20of%20reading%20mistakes%20for%20young%20childrens%20oral%20reading%20evaluation.pdf
    • Identifier:
      10.21437/slate.2023-2
    • Rights:
      info:eu-repo/semantics/OpenAccess
    • Identifier:
      edsbas.CA48F473
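
The abstract reports phoneme error rate (PER) alongside misread-detection precision, specificity, and F1-score. As a reading aid, the following is a minimal sketch of how such metrics are commonly computed. It is not the authors' evaluation code. It assumes that the positive class is "phoneme flagged as misread" and that PER is the Levenshtein distance between reference and recognized phoneme sequences divided by the reference length. The names `DetectionCounts`, `detection_metrics`, and `phoneme_error_rate` are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Sequence


@dataclass
class DetectionCounts:
    """Phoneme-level counts (assumption: positive = flagged as misread)."""
    tp: int  # misread phonemes that were flagged
    fp: int  # correctly read phonemes that were wrongly flagged
    tn: int  # correctly read phonemes that were accepted
    fn: int  # misread phonemes that were missed


def _ratio(num: float, den: float) -> float:
    # Return 0.0 instead of raising when a denominator is empty.
    return num / den if den else 0.0


def detection_metrics(c: DetectionCounts) -> Dict[str, float]:
    """Standard precision, recall, specificity and F1 from the counts."""
    precision = _ratio(c.tp, c.tp + c.fp)
    recall = _ratio(c.tp, c.tp + c.fn)
    specificity = _ratio(c.tn, c.tn + c.fp)
    f1 = _ratio(2 * precision * recall, precision + recall)
    return {
        "precision": precision,
        "recall": recall,
        "specificity": specificity,
        "f1": f1,
    }


def phoneme_error_rate(ref: Sequence[str], hyp: Sequence[str]) -> float:
    """Levenshtein distance between phoneme sequences over reference length."""
    if not ref:
        raise ValueError("reference sequence must not be empty")
    prev = list(range(len(hyp) + 1))  # distances against an empty reference
    for i, r in enumerate(ref, 1):
        cur = [i]  # deleting the first i reference phonemes
        for j, h in enumerate(hyp, 1):
            cur.append(min(
                prev[j] + 1,             # deletion
                cur[j - 1] + 1,          # insertion
                prev[j - 1] + (r != h),  # substitution or match
            ))
        prev = cur
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # Illustrative counts only; these are not figures from the paper.
    print(detection_metrics(DetectionCounts(tp=8, fp=2, tn=18, fn=2)))
    print(phoneme_error_rate(["b", "a", "t"], ["p", "a", "t"]))
```

Under these assumptions, low specificity means that correctly read phonemes are often flagged as mistakes, which fits the abstract's remark that precision and specificity matter pedagogically.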