Item request has been placed! ×
Item request cannot be made. ×
loading  Processing Request

Acoustic-to-articulatory inversion by analysis-by-synthesis using cepstral coefficients

Item request has been placed! ×
Item request cannot be made. ×
loading   Processing Request
  • معلومة اضافية
    • Contributors:
      Analysis, perception and recognition of speech (PAROLE); Inria Nancy - Grand Est; Institut National de Recherche en Informatique et en Automatique (Inria)-Institut National de Recherche en Informatique et en Automatique (Inria)-Department of Natural Language Processing & Knowledge Discovery (LORIA - NLPKD); Laboratoire Lorrain de Recherche en Informatique et ses Applications (LORIA); Institut National de Recherche en Informatique et en Automatique (Inria)-Université de Lorraine (UL)-Centre National de la Recherche Scientifique (CNRS)-Institut National de Recherche en Informatique et en Automatique (Inria)-Université de Lorraine (UL)-Centre National de la Recherche Scientifique (CNRS)-Laboratoire Lorrain de Recherche en Informatique et ses Applications (LORIA); Institut National de Recherche en Informatique et en Automatique (Inria)-Université de Lorraine (UL)-Centre National de la Recherche Scientifique (CNRS)-Université de Lorraine (UL)-Centre National de la Recherche Scientifique (CNRS)
    • بيانات النشر:
      HAL CCSD
    • الموضوع:
      2013
    • Collection:
      Université de Lorraine: HAL
    • الموضوع:
    • الموضوع:
      Montréal, Canada
    • نبذة مختصرة :
      International audience ; This paper deals with acoustic to articulatory inversion of speech by using an analysis by synthesis approach. We used old X-ray films of one speaker to (i) the develop a linear articulatory model presenting a small geometric mismatch with the subject's vocal tract mid sagittal images (ii) and design an adaptation procedure of cepstral vectors used as input data. The adaptation exploits the bilinear transform to warp the frequency scale in order to compensate for deviation between synthetic and natural speech. This enables the comparison of natural speech against synthetic speech without using cepstral liftering. A codebook is used to represent the forward articulatory to acoustic mapping and we designed a loose matching algorithm using spectral peaks to access it. This algorithm, based on dynamic programming, allows some peaks in either synthetic spectra (stored in the codebook) or natural spectra (to be inverted) to be omitted. Quadratic programming is used to improve the acoustic proximity near each good candidate found during codebook exploration. The inversion has been tested on speech signals corresponding to the X-ray films. It achieves a very good geometric precision of 1.5 mm over the whole tongue shape unlike similar works evaluating the error at 3 or 4 points corresponding to sensors located at the front of the tongue.
    • Relation:
      hal-00836808; https://inria.hal.science/hal-00836808; https://inria.hal.science/hal-00836808/document; https://inria.hal.science/hal-00836808/file/inversioWithAbstract.pdf
    • الدخول الالكتروني :
      https://inria.hal.science/hal-00836808
      https://inria.hal.science/hal-00836808/document
      https://inria.hal.science/hal-00836808/file/inversioWithAbstract.pdf
    • Rights:
      info:eu-repo/semantics/OpenAccess
    • الرقم المعرف:
      edsbas.79A8758B