Integration of context-dependent durational knowledge into HMM-based speech recognition

Xue Wang, Louis F. M. ten Bosch & Louis C. W. Pols

This paper presents research on integrating context-dependent durational knowledge into HMM-based speech recognition. The first part of the paper presents work on obtaining relations between the parameters of the context-free HMMs and their durational behaviour, in preparation for the context-dependent durational modelling presented in the second part. Duration integration is realised via rescoring in the post-processing step of our N-best monophone recogniser. We use the multi-speaker TIMIT database for our analyses.

  1. Introduction
  2. dpdf of standard HMM
    1. Obtaining the dpdf of the whole HMM
    2. Analysis of whole-model dpdf
  3. ML-training constrained with durational statistics
  4. Analysis of context-dependent durational statistics
  5. Integration of CD-duration models in post-processing
    1. Word-juncture modelling
    2. Duration score
    3. Re-scoring
  6. Discussion