ACOUSTICAL CHARACTERISTICS OF SOUND PRODUCTION OF DEAF AND NORMALLY HEARING INFANTS

Chris J. Clement, Florien J. Koopmans-van Beinum and Louis C.W. Pols

Institute of Phonetic Sciences & IFOTT, University of Amsterdam
Herengracht 338, 1016 CG Amsterdam, The Netherlands
email: clement@fon.let.uva.nl

ABSTRACT

Several recent studies have shown that speech production develops in an organized way, already in the first twelve months of life. This development is determined by several factors such as anatomical growth and physiological constraints. Studying the sound production of deaf infants and comparing it with that of normally hearing infants can give more insight into the role of auditory speech perception in sound production. So far, no systematic work has been reported on the development of sound production of deaf infants in the first months of life. The present study is intended to address this topic in a systematic and controlled way. Preliminary results indicate differences in sound production between deaf and normally hearing infants, for instance with respect to utterance duration, even within the first half year of life. These findings strongly suggest that already in this early stage of speech development sound production is not solely determined by anatomical and physical constraints, but also by auditory perception and feedback. These results may contribute to a better understanding of the current models of early speech acquisition.

1. INTRODUCTION

Some recent studies suggest that the speech production of hearing impaired infants deviates from that of normally hearing infants in the first year of life [e.g. 3]. No canonical babbling was found in deaf infants before the age of eleven months, while most hearing infants start babbling before that age [6]. In several studies differences were observed in consonantal features and phonetic repertoire size [e.g. 9].

Until now - to our knowledge - no systematic study has been performed on acoustical characteristics, such as duration and F0, of the sound production of deaf infants starting within the first half year of life. Most studies [e.g. 3] do not investigate infants systematically before the end of the first year of life. Kent et al. [3] studied fundamental frequency and formant frequencies of the sound productions of an eight-month-old deaf infant and compared these with those of his hearing twin brother. The deaf infant showed a higher peak F0 and a slightly larger range of peak F0 values than his hearing brother. More variation within the utterances, such as strong changes within the F0 contour and intervals of vocal fry, was observed as well.

The present study reports on longitudinal data of six deaf and six normally hearing infants between 2.5 (or somewhat older) and 11.5 months of age. The main question we address is: do hearing impaired infants differ from normally hearing infants with respect to duration and F0?

2. METHOD

2.1. Subjects

Twelve mother-infant pairs participated in this study: six profoundly hearing impaired infants (group HI) and six matched infants with normal hearing (group NH). Developmental tests revealed no clear health problems, such as cognitive or motor delays. The HI infants had an average hearing loss of 90 dB or more in the better ear, established by Auditory Brainstem Response audiometry (ABR) in the first six months of life. This profound hearing loss was confirmed by several pure-tone audiometric tests at later ages. Hearing aids were regularly used by three subjects within the period studied.

The NH infants were matched with the HI infants on the following criteria: sex, birth order, duration of pregnancy, age of the mother, socio-economic status of the parents, and dialect of the parents. All NH infants were recorded from the age of 2.5 months onwards, two HI infants from the age of 2.5 months, one from 3.5 months and three from the age of 5.5 months onwards.

2.2. Analysis Procedures

Audio recordings, lasting about half an hour each, were made every two weeks. Of every monthly audio recording, the first 10 minutes were transcribed. Two trained phoneticians performed and verified the transcriptions. The inter-judge agreement based on all material (108 recordings) was 93 % for the infant utterances. An infant utterance was defined as a sound production during one breath cycle starting with inhalation. Laughing, crying and vegetative sounds were not taken into account. The number of infant utterances during the first 10 minutes was counted (see [2] for more details about the subjects and the analysis procedure).

Fifty infant utterances per recording were selected evenly from the transcribed ten minutes. All 5400 (108 times 50) utterances were digitized with a sampling frequency of 48 kHz and stored for further analysis. The duration was measured in ms, with the utterance boundaries placed on positive zero-crossings where possible.
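The duration measurement can be illustrated with a minimal sketch. It assumes that rough, hand-placed start and end sample indices are available and that each is moved to the nearest positive zero-crossing; this snapping rule, the function names and the test signal are hypothetical and are not taken from the procedure actually used.

```python
SAMPLE_RATE_HZ = 48000  # sampling frequency stated in the text


def positive_zero_crossings(samples):
    """Indices i where the signal rises from <= 0 at i-1 to > 0 at i."""
    return [i for i in range(1, len(samples))
            if samples[i - 1] <= 0 < samples[i]]


def snap_to_crossing(crossings, index):
    """Move a rough boundary to the nearest positive zero-crossing.

    If the signal has no such crossing, the rough boundary is kept,
    mirroring the 'where possible' wording (an assumption).
    """
    if not crossings:
        return index
    return min(crossings, key=lambda c: abs(c - index))


def utterance_duration_ms(samples, rough_start, rough_end,
                          sample_rate_hz=SAMPLE_RATE_HZ):
    """Duration in ms between the two snapped boundaries."""
    crossings = positive_zero_crossings(samples)
    start = snap_to_crossing(crossings, rough_start)
    end = snap_to_crossing(crossings, rough_end)
    return 1000.0 * (end - start) / sample_rate_hz


# Hypothetical test signal: a square wave with a period of 100 samples,
# so positive zero-crossings fall at samples 100, 200, ..., 900.
signal = [1 if (i % 100) < 50 else -1 for i in range(1000)]
duration = utterance_duration_ms(signal, rough_start=95, rough_end=912)
```

Placing boundaries on zero-crossings avoids cutting the waveform at a non-zero amplitude, which would otherwise introduce a click when the excerpt is played back.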

F0 measurements were performed with an autocorrelation algorithm [1]. This algorithm proved to be more accurate than the commonly used methods for speech analysis when tested with signals with additive noise or jitter. However, the deviant phonation of infants compared to adults caused specific problems with F0 measurements, for instance because of the high F0 (on average 361 Hz for NH infants and 382 Hz for HI infants). An extremely high F0 often caused interference with the first formant. For instance, 190 utterances turned out to have a maximum F0 of over 1000 Hz and 17 utterances were found to have a maximum F0 of even over 2000 Hz. Another example of peculiar phonation was an extremely low F0 because of vocal fry (creaky voice), found especially within the HI group. Irregular periodicity caused by the use of the false vocal cords was also found. Of the total of 5400 utterances, 266 were judged to be unvoiced, but in many of these cases the algorithm nevertheless found a periodicity in the signal. This was mainly caused by periodic velar trill sounds in the case of the HI infants and by periodic bilabial or tongue trill sounds in the case of the NH infants. These problems led to measurement errors in 36 % of the utterances. A solution was found in checking each utterance by comparing the original sound wave and the synthesized F0 contour auditorily and visually. If necessary, the F0 contour was corrected.
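The general idea behind autocorrelation-based F0 estimation can be sketched as follows. This is a deliberately plain illustration and not the algorithm of [1], which includes refinements that are omitted here. The search range (75 to 2500 Hz), the voicing threshold of 0.3 and all names are assumptions chosen for the example.

```python
import math


def estimate_f0_autocorr(frame, sample_rate_hz,
                         f0_min=75.0, f0_max=2500.0, voicing_threshold=0.3):
    """Estimate F0 (Hz) of one frame from its normalized autocorrelation.

    Returns None when the frame is silent or no sufficiently strong
    periodicity is found. All default values are hypothetical.
    """
    n = len(frame)
    mean = sum(frame) / n
    x = [s - mean for s in frame]          # remove DC offset
    energy = sum(v * v for v in x)
    if energy == 0.0:
        return None

    lag_min = max(1, int(sample_rate_hz / f0_max))
    lag_max = min(n - 2, int(sample_rate_hz / f0_min))

    # Normalized autocorrelation for every candidate lag.
    r = {}
    for lag in range(lag_min, lag_max + 1):
        r[lag] = sum(x[i] * x[i + lag] for i in range(n - lag)) / energy

    # Keep only local maxima, so that the high correlation at very short
    # lags is not mistaken for a pitch period.
    best_lag, best_r = None, voicing_threshold
    for lag in range(lag_min + 1, lag_max):
        if r[lag] > r[lag - 1] and r[lag] >= r[lag + 1] and r[lag] > best_r:
            best_lag, best_r = lag, r[lag]

    return None if best_lag is None else sample_rate_hz / best_lag


# Hypothetical test frame: 30 ms of a 400 Hz sine sampled at 48 kHz.
sr = 48000
frame = [math.sin(2 * math.pi * 400 * i / sr) for i in range(1440)]
f0 = estimate_f0_autocorr(frame, sr)
```

Even a sketch like this shows why trills in otherwise unvoiced utterances can be reported as voiced: any sufficiently periodic signal produces a strong autocorrelation peak, regardless of whether the periodicity originates at the glottis.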


3. RESULTS

3.1. Utterance duration

In figure 1 the mean utterance duration of the 50 selected utterances is presented per month, as well as the average duration over the ten months for both groups. It can be observed that the mean utterance duration for the ten months combined is somewhat longer for the HI infants (940 ms, sd=752 ms) than for the NH infants (915 ms, sd=758 ms), although this difference is not significant.

An analysis of variance shows a significant effect for age and for the interaction group*age (p<.00001). A Tukey post-hoc test indicates that this interaction effect is caused by the long utterance duration at 3.5 months for the NH infants. The mean duration (see figure 2) at 3.5 months of the NH infants is considerably longer than that of the HI infants and than that at any other age in the studied period (p<.00005). A Tukey post-hoc test on the data at 3.5 months shows a significant difference (p<.01) between the three HI infants and each of the NH infants, except for one NH infant who produces the peak in duration at 4.5 months.


Figure 1: Mean utterance duration and standard deviations of the 50 selected utterances for the HI and the NH group per month, as well as the mean duration for the 10 months combined. (N is 300 at each age of the NH infants, except at 9.5 and 10.5 months when N = 250. N is 100, 150 and 150 at 2.5, 3.5 and 4.5 months resp. and 300 at 5.5 to 11.5 months in the case of the HI infants.)
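The N values in this caption follow from 50 utterances per recording and one recording per infant per month, and they add up to the 108 recordings and 5400 utterances mentioned in section 2.2. The short check below makes this arithmetic explicit; the variable names are illustrative only, and the number of infants per age is taken from the N values reported in the captions.

```python
UTTERANCES_PER_RECORDING = 50
AGES = [2.5 + k for k in range(10)]  # 2.5, 3.5, ..., 11.5 months

# Number of infants recorded at each age, as implied by the captions.
nh_infants = {age: 6 for age in AGES}
nh_infants[9.5] = 5
nh_infants[10.5] = 5

hi_infants = {age: 6 for age in AGES}
hi_infants[2.5] = 2
hi_infants[3.5] = 3
hi_infants[4.5] = 3

# Utterances per age (the N of figure 1) and overall totals.
nh_n = {age: k * UTTERANCES_PER_RECORDING for age, k in nh_infants.items()}
hi_n = {age: k * UTTERANCES_PER_RECORDING for age, k in hi_infants.items()}
total_recordings = sum(nh_infants.values()) + sum(hi_infants.values())
total_utterances = total_recordings * UTTERANCES_PER_RECORDING

print(total_recordings, total_utterances)
```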


Figure 2: Mean utterance duration and standard deviations of the 50 utterances for the HI and the NH group at 3.5 months. N=50 for each infant.


3.2. Fundamental frequency

In figure 3 the average median fundamental frequency of the 50 selected utterances is presented per month, as well as the average fundamental frequency over the ten months for both groups. It can be observed that the average for the ten months combined is somewhat higher for the HI infants (382 Hz, sd=170 Hz) than for the NH infants (362 Hz, sd=152 Hz). The factors group, age and the interaction group*age all turn out to be significant (p<.0001). A Tukey post-hoc test shows a significantly lower F0 at the age of 2.5 months for both groups compared to any other age (p<.005), except 3.5 months. A higher F0 of the HI infants compared to the NH infants can be found mainly above 8.5 months and turns out to be significant at 9.5, 10.5, and 11.5 months (p<.01, p<.005 and p<.05). A Tukey post-hoc test on the data of those months shows that this effect is due mainly to two HI infants with an extremely high median F0.


Figure 3: Average median F0 and standard deviations of the 50 selected utterances for the HI and the NH group per month, as well as the average median F0 for the 10 months combined. (N: see figure 1)



Figure 4: Mean number of voiceless utterances out of the 50 selected utterances for the HI and the NH group per month, as well as the mean number of utterances for the 10 months combined. (N=6 at each age of the NH infants, except at 9.5 and 10.5 months when N=5. N=2, 3 and 3 at 2.5, 3.5 and 4.5 months resp. and 6 at 5.5 to 11.5 months in the case of the HI infants.)


When the data over all months are combined, the mean peak (= maximal) F0 turns out to be significantly higher (p<.00001) for the HI infants (459 Hz, sd=264 Hz) than for the NH infants (430 Hz, sd=234 Hz). However, in none of the separate months do the differences between the groups reach significance.

To study F0 variation within the utterances, the F0 range and the F0 standard deviation of each utterance were measured. Over all months combined, both measurements show significantly more F0 variation within utterances produced by the HI infants compared to the NH infants (range: p<.0005, standard deviation: p<.005). The mean range within the utterance is 161 Hz (sd=229 Hz) for the HI infants and 137 Hz (sd=196 Hz) for the NH infants, and the mean standard deviation within the utterance is 46 Hz (sd=72 Hz) for the HI infants and 41 Hz (sd=62 Hz) for the NH infants. In none of the separate months is a significant difference between the two groups found.
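The per-utterance F0 measures used in this section (median, peak, range and standard deviation) can be computed from a frame-wise F0 contour as in the sketch below. The representation of the contour as a list with None for unvoiced frames, the use of the sample standard deviation, and the example values are assumptions made for illustration.

```python
import statistics


def f0_summary(contour_hz):
    """Summarize one utterance's F0 contour.

    contour_hz: list of F0 values in Hz, with None for unvoiced frames
    (a hypothetical representation). Returns None if fewer than two
    voiced frames are available.
    """
    voiced = [f for f in contour_hz if f is not None]
    if len(voiced) < 2:
        return None
    return {
        "median": statistics.median(voiced),
        "peak": max(voiced),
        "range": max(voiced) - min(voiced),
        # Sample standard deviation; whether the study used the sample
        # or the population formula is not stated.
        "sd": statistics.stdev(voiced),
    }


# Hypothetical contour, not measured data.
summary = f0_summary([300.0, None, 350.0, 400.0, 450.0])
```

Group means such as those reported above would then be obtained by averaging these per-utterance values over all utterances of a group.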

Despite the significant differences between the groups (when data over all months are combined) for median F0, peak F0 and F0 variation, it would be incorrect to conclude that all HI infants produce a higher median and maximal F0, and more variation within the utterances compared to all NH infants. Between 7.5 and 11.5 months we found in most months not only the subject with, on average, the highest mean and peak F0 and most variation in the HI group, but also the subject with, on average, the lowest mean and peak F0 and the least variation. Also, when studying the range and standard deviation of the 50 utterances per month it can be seen that in most cases both the subjects with, on average, the highest as well as the lowest range and standard deviation can be found in the HI group.

3.3. Number of Unvoiced Utterances

In figure 4 the mean number of voiceless utterances out of 50 utterances is presented per month, as well as averaged over the ten months for both groups. It can be observed that the average percentage for the ten months combined is higher for the HI infants (6.8 %) than for the NH infants (2.4 %). The differences start to appear at the age of 8.5 months and turn out to be significant (p <.025) according to a Mann-Whitney U test at the ages of 9.5, 10.5, and 11.5 months combined. Most of the unvoiced utterances of the HI infants are produced as velar fricative or trill sounds, while this is not the case for the NH infants.
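For readers unfamiliar with the Mann-Whitney U test, the sketch below computes the U statistic for two small samples of counts. The counts are invented for illustration and are not the data of this study; obtaining a p-value additionally requires a table of critical values or a statistics package, which is left out here.

```python
def mann_whitney_u(sample_a, sample_b):
    """Return (U_a, U_b) for two independent samples.

    U_a counts the pairs (a, b) in which a exceeds b, with ties counted
    as one half; U_a + U_b always equals len(sample_a) * len(sample_b).
    """
    u_a = 0.0
    for a in sample_a:
        for b in sample_b:
            if a > b:
                u_a += 1.0
            elif a == b:
                u_a += 0.5
    u_b = len(sample_a) * len(sample_b) - u_a
    return u_a, u_b


# Hypothetical numbers of voiceless utterances (out of 50) per recording.
hi_counts = [5, 7, 9]
nh_counts = [1, 2, 5]
u_hi, u_nh = mann_whitney_u(hi_counts, nh_counts)
```

Because the test uses only the rank order of the counts, it does not assume that the number of voiceless utterances is normally distributed, which makes it suitable for small groups with skewed count data.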

4. DISCUSSION

In summary, it seems that already within the period investigated (between 2.5 and 11.5 months of age), several differences in the speech production between HI and NH infants can be observed. The differences become clearer from about 8.5 months onwards, with respect to utterance duration, median and maximal F0, F0 variation, and voiceless sound production. This may be due to lack of auditory feedback on the speech production from that age. In the first months fewer differences between the two groups are observed. This may suggest a stronger influence of biologically determined factors (e.g. anatomical growth) rather than of auditory feedback on vocalizations in these first months compared to later periods.

We found a longer utterance duration for the NH infants at age 3.5 months compared to the HI infants (see figures 1 and 2). All NH infants produced this "peak" in the duration except one NH infant who produced the peak at 4.5 months. It seems that infants can produce longer utterances after their third month of life, when their rib cage has restructured towards the adult configuration [4]. From that age NH infants can control the duration of their utterances by regulating their sub-glottal air pressure, as is shown by examples of imitation of the duration and pitch of mother utterances by a three-month-old infant [7]. According to Lieberman it might be the case that the probably innate propensity for controlling sub-glottal air pressure and the laryngeal muscles needs to be exercised within a critical period [5]. He suggests that the lack of exercise in this period might result in the extremely poor control of sub-glottal air pressure and of the laryngeal muscles by older deaf children. The lack of the "duration peak" in the HI infants at about 3.5 months in our study seems to support this idea. After the 5th month, the HI infants produce on average somewhat longer utterances than the NH infants.

The deaf infant studied by Kent et al. [3] showed on average a higher peak F0 and a wider range in the peak F0 of his utterances, compared to his hearing twin brother. In our study we found on average a slightly higher median and peak F0 and more variation within the utterances in the sound production of the HI infants compared to that of the NH infants. However, studying the individual subjects, we did not find evidence for a clear distinction between the HI and NH subjects by means of F0. It can be concluded that the HI subjects showed more variation in their phonation than their hearing peers did. For instance, we found in the HI group more examples of vocal fry and the use of the false vocal cords (resulting in an extremely low and minimally variegated F0 within the utterance), screaming (with an extremely high and often markedly variegated F0 within the utterance), articulation without voicing, etc. So, we did not find evidence for the finding of Stark that the sound types which are characteristic of the "vocal play stage" (experimentation with squealing, growling, friction and other noises) are produced by HI infants to a limited extent only [8]. A possible explanation for this difference in results might be that Stark studied the utterances of HI infants from 15 months onwards.

5. CONCLUSION

It seems that already within the investigated period, i.e., between 2.5 and 11.5 months of age, several differences in the sound production between HI and NH infants can be observed with respect to utterance duration and F0. Our preliminary results suggest that a lack of auditory feedback influences the speech production already in this very early stage of development.

6. ACKNOWLEDGEMENTS

This study is financially supported by the Institute for the Deaf in St. Michielsgestel and the Institute for Functional Research of Language and Language Use of the University of Amsterdam.

7. REFERENCES