Meng Yang Phonetics Seminar May 9, 2016

Author: Basil Hopkins

1 Cue shifting bias between acoustically correlated cues (progress update). Meng Yang, Phonetics Seminar, May 9, 2016

2 Roadmap: Review of relevant theories; Questions and predictions; Stimuli and experimental design; Results update; Discussion

3 One contrast, multiple cues. Phonological contrasts are signaled by multiple acoustic phonetic cues. Voicing: VOT and vowel-initial f0 (e.g. Abramson & Lisker, 1985). English vowel tenseness: formant height and vowel duration (e.g. Hillenbrand et al., 2000). Obstruent place: formant transitions and energy distribution in the release burst (e.g. Francis et al., 2000). Listeners attend to more than one cue: when the primary cue is removed or masked, they can still perform categorization tasks (presumably using what they know about secondary cues). Because listeners attend to multiple cues, how much they attend to each cue can be expressed numerically as a cue weight.

4 Cue weighting & cue shifting. Primary cue: attended to most, highest cue weight. Secondary cues: attended to less, lower cue weights. Cue shifting: response to changes in the signal.

5 Factors affecting weight shift. Learning? Cue weights are learned from distributional informativeness; changes in cue weights should be proportionate to changes in distribution; associative learning (e.g. Holt et al., 2001). Substantive? Perceptual enhancement facilitates learning; changes in cue weights are constrained by auditory characteristics of cues; auditory enhancement (e.g. Kingston & Diehl, 1995). Intrinsic bias? Some cues are privileged; evidence from asymmetries in sound change.

6 Through the lens of pitch and breathiness. Acoustic measures: pitch = f0; breathiness = H1-H2. Natural negative correlation: higher pitch = less breathy (higher f0 = lower H1-H2); lower pitch = more breathy (lower f0 = higher H1-H2).

7 Inducing cue shift. Sound categorization task. Stimuli are drawn from a two-dimensional acoustic space. There are 2 stimulus sets, each with a distribution that favours one of the two cues. Listeners learn from a distribution that favours one cue, then change to a distribution that favours the other cue. To do well on the task, they have to shift cue weight to the most informative cue in the new distribution.

8 Questions and predictions. How much do cue weights shift? Learning only? Manipulation: cue distinctiveness. Prediction: the more distinctive cue should get the higher weight. Enhancement? Manipulation: natural vs. unnatural correlation between cues. Prediction: it is easier to shift weight to the distinctive cue when cues are naturally correlated. Intrinsic bias? Manipulation: order of presentation (B>P, P>B). Prediction: it is easier to shift in one of these two conditions (directionality).

9 Methods – Stimuli distribution. In each set: 2 clusters of training stimuli (black dots). Roughly: low pitch and less breathy tokens belong to one category; higher pitch and more breathy tokens belong to another category. What's different: informativeness of cues. Distinctive Pitch: on the pitch dimension, little within-category variance and no overlap between categories; on the breathiness dimension, a lot of within-category variance and a lot of overlap between categories. Pitch is a more reliable cue and breathiness is much less reliable, forcing the listener to give higher weight to pitch. Distinctive Breathiness: the opposite (breathiness more informative, pitch less informative); listeners are expected to give higher weight to breathiness. Scaling of each cue dimension: make sure they are perceptually equivalent (JND x 10). Pitch: 96-126 Hz, range 30 Hz. Breathiness (dB): range 36.7 dB.
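The two stimulus sets described above can be sketched in code. This is only an illustrative sketch, not the procedure used to build the actual stimuli: the function names, the Gaussian clusters, the means and standard deviations, the number of tokens, and the use of normalized 0-1 units on both dimensions are all assumptions made for the example.

```python
import random


def sample_cluster(n, pitch_mean, pitch_sd, breath_mean, breath_sd, rng):
    """Draw n (pitch, breathiness) tokens, clipped to a normalized 0-1 range."""
    def clip(x):
        return min(1.0, max(0.0, x))

    return [
        (clip(rng.gauss(pitch_mean, pitch_sd)),
         clip(rng.gauss(breath_mean, breath_sd)))
        for _ in range(n)
    ]


def make_stimulus_set(distinctive, n_per_category=40, seed=1):
    """Return two training clusters in which `distinctive` is the reliable cue.

    All numeric values are hypothetical. The distinctive cue gets a small
    within-category spread and well separated means; the other cue gets a
    large spread and close means, so its categories overlap heavily.
    """
    if distinctive not in ("pitch", "breathiness"):
        raise ValueError("distinctive must be 'pitch' or 'breathiness'")
    rng = random.Random(seed)
    tight_sd, wide_sd = 0.03, 0.25
    n = n_per_category
    if distinctive == "pitch":
        low = sample_cluster(n, 0.25, tight_sd, 0.40, wide_sd, rng)
        high = sample_cluster(n, 0.75, tight_sd, 0.60, wide_sd, rng)
    else:
        low = sample_cluster(n, 0.40, wide_sd, 0.25, tight_sd, rng)
        high = sample_cluster(n, 0.60, wide_sd, 0.75, tight_sd, rng)
    return low, high
```

Calling `make_stimulus_set("pitch")` gives a set in which pitch alone separates the two categories, while `make_stimulus_set("breathiness")` gives the mirror-image set.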

10 Methods – Synthesized Stimuli. f0 and H1-H2 manipulations in Voice Synthesis (Antoñanzas-Barroso, Kreiman & Garrett, 2006). Normalization using script on Appsobabble.com (Tehrani, 2015). Spliced in Praat (Boersma & Weenink, 2015). Examples: High pitch, Low pitch, Breathy, Modal.

11 Methods - Procedure. Presented using Appsobabble (Tehrani, 2015). Timeline: Part 1 = TRAINING, TRAINING, TRAINING, TEST; Part 2 = TRAINING, TEST. Part 1: 3 blocks of training with training tokens from one of two distributions → 86 trials. Test block: in addition to training tokens, listeners also heard test tokens (white) → 136 trials. Part 2: 1 block of training with training tokens from the other distribution. Test block: training + test tokens.

12 Methods - Procedure. Presented using Appsobabble (Tehrani, 2015). Response screen: A / B, Feedback. Per trial: hear a sound; make a categorization decision; get visual feedback (training tokens: right or wrong category; test tokens: blue triangle). ~30 minutes per participant.

13 Conditions. [Figure: 2 x 2 grid of conditions, P-first and B-first by Naturally Correlated and Unnaturally Correlated, showing the A/B category labels for Part 1 and Part 2 in each cell.] P-first: Part 1 uses distinctive pitch stimuli, Part 2 uses distinctive breathiness stimuli. B-first: Part 1 uses distinctive breathiness stimuli, Part 2 uses distinctive pitch stimuli. In order to do well on Part 2, listeners have to shift weight to the other cue. Naturally correlated conditions, using P-first as an example: listeners learn to use changes in pitch and ignore changes in breathiness to categorize sounds. Listeners are aware of the natural correlation between pitch and breathiness (high pitch = lower breathiness, low pitch = higher breathiness) and will want to transfer this knowledge when exposed to new training stimuli in Part 2. In Part 1 they learned that higher pitch tokens are in category B. When shifting weight onto breathiness, they will want to label less breathy tokens as the same category, because higher pitch naturally corresponds with less breathiness. By the same logic, we arrive at the same labels for the B-first group. Unnaturally correlated conditions: same arbitrary category labels in Part 1; in Part 2, the labels are switched to make the correlation of cues between Part 1 and Part 2 unnatural.

14 Comparisons. [Figure: the same 2 x 2 grid of conditions, P-first and B-first by Naturally Correlated and Unnaturally Correlated, with A/B category labels for Part 1 and Part 2.]

15 Analysis. Exclusion: performance below chance on training stimuli from the test block. Cue weights: 2 sets of cue weights from each participant, one set from each test block (Part 1, Part 2). A cue weight measures how well dimension values (breathiness values and pitch values) predict the likelihood of "A" responses across test trials. Cue weights = coefficients from a binomial logit regression with Breathiness and Pitch as predictors. Absolute values of the coefficients were normalized to sum to one.
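The cue-weight computation described above can be illustrated with a small sketch. It assumes both cue dimensions have been rescaled to a common 0-1 range and fits the logistic regression by plain gradient ascent using only the standard library; the function names, learning rate, and number of steps are hypothetical choices for the example, and the actual analysis may well have used a statistics package instead.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic(trials, responses, lr=0.5, steps=5000):
    """Fit P("A") = sigmoid(b0 + b1*breathiness + b2*pitch) by gradient ascent.

    trials: list of (breathiness, pitch) pairs, assumed scaled to 0-1.
    responses: list of 1 ("A" response) or 0 (other response).
    """
    b = [0.0, 0.0, 0.0]
    n = len(trials)
    for _ in range(steps):
        grad = [0.0, 0.0, 0.0]
        for (breath, pitch), y in zip(trials, responses):
            err = y - sigmoid(b[0] + b[1] * breath + b[2] * pitch)
            grad[0] += err
            grad[1] += err * breath
            grad[2] += err * pitch
        b = [bi + lr * g / n for bi, g in zip(b, grad)]
    return b


def cue_weights(trials, responses):
    """Normalize absolute regression coefficients so the two weights sum to one."""
    _, b_breath, b_pitch = fit_logistic(trials, responses)
    total = abs(b_breath) + abs(b_pitch)
    if total == 0:
        return {"breathiness": 0.5, "pitch": 0.5}
    return {"breathiness": abs(b_breath) / total, "pitch": abs(b_pitch) / total}
```

For a listener whose "A" responses depend only on pitch, `cue_weights` returns a pitch weight close to one and a breathiness weight close to zero.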

16 Results. [Figure: cue weights by condition; panels for Naturally Correlated and Unnaturally Correlated in the P-first (n = 26, n = 22) and B-first (n = 25, n = 23) orders. Yellow = non-distinctive cue; blue = distinctive cue.]

17 Part 1 Results. Check whether initial learning is equal across groups. Plot description: cue weight difference between the distinctive cue and the non-distinctive cue; a greater cue weight difference = better learning. Linear regression: trend for the B-first condition to be better learned. Part 1 learning is included as a covariate in the analysis of Part 2.
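The learning measure plotted here is a simple difference, which the following sketch makes explicit. The function name and the dictionary layout of the weights are assumptions for illustration; the only thing taken from the slide is that learning is the distinctive cue's weight minus the non-distinctive cue's weight.

```python
def learning_score(weights, distinctive_cue):
    """Cue weight of the distinctive cue minus that of the non-distinctive cue.

    weights: dict with keys "pitch" and "breathiness" (hypothetical layout).
    With weights normalized to sum to one, the score lies between -1 and 1,
    and larger values indicate better learning of the distinctive cue.
    """
    if distinctive_cue not in ("pitch", "breathiness"):
        raise ValueError("distinctive_cue must be 'pitch' or 'breathiness'")
    other = "breathiness" if distinctive_cue == "pitch" else "pitch"
    return weights[distinctive_cue] - weights[other]
```

A participant with weights of 0.8 on pitch and 0.2 on breathiness would score about 0.6 in a distinctive-pitch block and about -0.6 in a distinctive-breathiness block.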

18 Part 2 Results. [Figure panels: P-first, B-first.]
Term: Estimate, Std. Error, p-value
(Intercept): 0.36, .20, .07
Part 1 Learning: 0.00, .22, 1.00
Order – P: -0.07, .06, .25
Distinctiveness - Distinctive: 0.28, <.001
Correlation - Unnatural: 0.15, .01
Correlation × Distinctiveness: -0.29, .08
Order × Distinctiveness: 0.13, .09
Main effect of distinctiveness: overall, distinctive cues are weighted higher.
Main effect of correlation: cue weights are higher in the unnatural condition. The non-distinctive cue is not down-weighted as much in this condition as in the natural condition, which indicates greater interference from Part 1.
Interaction, Correlation × Distinctiveness: distinctive cues in the unnatural conditions were weighted lower; better learning in the natural conditions.
Interaction, Order × Distinctiveness (trend): distinctive cues in the P-first condition were weighted higher; better learning in the P-first conditions.
No interaction between Order × Distinctiveness × Correlation.

19 Results Summary. Distributional Learning: yes, the change in distribution caused cue weights to shift. Phonetic Enhancement: yes, cue shift was easier when cues were naturally correlated. Directional Bias: probably; cue shift tended to be easier when pitch was learned first, then breathiness. Enhancement + Bias? A compounding effect of enhancement and directional bias is seen in the unnaturally correlated B-first condition: there is a huge amount of variation, and the means suggest that participants failed to switch cues entirely. This did not come out statistically (no three-way interaction between Distinctiveness × Order × Correlation).

20 The Directional Bias. P → B: easier. B → P: harder. A perceptual account: pitch can be perceived as independent of breathiness; breathiness can't be perceived as independent of pitch. If you learn pitch first, then you don't carry any perceptual baggage when learning breathiness in Part 2. If you learn breathiness first, then in Part 2 you must deal with conflicting information from (a) your knowledge of the natural correlation between the two cues and (b) the unnatural correlation in the stimuli.

21 Where might the bias come from? Intrinsic? Contrast transfer: common: VOT/phonation → pitch; rare: pitch → VOT/phonation. Contrast typology: common: pitch only, pitch + phonation; rare: phonation only. VOT → pitch (Korean, Kammu, Cham, Punjabi, Vietnamese, Chinese). Pitch → phonation (Quiaviní Zapotec).

22 Where does the bias come from? Language experience?
                      Phonation contrast: Yes    Phonation contrast: No
Pitch contrast: Yes   Hmong                      Chinese
Pitch contrast: No    Gujarati                   English

23 "Chinese". [Figure: cue weights by condition; panels for Naturally Correlated and Unnaturally Correlated in the P-first (n = 9, n = 11) and B-first (n = 11, n = 11) orders. Yellow = non-distinctive cue; blue = distinctive cue.]

24 Gujarati. [Figure: cue weights by condition; panels labeled P-first, B-first, Unnaturally Correlated, with n = 3 in each of four panels. Yellow = non-distinctive cue; blue = distinctive cue.]

25 Many thanks =] Committee: Pat Keating, Megha Sundara, Jody Kreiman. Jody Kreiman and Norma Antoñanzas-Barroso for help with Voice Synthesis. Lab Technician: Henry Tehrani. Undergraduate RAs: Jordan Wright, Zhongshi Xu.

26 References:
Abramson, A. S., & Lisker, L. (1985). Relative power of cues: F0 shift versus voice timing. In V. Fromkin (Ed.), Phonetic linguistics: Essays in honor of Peter Ladefoged (pp ). New York: Academic Press.
Bhatia, T. K. (1975). The evolution of tones in Punjabi.
Francis, A. L., Baldwin, K., & Nusbaum, H. C. (2000). Effects of training on attention to acoustic cues. Perception & Psychophysics 62.
Garellek, M., & Keating, P. (2011). The acoustic consequences of phonation and tone interactions in Jalapa Mazatec. Journal of the International Phonetic Association 41(02).
Garellek, M., Keating, P., Esposito, C. M., & Kreiman, J. (2013). Voice quality and tone identification in White Hmong. Journal of the Acoustical Society of America 133(2).
Hillenbrand, J. M., Clark, M. J., & Houde, R. R. (2000). Some effects of duration on vowel recognition. Journal of the Acoustical Society of America 108.
Holt, L. L., & Lotto, A. J. (2006). Cue weighting in auditory categorization: Implications for first and second language acquisition. Journal of the Acoustical Society of America 119(5).
Holt, L. L., Lotto, A. J., & Kluender, K. R. (2001). Influence of fundamental frequency on stop-consonant voicing perception: A case of learned covariation or auditory enhancement? Journal of the Acoustical Society of America 109, 764–774.

27 References:
Jun, S.-A. (1996). Influence of microprosody on macroprosody: A case of phrase-initial strengthening. UCLA Working Papers in Phonetics 92.
Kingston, J. (2011). Tonogenesis. Companion to Phonology. Malden, MA: Wiley-Blackwell.
Kingston, J., & Diehl, R. (1995). Intermediate properties in the perception of distinctive feature values. In B. Connell & A. Arvaniti (Eds.), Phonology and Phonetic Evidence: Papers in Laboratory Phonology IV (pp. 7–27). Cambridge, England: Cambridge University Press.
Kirk, P. L., Ladefoged, P., & Ladefoged, J. (1984). Using a spectrograph for measures of phonation types in a natural language. UCLA Working Papers in Phonetics 59, 102–113.
Kuang, J. (2011). Production and perception of the phonation contrast in Yi. Ms., University of California, Los Angeles.
Silva, D. J. (2006). Acoustic evidence for the emergence of tonal contrast in contemporary Korean. Phonology 23.
Thurgood, G. (1999). From ancient Cham to modern dialects: Two thousand years of language contact and change. Honolulu: University of Hawaii Press.
Thurgood, G. (2002). Vietnamese and tonogenesis. Diachronica 19.
Uchihara, H. (2015). A case of anti-tonogenesis in Quiaviní Zapotec. Presented at Seminario de Lenguas Indígenas, Universidad Nacional Autónoma de México.

28 Thank you