Formal Description of Slavic Languages

Author: Solomon Murphy

1 Formal Description of Slavic Languages (FDSL-10), Leipzig, 5-7 December 2013
Anything goes: Czech initial clusters run against evidence from a dichotic experiment
Laurent Dumercy, Université Nice Sophia Antipolis, CNRS 7320
Frédéric Lavigne, Université Nice Sophia Antipolis, CNRS 7320
Tobias Scheer, Université Nice Sophia Antipolis, CNRS 7320
Markéta Ziková, Masaryk University, Brno

2 Chapter 1 TR-only and anything-goes languages

3 TR-only vs. anything-goes languages
(T = obstruent, R = sonorant)
TR-only languages
- word-initial clusters are restricted to TR (plus s+C)
- English, French, German, Belarusian, Bulgarian etc.
anything-goes languages
- any sonority profile occurs word-initially: TR, TT, RR, RT
- Moroccan Arabic, Berber, Russian, Czech, Polish etc.
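The four sonority profiles can be made concrete with a small sketch. It assumes a deliberately simplified inventory: the sonorant set below is an illustrative assumption, every other segment counts as an obstruent, and digraphs such as Czech "ch" are not handled. The function names are hypothetical.

```python
# Assumption: a simplified sonorant inventory; any other segment is treated as an obstruent.
SONORANTS = set("mnlrj")

def sonority_profile(cluster: str) -> str:
    """Map each segment to T (obstruent) or R (sonorant), e.g. 'pr' -> 'TR'."""
    return "".join("R" if seg in SONORANTS else "T" for seg in cluster)

def allowed_in_tr_only(cluster: str) -> bool:
    """TR-only pattern as stated on the slide: TR clusters, plus s+C."""
    return sonority_profile(cluster) == "TR" or cluster.startswith("s")

for cc in ["pr", "rp", "mr", "tk", "sp"]:
    print(cc, sonority_profile(cc), allowed_in_tr_only(cc))
```

In an anything-goes language all four profiles are admitted word-initially, so no filter comparable to `allowed_in_tr_only` would apply.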

4 TR-only vs. anything-goes languages
but
- TR-only languages instantiate ALL logically possible muta cum liquida clusters (except #tl, #dl)
- while anything-goes languages
  - may instantiate ALL logically possible #CCs: Moroccan Arabic
  - or may exhibit only a subset of these: Greek, all Slavic languages at issue here

5 #RT in Slavic
exhaustive record (Scheer 2007, 2012)
Corpus:

6 #RT in Polish & Czech
- Polish selection: 20 out of 126 possible #RTs (16%)
- Czech selection: 28 out of 108 possible #RTs (26%)
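The two percentages follow directly from the counts on the slide. A minimal check, where the function name `occupancy` is a hypothetical label for the share of logically possible #RTs that actually occur:

```python
def occupancy(attested: int, possible: int) -> int:
    """Share of logically possible clusters that are attested, in whole percent."""
    return round(100 * attested / possible)

print("Polish:", occupancy(20, 126), "%")  # counts taken from the slide
print("Czech:", occupancy(28, 108), "%")
```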

7 prediction
gaps are accidental, not systematic

8 prediction
reason #1: empirical
- in Slavic: anarchic distribution. Neither occurring nor non-occurring #CCs are natural classes or otherwise in complementary distribution
- modern anything-goes languages are merely Common Slavic (CS) minus yers (plus possible repairs): #C1-yer-C2V… > #C1C2V…, where C1 and C2 have random distribution in CS.
  ==> non-TRs are a lexical accident
- Slavic languages may (Belarusian, Bulgarian, Slovak) or may not have reacted against non-#TRs.

9 prediction
reason #2: theoretical
- the initial CV: carrier of the extra-phonological information "beginning of the word"
  - non-diacritic #
  - Lowenstamm (1999), Scheer (2012)
- the initial CV is binary: present or absent

10 prediction
reason #2: theoretical
typological prediction
- there are only two types of languages: TR-only and anything-goes. Nothing in between.

11 prediction
purpose: to test the prediction through evidence from dichotic fusion

12 Chapter 2 Dichotic fusion

13 how it works
subjects are exposed to two distinct stimuli through two distinct perceptual channels:
- audio-visual (McGurk effect)
- audio-audio (right + left ear)
both stimuli are fused in perception:
- McGurk: ga(V) + ba(A) ==> da (percept)
- L+R ear: pay(L) + lay(R) ==> play (percept)
don't trust your brain: the percept may be something that is absent from the stimulus
[Kant: thing (Ding) vs. thing-in-itself (Ding an sich)]
some McGurk literature: McGurk & MacDonald (1976), Massaro (1998), Ingleby & Ali (2003)

14 Cutting (1975)
Cutting (1975): Aspects of Phonological Fusion. Journal of Experimental Psychology 104:
Cutting wonders: why
- banket + lanket ==> blanket, not lbanket?
- gab + lab ==> glab, not lgab?
- pay + lay ==> play, not lpay?
- bed + red ==> bread, not rbed?
but
- tass + tack ==> task / tacks randomly (same subject): Day (1970)

15 Cutting (1975)
Cutting's explanation
- the phonology of English natives eliminates the competitor LBANKET because it has an ill-formed initial cluster.
- hence Czech, Polish or other native speakers of anything-goes languages should not have this bias: for them LBANKET and BLANKET should be equally well-formed.

16 fusion rate is variable, depending on speakers, without any identifiable pattern.
- this is true for all varieties of dichotic fusion: audio-visual, audio-audio etc.
- and also for other ways in which perception is fooled: visual illusions.
different fusion rates for different conditions are used as evidence for mental structures:
- Ingleby & Ali (2003) report that for audio-visual stimuli, short vowels in English have a fusion rate of 67%, coda consonants 60%, onset consonants 48% and long vowels 16%.
Cutting reports relatively low fusion rates:
- 61% for synthesized speech
- 31% for natural speech
but possible (plausible) experimental bias: he used free choice, i.e. "write down whatever you hear".

17 lead time
lead time condition
Cutting tests how far speakers can be driven by their phonology in ignoring the sensory input.
[diagram: two timelines showing the offset between the two channels]
- panel 1: lay (R) leads pay (L) by 50ms
- panel 2: pay (R) leads lay (L) by 50ms

18 lead time
Cutting's findings
lead time zero, -50ms, +50ms
- no effect on fusion rate: the percept of
  - pay (0) + lay (0)
  - pay (+50) + lay (0)
  - pay (0) + lay (+50)
  is play at the same proportion.
- this is true up to ±150ms, given that a single consonant, stop or liquid, lasted <90ms
longer lead times: 200ms, 400ms, 800ms
- fusion rate decreases, but is not zero: 10% at ±800ms.
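A lead time amounts to delaying one ear's signal by a fixed number of samples. The sketch below illustrates this with plain lists of sample values. It is not the procedure used by Cutting or in this study: the function names, the sign convention (positive means the left ear leads) and the sample rate are all assumptions.

```python
def delay_samples(lead_ms: float, sample_rate: int) -> int:
    """Number of samples corresponding to a lead time given in milliseconds."""
    return round(abs(lead_ms) * sample_rate / 1000)

def dichotic_pair(left, right, lead_ms=0, sample_rate=44100):
    """Return (left, right) sample pairs; positive lead_ms means the left ear leads."""
    pad = [0.0] * delay_samples(lead_ms, sample_rate)
    left, right = list(left), list(right)
    if lead_ms > 0:
        right = pad + right   # delay the right ear
    elif lead_ms < 0:
        left = pad + left     # delay the left ear
    n = max(len(left), len(right))
    left += [0.0] * (n - len(left))     # pad with silence to equal length
    right += [0.0] * (n - len(right))
    return list(zip(left, right))

# Toy signals standing in for "pay" (left ear) and "lay" (right ear).
stereo = dichotic_pair([1.0, 1.0], [2.0, 2.0], lead_ms=50, sample_rate=100)
print(len(stereo), stereo[0], stereo[5])
```

With a 50ms lead at an assumed rate of 44100 samples per second, the lagging channel starts 2205 samples late.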

19 prediction
- if gaps are accidental, grammar does not object to non-existing #CCs and Czechs etc. should treat them just like existing #CCs.
- if gaps are systematic, grammar objects to non-existing #CCs and Czechs etc. should treat both sets differently.
hence, if gaps are accidental, Czechs should stubbornly follow lead time with both
- p… + l… ==> pl… / lp…   (#lp exists in Czech)
- p… + r… ==> pr… / rp…   (#rp does not exist in Czech)
==> whatever reaches the auditory system first should be first in the percept.
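The two hypotheses can be contrasted in a toy model. The accidental-gap branch implements what the slide states: the percept follows lead time. The systematic-gap branch is only one possible reading of "treat both sets differently": reversing the order when the lead-time order yields a non-existing cluster is an assumption made for illustration. The cluster set contains just the clusters named on the slide.

```python
# Clusters named on the slide as existing in Czech; #rp is the non-existing one.
EXISTING_CLUSTERS = {"pl", "lp", "pr"}

def predicted_cluster(leading: str, lagging: str, gaps_are_accidental: bool) -> str:
    """Predict the initial cluster of the percept from the order of arrival."""
    by_lead_time = leading + lagging
    if gaps_are_accidental:
        return by_lead_time            # grammar does not object: follow lead time
    if by_lead_time in EXISTING_CLUSTERS:
        return by_lead_time
    return lagging + leading           # assumed repair under systematic gaps

print(predicted_cluster("r", "p", gaps_are_accidental=True))
print(predicted_cluster("r", "p", gaps_are_accidental=False))
```

Only inputs where the leading consonant would create a non-existing cluster (here r… before p…) distinguish the two hypotheses.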

20 Two fundamental differences with Cutting's setup
Cutting only tests "fusible" items, i.e. those whose target cluster is well-formed in English:
- pay + lay ==> play / lpay   fusible: pl is well-formed
- pay + day ==> pday / dpay   non-fusible: neither #CC is ok
Cutting only tests existing words:
- both source words exist
- one of the target words exists, the other does not
==> no nonce-words are tested

21 Two fundamental differences with Cutting's setup
Cutting is limited by English:
- he cannot test stop+liquid clusters that are well-formed but do not occur: there are none (ALL logically possible TRs occur)
when RTs are well-formed as in Slavic,
- there are no non-fusible items: ALL items are fusible
- there are RTs that do not occur (as opposed to TRs in English, which all occur): e.g. #rp in Czech.

22 Chapter 3 Experimental setup

23 conditions
five conditions
1. source words: exist, don't exist
2. target words I: word exists or not
3. target words II: #CC exists or not
4. type of cluster: TL, LT, RR, TT
5. lead time: simultaneous, 50ms lead for source word 1, 50ms lead for source word 2
Cond. source words
- may both exist: máz - ráz ==> mráz / *rmáz
- may both not exist: rousit - dousit ==> rdousit / *drousit

24 conditions
Cond. target words I + II
- the word exists or not
- the #CC exists or not
#CC target 1 | #CC target 2 | target 1 is a word | source      | target 1 | target 2
pr           | *rp          | yes                | pak - rak   | prak     | *rpak
pl           | lp           | yes                | pes - les   | ples     | *lpes
kr           | *rk          | no                 | kýč - rýč   | *krýč    | *rkýč
tr           | rt           | no                 | touk - rouk | *trouk   | *rtouk
the major variation (the one that will play the most important role in the results) is binary:
- target 2 never exists, but
- target 1 may or may not exist

25 conditions
Cond. #CCs
  | source      | TL   | LT    | RR          | TT
1 | vak - rak   | vrak | *rvak |             |
2 | mít - lít   |      |       | mlít, *lmít |
3 | kepa - tepa |      |       |             | *ktepa, *tkepa

26 conditions
Cond. lead time
pes - les ==> ples / *lpes
- offset simultaneous
- 50ms lead pes
- 50ms lead les

27 what we measure
recall that in Cutting's setup there is never any competition between the two target words: the only pattern tested is
- stop T + liquid L ==> TR…
- fusion rate measured
- because a #TR item can never compete with an #RT item in English
in Slavic (Czech) it can.
consequence: we are not measuring the same thing: rather than fusion rates, we evaluate the competition between the two target words, i.e. the likelihood of one being chosen over the other given
1. the lexical effect (word and #CC)
2. the lead time effect
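One way to quantify this competition is the share of target 1 responses among the trials where a participant chose either target. This is an illustrative sketch, not the analysis reported in the talk: the response labels, the function name and the decision to leave "none of these" responses out of the denominator are assumptions.

```python
from collections import Counter

def target1_preference(responses):
    """Proportion of 'target1' among decided responses ('target1' or 'target2').

    Returns None when no trial produced a decision between the two targets.
    """
    counts = Counter(responses)
    decided = counts["target1"] + counts["target2"]
    if decided == 0:
        return None
    return counts["target1"] / decided

# Hypothetical responses for one item in one lead-time condition.
trial_responses = ["target1", "target1", "target2", "none"]
print(target1_preference(trial_responses))
```

A value near 0.5 would indicate no preference between the two targets, and values near 1 or 0 a preference for target 1 or target 2 respectively.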

28 initial test set
columns: TL, LT, RR, TT; rows: source words exist / don't exist; cells: target 1 (intended), target 2
source words exist: pr *rp lp pl mr rm tk kt rv vr mn *nm kv vk lv vl ml *lm db bd dl *ld rt tr tl *lt lh hl gd *dg kr *rk ln *nl *řd ld pk *kp šl pv vp rf fr *rn *nr kch *chk
source words don't exist: kl lk rd dr pt *tp cp *pc bl *lb tch cht *ňm dv vd žl rk tp rb br lch chl žh

29 resistance against fusion
pre-test by a native speaker: do you hear a cluster? (simultaneous condition)
[table of the initial test set repeated, with the items for which the answer was "no" marked in red]
red: no. ==> echo effect
this is regular: fusion rate
12 items lost
input to the experiment: 54

30 resistance against fusion
recall Cutting's fusion rates:
- 61% for synthesized speech
- 31% for natural speech
our own fusion rate depends dramatically on sound characteristics. Cutting talks about a "dirty signal".
we have tested 3 sound types (of the same 66 items), but only with one native speaker:
- without normalization: fusion rate 45.5% (30 out of 66)
- normalization 1 (proportional: stronger at 1, weaker remains weaker according to the input proportion): fusion rate 9% (6 out of 66)
- normalization 2 (both stronger and weaker at 1, input difference leveled out): fusion rate 63.6% (42 out of 66)
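The pre-test figures can be recomputed from the raw counts, together with the item bookkeeping given on the neighbouring slides (12 items lost, 54 items entering the experiment). The variable and function names are hypothetical; the counts are those stated in the slides.

```python
TOTAL_ITEMS = 66
fused = {"no normalization": 30, "normalization 1": 6, "normalization 2": 42}

def fusion_rate(fused_items: int, total: int = TOTAL_ITEMS) -> float:
    """Fusion rate in percent, rounded to one decimal."""
    return round(100 * fused_items / total, 1)

rates = {name: fusion_rate(n) for name, n in fused.items()}

# Experiment input: all fusing items of normalization 2, plus the 12 slots
# that could be filled from the two other sound types.
experiment_input = fused["normalization 2"] + 12
lost = TOTAL_ITEMS - experiment_input

print(rates)
print(experiment_input, lost)
```

Rounded to one decimal, the three rates come out at 45.5, 9.1 and 63.6 percent.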

31 resistance against fusion
after this pre-test, the input to our experiment was made of
- all fusing items of norm. 2 (42)
- plus all missing slots that could be filled by the two other sound types (12)
possible, in fact plausible experimental bias
- Cutting does free choice: "write down whatever you hear"
- we do forced choice (written on screen): target 1, target 2, none of these

32 experimental protocol
participants (n=24)
- were asked to look at a screen
- to listen to a sound
- upon sound onset, three choices appear on the screen
  - to the left: target word 1 (in Czech spelling)
  - to the right: target word 2 (in Czech spelling)
  - centred: "none of these" (in Czech spelling)
- to press a button corresponding to the closest match of what they've heard
other experimental conditions
- participants were asked to respond as accurately as possible
- no constraints on response time
- stimulus cannot be repeated

33 Chapter 4 Results

34 lexical effect
existing words
when two target words compete whereby
- one is an existing word
- and the other is not
==> participants give preference to the existing target
existing clusters
when two target words compete whereby
- word 1 may or may not exist
- word 2 does not exist
==> participants choose word 2 more often if it has an existing cluster (as opposed to when it does not)
interpretation
the fact that X (word or cluster) is stored in long-term memory enhances its chances to be perceived.

35 lexical effect
what are we probing?
- for words we know that the effect is lexical
- for existing vs. non-existing #CCs, it is either lexical or phonological.
- difficult to tease apart, but we will try…
recall that this is precisely what the prediction is about: grammar (i.e. phonology, *not* the lexicon) does not object to any #CC.
interpretation
- given the precedent of the word-lexical effect, our working hypothesis is that the cluster effect is also lexical. Hence:
- the fact that X (word or cluster) is stored in long-term memory enhances its chances to be perceived.

36 lexical effect (word #1)
plot 1. column: t-test, plots: anova

37 lexical effect (word #2)
plot 2. anova

38 lexical effect (cluster #1)
plot 3. anova

39 lexical effect (cluster #2)
plot 4. column: t-test, plots: anova

40 lexical effect
Cutting hints at this effect, but misinterprets it as a "semantic" bias.
he reports that Day (1969) has found that "fusion rates were higher when the fused outcomes were real words than when they were nonwords" (p.112),
and then designs an experiment where "semantic" cues are provided from a phrasal context:
- The trumpeter ____s for us
- to be filled in with pay+lay
==> quite unsurprisingly, Cutting found that fusion rate was significantly higher in the sentence condition than when words were fused in isolation.

41 lexical effect
dramatic consequences
Cutting thought he had probed a phonological bias (#lp is ill-formed phonologically), but in fact what he did probe was a lexical bias (lpay does not exist in the lexicon).
the title of his paper is not really accurate:
- Aspects of Phonological Fusion
- the fusion in itself is phonological
- but the mechanism driving the fusion in his data is not.

42 lead time effect
plot 5. anova

43 lead time effect: zoom on a possible #CC effect
plot 6. t-test: tendency

44 conclusion
Czechs are well behaved: when the lexical (word) effect is eliminated,
- they stubbornly follow lead time (plot 5)
- irrespective of whether the cluster exists or not (plot 6)
==> anything goes, all clusters have the same grammatical status
Cutting's paper is called "Aspects of Phonological Fusion". Alas, he missed out on the lexical effect: what he measured was the action of the lexicon, not of phonology. There is a phonological effect, but in order for it to emerge, its lexical competitor needs to be eliminated.
So we know that Czechs are well behaved, but we've lost our control group: English natives. We don't know how they behave when exposed to nonce-words. Prediction: they will favour #CCs with rising sonority slope.

45 references
Cutting, James E. 1975. Aspects of Phonological Fusion. Journal of Experimental Psychology 104:
Day, R. S. 1969. Fusion in dichotic listening. Ph.D. dissertation, Stanford University.
Day, R. S. 1970. Temporal-order perception of a reversible phoneme cluster. Journal of the Acoustical Society of America 48: 95.
Ingleby, Michael & Azra Ali 2003. Phonological Primes and McGurk Fusion. Proceedings of the 15th International Congress of Phonetic Sciences:
Lowenstamm, Jean 1999. The beginning of the word. Phonologica 1996, edited by John Rennison & Klaus Kühnhammer, La Hague: Holland Academic Graphics.
Massaro, Dominic W. 1998. Perceiving talking faces. From speech perception to a behavioural principle. Cambridge, Mass.: Bradford/MIT Press.
McGurk, H. & J. MacDonald 1976. Hearing Lips and Seeing Voices. Nature 264:
Scheer, Tobias 2007. On the Status of Word-Initial Clusters in Slavic (And Elsewhere). Annual Workshop on Formal Approaches to Slavic Linguistics. The Toronto Meeting 2006, edited by Richard Compton, Magdalena Goledzinowska & Ulyana Savchenko, Ann Arbor: Michigan Slavic Publications.
Scheer, Tobias 2012. Direct Interface and One-Channel Translation. A Non-Diacritic Theory of the Morphosyntax-Phonology Interface. Vol.2 of A Lateral Theory of phonology. Berlin: de Gruyter.