Lexical Frequency and Linguistic Variation

Author: Nathan Cain

1 Lexical Frequency and Linguistic Variation
Gregory R. Guy
Pennsylvania State University
8 November 2013

2 Issues, order of presentation
Theories and models: Bybee, Pierrehumbert; Exemplar Theory vs. conventional phonology
Does frequency significantly affect phonology?
Are frequency effects continuous, discrete, linear?
Are frequency effects independent and orthogonal to other constraints, or do they interact?
Does frequency affect morphological variation?
Does frequency affect syntactic variation?
Negative evidence: when does frequency fail?

3 Other problems
The data problem: the Zipf’s law curve. Few tokens of low-frequency items, by definition.
Theoretical issues: how to incorporate frequency in abstract representations, independence of operations, entrenchment vs. change
Exemplars: remembering the whole cloud
Anti-frequency effects of generalization

4 The data problem: Zipf’s Law
Lexical frequency follows a power law distribution: there are very few words with very high frequencies, and many words with very low frequencies.
e.g., in the million-word Brown corpus, the 135 most common words account for half the data
Hence, for statistical analysis of individual low-frequency items, you need to process LOTS of data
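A minimal sketch of how the "few word types cover half the tokens" observation could be checked on any tokenized text. The function names and the toy sentence are illustrative assumptions, not part of the talk; a real check would run on a corpus such as Brown.

```python
from collections import Counter


def rank_frequency(tokens):
    """Return (word, count) pairs sorted from most to least frequent."""
    return Counter(tokens).most_common()


def types_needed_for_share(tokens, share=0.5):
    """Smallest number of top-ranked word types that covers `share` of all tokens."""
    ranked = rank_frequency(tokens)
    total = sum(count for _, count in ranked)
    covered = 0
    for n_types, (_, count) in enumerate(ranked, start=1):
        covered += count
        if covered >= share * total:
            return n_types
    return len(ranked)


# Hypothetical toy text, not corpus data.
toy_tokens = "the cat saw the dog and the dog saw the bird".split()
ranked = rank_frequency(toy_tokens)
half = types_needed_for_share(toy_tokens, 0.5)
print(ranked[0], half)
```

On a Zipfian corpus the second number stays very small relative to the vocabulary size, which is why per-word statistics for rare items need so much data.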

5 Zipf’s law: Text frequency in Moby Dick

6 Part I. Does lexical frequency affect phonological processes?

7 Theoretical claims: Exemplar Theory, Usage-based Phonology
Bybee, Pierrehumbert, Hay, etc.
Words have very rich representations, incorporating information about frequency and contexts of occurrence
Frequency is postulated to condition variation and drive certain phonological changes (e.g., lenition)

8 Bybee on lexical frequency effects (Bybee 2000)
“The more a word is used, the more it is exposed to the reductive effect of articulatory automation…”
“Sound change affects stored representations incrementally each time a word is used…”

9 Phonological variables involving “reduction”
English: final coronal stop (t/d) deletion
variable in all dialects
constrained by linguistic context (preceding and following context, morphology)
{t,d} -> Ø / C__##
e.g. ol’ man, eas’ side

10 Spanish: final /s/ lenition (‘aspiration and deletion’)
Articulation: reduction in extent and duration of lingual constriction
Acoustic properties: lowered frequency distribution, shorter duration
variable in some dialects (e.g. Caribbean, Argentine, Andalusian)
[s] -> {h,Ø} / __##
e.g. menos~menoh~meno

11 Does frequency have significant effects on variation?
Yes! e.g., studies of English -t,d deletion: Myers & Guy 1997; Guy, Hay & Walker 2008
And No! e.g.: Erker 2008 (Caribbean Spanish -s lenition)

12 Is frequency significant? Yes
-t,d deletion (Myers & Guy 1997), monomorphemic words*
Table columns: N, Deletions, % Del; rows: Low frequency, High frequency (cell values are missing here)
*p < .01
Obs: a lenition process, promoted by higher lexical frequency, per Bybee.

13 Guy, Hay & Walker 2008: t,d deletion. p=.0005

14 Is frequency significant? No!
-t,d deletion (Myers & Guy 1997), regular past tense verbs
Table columns: N, Deletions, % Del; rows: Low frequency, High frequency (cell values are missing here)
chi-sq (1 df) = .073, p > .70

15 Is frequency significant? No!
Spanish -s lenition (Erker 2008)
Spectral center of gravity: p = .74 (N and r are missing here)
Duration*: p = .136 (N and r are missing here)
*n.s. overall, but see below

16 Spanish -s lenition: Spectral center of gravity by frequency (Erker 2008)
p = .74

17 What’s the right measure of frequency?
Published counts for large corpora:
Myers & Guy: Francis & Kucera -- significant
Guy, Hay, Walker: CELEX -- not significant
Erker 2008: Davies -- not significant
Local corpus counts:
Guy, Hay, Walker: ONZE corpus -- significant
Erker 2008: Interview frequency -- not significant

18 The right stuff: Lexical frequency in the ONZE corpus (Guy, Hay & Walker, 2008)
“Log frequency of word, as produced by these speakers, has a significant effect on -t,d deletion (p=.0005). Higher frequency words are more reduced, [per Hooper 1976, Bybee 2000, etc.]”
“Substituting Log CELEX frequency yields a non-significant effect (p=.13). Frequencies from CELEX, based on data from a different dialect of English collected many years later, are not significant predictors of deletion, even though they correlate well with local frequency measures.”

19 Sometimes there is no right measure
Spanish -s lenition (Erker 2008); r values are missing here
Spectral center of gravity: Davies freq, p = .740; IV (interview) freq, p = .606
Duration: Davies freq, p = .136; IV (interview) freq, p = .244

20 What kind of effect?
Continuous, log-linear (Guy, Hay & Walker)
Discrete (Myers & Guy): data set partitioned at a K&F count of 35 per million words
Threshold effect? (Erker, duration)

21 Continuous, linear effect (Guy, Hay & Walker 2008)

22 Discrete effect: data partitioned into high and low freq. sets (Myers & Guy)
-t,d deletion: monomorphemic words*
Table columns: N, Deletions, % Del; rows: Low frequency, High frequency (cell values are missing here)
*p < .01

23 A threshold effect: Spanish -s lenition: duration (Erker 2008)
In a continuous treatment, lexical frequency is not significantly correlated with duration: p = .136 (duration is not significantly shorter in more frequent words)
But in a discretely partitioned data set, frequency is significant (high frequency words are shorter than low frequency words)

24 Spanish -s lenition: Duration by lexical frequency. (Erker 2008)

25 Threshold effect: -s lenition (Erker)
Frequency is significant in the discretely partitioned data set
High frequency (rank < 250): mean duration = 41 ms
Low frequency (rank > 250): mean duration = 55 ms
(N for each group is missing here)
t-test: p = .002
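The two treatments of frequency contrasted above (continuous vs. partitioned at a rank cutoff) can be sketched as follows. All data values here are invented for illustration, chosen only so that the group means echo the 41 ms and 55 ms on the slide; they are not Erker's data, and no significance test is run. Assigning a rank of exactly 250 to the low group is also an assumption, since the slide leaves that case open.

```python
from math import log, sqrt
from statistics import mean


def pearson_r(xs, ys):
    """Pearson correlation, the continuous treatment of frequency."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def partition_means(ranks, durations, cutoff=250):
    """Discrete treatment: mean duration above and below a frequency-rank cutoff."""
    high = [d for r, d in zip(ranks, durations) if r < cutoff]   # frequent words
    low = [d for r, d in zip(ranks, durations) if r >= cutoff]   # infrequent words
    return mean(high), mean(low)


# Hypothetical (frequency rank, /s/ duration in ms) observations.
ranks = [5, 40, 120, 240, 300, 900, 2500, 6000]
durations = [42, 39, 43, 40, 56, 53, 57, 54]

r = pearson_r([log(x) for x in ranks], durations)
high_mean, low_mean = partition_means(ranks, durations)
print(round(r, 2), high_mean, low_mean)
```

With a step-like pattern such as this, the partitioned comparison isolates the jump at the cutoff, while a correlation coefficient spreads it across the whole frequency range.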

26 Do frequency effects extend beyond phonology?
Bybee’s focus is mainly phonological; her mechanism for advancing lenition by frequent repetition applies to articulation.
But other evidence suggests frequency effects at other levels (e.g. morphology, syntax, lexical semantics)

27 Part II. Does frequency affect morphology?
Does lexical frequency interact with morphological constraints on phonological processes?
Does it affect morphological variation?

28 Are frequency effects constant, orthogonal to morphological constraints?
Bybee’s model of lenition fed by frequency is phonologically motivated
It suggests a systematic preference for lenition/deletion in higher frequency forms
This should be independent of and orthogonal to morphological constraints (and others, e.g. syntax, discourse)

29 But other models make other predictions, e.g.…
Pinker (inter alia) argues that regular derived forms are generated by rule; only roots and irregular forms are stored
Hence, frequency effects on regular derived forms shouldn’t occur, because they have no independent lexical representations to accumulate exemplars or collocational information

30 Frequency interacts with morphology: -t,d deletion (Myers & Guy)

31 Fruehwald on lexical frequency and -t,d deletion
Multivariate analysis of -t,d deletion in the Buckeye corpus, including a treatment of frequency
Like Myers & Guy, this study shows an interaction between frequency and morphological class: the morphology effect is weak or neutralized at low frequencies, manifest at high frequencies.

32 -t,d morphology: Fruehwald (probabilities of /t,d/ retention)

33 Does frequency affect morphological variation?
Morphological variation: choices among morphological alternants in a language
LaFave and Guy (2011) look at frequency constraints on adjective gradation in English

34 LaFave & Guy: Adjective gradation in English
English has two morphological options for making comparative and superlative adjectives:
Synthetic: great, greater, greatest; happy, happier, happiest
Analytic: great, more great, most great; happy, more happy, most happy
(nb: trisyllabic or longer roots have only analytic forms: *importantest)

35 Adjective gradation is socially and linguistically constrained
Lower status speakers use more synthetic/inflected forms
Shorter words favor synthetic forms
Does lexical frequency have an effect on this choice?

36 High Frequency Roots (mono- and di-syllables)
good 185, old 51, big 41, easy 41, high 37, young 23, close 19, hard 17, early 16, large 14, small 14, cheap 13, long 12, low 12, great 11, late 10
Fourteen of these seventeen stems are monosyllabic, two (easy and early) are disyllabic and only one (important) is trisyllabic. Only the trisyllabic stem has a prescriptively analytic form.

37 Constraints on adjective gradation (Goldvarb factor weights)
Probability of producing the synthetic/inflected variant
Factor group          Factor                   N      Weight
Degree                Comparative              527    .420
                      Superlative              248    .665
Number of syllables   One                      589    .640
                      Two                      186    .139
Frequency             High                     516    .863
                      Low                      259    .025
Education             High School (or less)    79     .654
                      Undergraduate            486    .565
                      Graduate                 96     .357
                      Unknown                  114    .256
Input = 0.992
Knockouts for education and number of syllables:
Less than high school collapsed into High School (no analytic forms)
3-, 4-, and 5-syllable words collapsed into 2+ syllables (no synthetic forms)
Non-significant factor groups: Channel (Style), Age, Sex, Dialect Region, Ethnicity
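As a reading aid for the factor weights above, here is a minimal sketch of how weights and the input value combine into a predicted probability. It assumes the usual Varbrul logistic model, in which the log-odds of the input and of each applicable factor weight are summed; the slide itself does not state the formula, and the function names are illustrative.

```python
from math import exp, log


def logit(p):
    """Log-odds of a probability strictly between 0 and 1."""
    return log(p / (1 - p))


def inv_logit(x):
    """Inverse of logit: map log-odds back to a probability."""
    return 1 / (1 + exp(-x))


def varbrul_probability(input_prob, weights):
    """Combine the input value with one factor weight per factor group."""
    return inv_logit(logit(input_prob) + sum(logit(w) for w in weights))


# Comparative, one syllable, undergraduate; only the frequency weight differs.
p_high = varbrul_probability(0.992, [0.420, 0.640, 0.863, 0.565])
p_low = varbrul_probability(0.992, [0.420, 0.640, 0.025, 0.565])
print(p_high, p_low)
```

A weight of .5 leaves the input unchanged, weights above .5 favor the synthetic variant, and weights below .5 disfavor it, so the frequency weights of .863 and .025 sit on opposite sides of neutral.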

38 Morphology and lexical frequency: conclusions
Morphological constraints on phonology interact with lexical frequency: derived forms are less affected by frequency
Lexical frequency can affect choice among morphological alternatives: higher frequency favors selection of marked alternatives; lower frequency favors use of the most general alternative

39 Part III. Does frequency affect syntactic variation?
Can lexical frequency affect syntactic variation, i.e. selection among syntactic alternatives?
A test case: Spanish pro-drop
Spanish is a pro-drop language; subject pronoun expression is optional

40 Variable pro-drop in Spanish
Spanish speakers alternate between overt subject personal pronouns (SPPs) and zeros:
Él habla español vs. (Ø) Habla español.
Yo vengo mañana vs. (Ø) Vengo mañana.
This variable is well studied and known to be constrained by many linguistic factors, with some dialectal differences

41 Constraints on Spanish pro-drop
Properties of verb: paradigm regularity, tense/mood/aspect, person/number, verbal semantics
Discourse properties: switch reference vs. continuity of reference

43 The research question -- Erker & Guy 2012
If properties of the verb (like its person/number form or paradigmatic regularity) determine whether or not it has an overt subject pronoun, then…
Does the lexical frequency of the verb also affect pro-drop?

44 Frequency effect on Spanish SPPs
Individual verbs could be stored in the exemplar cloud with collocational information about whether the associated subject pronoun was expressed or not
Results: in 4,000+ verbs from the Otheguy/Zentella NYC Spanish corpus, a main effect of frequency is found… but…

45 Erker & Guy, Spanish subject pronouns: Frequency and morphological regularity
[Figure legend: regular verbs vs. irregular verbs]

46 Frequency and tense-mood-aspect

47 Frequency and verbal semantics

48 Frequency and switch-reference

49 Main effect of each constraint in infrequent and frequent forms, and interaction with frequency
Constraint     Infrequent forms    Frequent forms     Interaction w/ frequency
Regularity     No (p = .73)        Yes (p = .001)     Yes (p = .001)
Semantic Cl.   No (p = .38)        Yes (p = .001)     Yes (p = .001)
Person/Num.    No (p = .47)        Yes (p = .001)     Yes (p = .001)
TMA            Yes (p = .001)      Yes (p = .006)     Near sig. (p < .08)
Switch Ref.    Yes (p = .001)      Yes (p = .001)     Near sig. (p < .06)

50 Interactions: summary
In all cases, a constraint effect is greater in more frequent forms
In some cases, the constraint effect is neutralized (non-significant) in infrequent forms
The effect of higher frequency is not constant: in some contexts, higher frequency favors more overt SPPs; in others it favors fewer.

51 Interactions: summary
Conjecture: frequency interacts strongly with features local to the lexical item:
-t,d deletion: morphology, derivational history
Spanish pronouns: regularity, semantics
Frequency interacts weakly with constraints external to the lexical item:
Discourse level: switch reference
Paradigmatic: tense/mood/aspect

52 How does frequency operate?
Hypothesis: Speakers require some minimal level of exposure to a lexical item to formulate hypotheses about it, or to identify patterns it participates in.
Hence, high frequency lexical items can be associated with collocational information (e.g. whether a verb co-occurs with an overt subject pronoun)

53 Corollary: Interaction
Speakers can formulate analogical generalizations across higher frequency forms, e.g. regarding paradigmatic regularity, person/number forms, etc.
But lower frequency forms get the ‘plain vanilla’ treatment, with SPPs inserted at the unmarked average rate, undifferentiated by lexical identity or structural properties
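One way to make the hypothesis and its corollary concrete is a toy model in which a verb gets its own pronoun rate only once it has been observed often enough, and otherwise falls back on the pooled average. This is a sketch under stated assumptions: the threshold value, the verbs, and all counts below are hypothetical, and the talk does not commit to this particular formalization.

```python
def estimated_spp_rates(observations, threshold):
    """Per-verb overt-pronoun rates under a simple exposure threshold.

    observations maps a verb to (tokens with overt pronoun, total tokens).
    Verbs with fewer than `threshold` tokens receive the pooled average rate.
    """
    total_overt = sum(overt for overt, _ in observations.values())
    total_tokens = sum(n for _, n in observations.values())
    pooled = total_overt / total_tokens
    return {
        verb: (overt / n if n >= threshold else pooled)
        for verb, (overt, n) in observations.items()
    }


# Hypothetical counts: two frequent verbs and one rare verb.
counts = {"ser": (30, 100), "creer": (45, 60), "bailar": (1, 4)}
rates = estimated_spp_rates(counts, threshold=50)
print(rates)
```

In this toy version the frequent verbs diverge from one another, while the rare verb is treated at the average rate regardless of its own four tokens.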

54 Part IV. Some negative evidence: When frequency (and exemplar models) fail

55 Jesse goes to Australia
An American English-speaking child moved to Australia at the age of 1 year, 10.5 months
Evidence from input: Australian English has a regular aspirated /t/ intervocalically where American English has a flap
e.g., water, little, pretty: [lɪɾl] vs. [lɪtʰl]

56 Jesse’s output after 10 weeks in Australian daycare
All intervocalic post-tonic obstruents become voiceless!
Coronal stops: water, pretty, but also: daddy > datty, cuddle > cuttle
Noncoronal stops: doggie > dockie, table > tapu, bobble > bapu, baby buggy > bapy bucky
Fricatives: fuzzy bear > fussy bear; driver, driving > drifer, drifing

57 Note about the evidence
No exemplars in the target dialect for devoicing other than /t/ tokens
No exemplars in the native dialect for any devoicing!
Massive counterevidence in both native and target dialect AGAINST his rule
Frequency goes massively the wrong way
No explanation by linguistic immaturity or markedness

58 Jesse’s devoicing rule
C[-son] --> [-voice] / V ___ V[-stress]
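The rule can be sketched as a small function over segment lists. The transcription scheme here is a simplifying assumption made for illustration: one symbol per segment, a five-vowel inventory plus schwa, and a leading ˈ marking a stressed vowel. Flaps and syllabic consonants are left out.

```python
# Voiced obstruents and their voiceless counterparts (simplified inventory).
DEVOICE = {"b": "p", "d": "t", "g": "k", "v": "f", "z": "s"}
VOWELS = {"a", "e", "i", "o", "u", "ə"}


def is_vowel(segment):
    """True for a vowel symbol, with or without a stress mark."""
    return segment.lstrip("ˈ") in VOWELS


def is_stressed(segment):
    """True if the segment carries the stress mark."""
    return segment.startswith("ˈ")


def devoice(segments):
    """Devoice an obstruent between a vowel and a following unstressed vowel."""
    out = list(segments)
    for i in range(1, len(segments) - 1):
        prev, seg, nxt = segments[i - 1], segments[i], segments[i + 1]
        if seg in DEVOICE and is_vowel(prev) and is_vowel(nxt) and not is_stressed(nxt):
            out[i] = DEVOICE[seg]
    return out


daddy = devoice(["d", "ˈa", "d", "i"])   # medial d is intervocalic, before unstressed i
fuzzy = devoice(["f", "ˈu", "z", "i"])   # fricatives are affected too
ago = devoice(["ə", "g", "ˈo"])          # following vowel is stressed: no change
print(daddy, fuzzy, ago)
```

The point of the sketch is that a single statement over natural classes generates forms like datty and fussy, none of which occur in the input from either dialect.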

59 Repairing the overshoot
Jesse pares away incorrect devoicings by (abstract) natural classes: fricatives, velar stops, labial stops
Persistent difficulty identifying which American English flaps matched Australian English voiceless stops (e.g., continued use of ‘datty’), despite frequent counterexamples

60 Conclusion
Jesse’s evidence suggests he has abstract underlying representations (URs), one per lexical item, and performs abstract phonological operations on these URs.
This evidence is inconsistent with exemplar theory, i.e., a word-by-word frequency-driven model

61 Morphological acquisition in coronal stop deletion
Guy & Boyd find age-graded treatments of semi-weak past tense forms in coronal stop deletion (e.g., left, kept, told)
Most favorable category for deletion in young children
Deleted at the rate of monomorphemes for adolescents and young adults
In middle age, most speakers move to a conservative treatment, suggesting a ‘derivational’ morphological analysis

63 Therefore, children are unable to match parental inputs, regardless of frequency
If at a given stage of acquisition children cannot formulate ‘derivational’ morphological structures, they will deviate from adult usage no matter how frequently they hear such forms

64 -t,d deletion probabilities for Curt and Kay C., King of Prussia
[Figure; word classes shown: fist, hand, cold / lost, kept, told / tossed, rolled]

65 Probability matching of David C., 7 years old, King of Prussia PA
[Figure; word classes shown: fist, hand, cold / lost, kept, told / tossed, rolled]

66 Probability matching of 16 children, 3-5 years old, So. Philadelphia
Source: Figure 7.4, Roberts 1994
[Figure; word classes shown: fist, hand, cold / lost, kept, told / tossed, rolled]

67 General Conclusions
Frequency has some significant effects, in morphosyntax as well as phonology
But sometimes it has no effect, or fails completely
It’s not a general and unitary phenomenon; it shows interactions
Threshold effects: speakers need some minimum number of instances of a word to formulate lexically-linked constraints, collocations, etc. Otherwise they treat lexical items the same

68 Therefore…
(Some) lexical representations must incorporate (some) frequency information
Conventional abstract lexical representations (e.g. generative phonology) are too impoverished to adequately account for frequency information
Richer representations are needed, including information about collocations

69 But do the facts require full exemplar clouds in memory?
This may be overkill:
Fine frequency gradations are not usually evident in the data
Exemplar/frequentist models sometimes make completely wrong predictions
Abstract representations and generalized operations are required

70 How frequency effects work
Where frequency effects are evident, high frequency appears to favor:
lenition over retention
marked over unmarked
lexically specific over general
‘exceptions’ over ‘rules’
In other words, lexical frequency resembles the ‘elsewhere’ condition

71 If you have a lot of information about a word, from hearing and using it often, you can formulate lexically-specific hypotheses about that word and store them in your mental representation
Failing that, you treat words using general processes.
High lexical frequency ENABLES but does not REQUIRE specific linguistic outcomes.

72 So,
Sometimes the frequency magic works,
And sometimes it doesn’t,
It’s only part of the story.
Thank you, Thank you… Frequently