1 Associative Anaphora Resolution: A Web-Based Approach
By Razvan Bunescu, University of Texas at Austin
Presenter: Thomas Rodenhausen
2 Intro
Problem: Definite descriptions in unrestricted text
Goal: Recover antecedents in two types of anaphoric relations
- Identity anaphora: "Fred was discussing an interesting book in his class. I went to discuss the book with him afterwards."
- Associative anaphora: "Bill found himself in the middle of a forest. The trees were tall and sturdy."
Modeled as (trigger, association) pairs.
3 Related Work
- Using WordNet to account for human knowledge when establishing relations
- The semantic priming effect of (trigger, association) pairs can be detected using a lexical clustering algorithm
- Lexico-syntactic patterns to mine lexical associative axioms from a corpus of articles
- This work uses lexico-syntactic patterns on the WWW
4 Method
Pattern: [Verb] in {is/are, was/were, has/have, had, may, might, can, could, should, would}
AltaVista search engine utilized
Intuition: Ordered pairs of nouns in identity or associative anaphora will score high.
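The scoring idea can be sketched as follows. This is a minimal, hypothetical illustration: AltaVista no longer exists, so the `hits()` function below is stubbed with toy counts, and the paper's exact query patterns and normalization are not reproduced.

```python
# Hedged sketch of the web-based association score. hits() stands in for a
# search-engine hit-count API (AltaVista in the paper); the counts here are
# invented for illustration only.
VERBS = ["is", "are", "was", "were", "has", "have", "had",
         "may", "might", "can", "could", "should", "would"]

TOY_HITS = {                      # hypothetical hit counts
    '"the trees are in the forest"': 900,
    '"the forest"': 50000,
    '"the trees are in the car"': 2,
    '"the car"': 80000,
}

def hits(query: str) -> int:
    """Stand-in for a search-engine hit-count API."""
    return TOY_HITS.get(query, 0)

def association_score(associate: str, trigger: str) -> float:
    """Sum pattern hits over the verb set, normalized by the trigger's own
    frequency so that very common nouns do not dominate."""
    joint = sum(hits(f'"the {associate} {v} in the {trigger}"') for v in VERBS)
    marginal = hits(f'"the {trigger}"')
    return joint / marginal if marginal else 0.0
```

With the toy counts, "trees" scores higher with "forest" than with "car", matching the intuition that associated pairs dominate the pattern counts.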
5 Evaluation
First 32 docs from the Brown section of the Treebank
List of potential associates: definite descriptions with a single noun
- No prepositional or relative phrase attached (focus on definite descriptions most susceptible to being anaphoric)
- Exclude definite descriptions whose head noun occurred earlier in the doc
Resulting list: 686 definite descriptions
Task: Identify the trigger, if one exists
All annotated by hand into classes:
1: Anaphoric, with at least one trigger noun
2: Anaphoric, but the trigger is not a single noun (verb or phrase)
3: Definite noun phrase linkable to a trigger by common knowledge ("the pope", "the world", "the past")
4: Definite noun phrase triggered by discourse ("the problem", "the question")
5: Definite description not linkable (beginning of doc)
6: Definite noun phrase inside an idiomatic phrase ("out of the blue", "in the making")
6 Evaluation
Consider all 50 preceding nouns as trigger candidates
Pick the trigger with the highest degree of association
Impose a threshold to obtain a precision-recall graph
Method evaluated on classes 1 and 2 (a trigger actually exists), though it is designed only for class 1
Baselines:
- Pick a random trigger from the 50 preceding nouns: Precision 1.1%, Recall 2.1%
- Pick the closest preceding noun: Precision 3.4%, Recall 6.2%
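The selection-plus-threshold step is straightforward to sketch; the scores below are toy values standing in for the web-based association measure, and the threshold is the knob that trades recall for precision.

```python
# Sketch of trigger selection: among the preceding candidate nouns, pick the
# one with the highest association score, returning None when even the best
# score falls below the threshold.
def pick_trigger(candidates, score_fn, threshold=0.0):
    if not candidates:
        return None
    best = max(candidates, key=score_fn)
    return best if score_fn(best) >= threshold else None

# Toy scores standing in for the web-based association measure.
scores = {"forest": 0.8, "car": 0.1, "book": 0.05}
preceding_nouns = ["book", "car", "forest"]

assert pick_trigger(preceding_nouns, scores.get, threshold=0.5) == "forest"
assert pick_trigger(preceding_nouns, scores.get, threshold=0.9) is None
```

Sweeping the threshold over the test set produces the precision-recall curve mentioned above: a high threshold abstains on weak candidates (higher precision, lower recall), a threshold of zero always answers.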
7 Discussion
Phrase pattern outperforms the NEAR pattern
Important to enforce asymmetry of the relationship
Important that associates need not carry modifiers or relative clauses
Method compares well with Poesio et al., 1998 (Precision/Recall: 22.7%); comparison is complicated since the test sets differ
Errors, Limitations and Future Work
- Sometimes the second-best trigger was correct: distance to the associate should matter
- Sometimes the search engine's results were "inaccurate", e.g. duplicate text in multiple documents
- On some examples the statistics fail. Example: "the Greek" corefers with "a member of a Greek syndicate"
- Sometimes the window size of 50 nouns was not sufficient. Example: "She was just another freighter from the States, … She was the John Harvey … John Harvey … the ship". The association should propagate to all coreferent items
- Cases of coreference where the association measure cannot help but a simple approach can, e.g. the "the latter" construction
- Changes in discourse topic may invalidate strong associations between nouns from different topics
- Method applies only to single-noun (trigger, associate) pairs; wider applicability requires collocation detection
- Trigger candidates could be boosted by WordNet knowledge: synonymy, hyponymy or meronymy
8 Resolving and Generating Definite Anaphora by Modeling Hypernymy using Unlabeled Corpora
By Nikesh Garera and David Yarowsky, Johns Hopkins University
Presenter: Thomas Rodenhausen
9 Intro
Task: (1) Resolving and (2) generating definite anaphora (involving hyponyms)
"...pseudoephedrine is found in an allergy treatment, which was given to Wilson by a doctor when he attended Blinn junior college in Houston. In a unanimous vote, the Norwegian sports confederation ruled that Wilson had not taken the drug to enhance his performance..."
Pick the correct antecedent for "the drug".
10 Intro
Problems:
- Requires knowledge of hypernym/hyponym relationships
- Requires selection of the most appropriate level of generality
Existing hypernym databases (e.g. WordNet):
- are very incomplete
- are non-existent for most languages
- don't provide the "natural" level for anaphora generation
11 Related Work
- Use WordNet as a lexical and semantic resource for certain types of bridging anaphora
- WordNet as a feature in supervised machine learning of coreference resolution
- Corpus-based approaches to building WordNet-like resources
- Corpus-based approaches applied to resolving different types of bridging anaphora
- Extract lexical knowledge about part-of relations using Hearst-style patterns to resolve bridging references
- Suggestion to use a search engine to compute lexical distance between antecedent and definite NP
- Extract relations from the lexico-syntactic pattern "X and other Ys" for other-anaphora and bridging involving meronymy
- Generally, a lack of work on automatically building lexical resources for definite anaphora resolution involving hyponyms
12 Method
Unsupervised models for extracting hypernym relations from co-occurrence data in an unlabeled corpus
TheY Model
Observation: use of "the" indicates an already established concept
(1) "He is taking a new drug for his high cholesterol. The drug is very expensive."
(2) "He is taking Lipitor for his high cholesterol. The drug is very expensive."
Filtering case (1) is easy; the remaining cases are likely instances of hypernymic definite anaphora
Use unsupervised statistical co-occurrence modeling over the entire corpus to find the NP "Lipitor" as the likely antecedent
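The co-occurrence harvesting behind the TheY model can be sketched roughly as below. This is a toy illustration, not the paper's full filtering pipeline: the capitalization test standing in for "unfamiliar NP" detection and the window size are invented simplifications.

```python
# Hedged sketch of TheY-style pair harvesting: pair each definite NP "the Y"
# with candidate antecedent nouns X seen shortly before it, and accumulate
# counts over the whole corpus. The proper-noun heuristic below is a crude
# stand-in for the paper's filtering.
from collections import Counter

def harvest_theY_pairs(sentences, window=10):
    pairs = Counter()
    tokens = [t for s in sentences for t in s.split()]
    for i, tok in enumerate(tokens):
        if tok.lower() == "the" and i + 1 < len(tokens):
            y = tokens[i + 1].strip(".,").lower()
            for x in tokens[max(0, i - window):i]:
                x = x.strip(".,")
                if x[:1].isupper():           # crude proper-noun heuristic
                    pairs[(x, y)] += 1
    return pairs

corpus = ["He is taking Lipitor for his cholesterol .",
          "The drug is very expensive .",
          "Lipitor sales rose . The drug remains popular ."]
pairs = harvest_theY_pairs(corpus, window=10)
```

On this toy corpus, ("Lipitor", "drug") accumulates the highest count among real nouns, which is the signal the model uses to pick "Lipitor" as the antecedent of "the drug".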
13 Method
The WordNet Model
- Choose X if it occurs as a hyponym (direct or indirect inheritance) of Y; if multiple, choose the closest
TheY+WordNet Model
- Difficulty with WordNet alone: low coverage, and poor ranking (lack of an empirical ranking model)
- Thus re-rank possible antecedents retrieved from WordNet by the TheY model
- Use the TheY model as a backoff scheme when WordNet does not cover the case
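The WordNet model's selection rule can be sketched with toy hypernym chains; the chains below are hypothetical and abbreviated, standing in for WordNet's real hierarchy.

```python
# Sketch of the WordNet model's rule: keep candidates X whose hypernym chain
# contains Y, and prefer the one where Y sits closest to X (direct hypernym
# beats indirect). TOY_CHAINS is an invented stand-in for WordNet.
TOY_CHAINS = {
    "Lipitor": ["drug", "agent", "substance", "entity"],
    "alkaloid": ["organic compound", "compound", "substance", "entity"],
}

def choose_antecedent(candidates, y):
    scored = [(TOY_CHAINS[x].index(y), x)       # distance from X up to Y
              for x in candidates
              if y in TOY_CHAINS.get(x, [])]
    return min(scored)[1] if scored else None

assert choose_antecedent(["Lipitor", "alkaloid"], "drug") == "Lipitor"
assert choose_antecedent(["Lipitor", "alkaloid"], "compound") == "alkaloid"
```

Returning `None` when no candidate's chain contains Y is exactly the coverage gap that motivates backing off to the TheY model.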
14 Method
OtherY_freq Model
- Implementation of the corpus-based algorithm of Markert and Nissim (2005) for the equivalent task
- Lexico-syntactic pattern "X and A* other B* Y{pl}" for extracting (X, Y) pairs; A* and B* allow adjectives or other modifiers
- Model uses raw frequency
OtherY_MI Model
- Normalize OtherY using a Mutual Information score
TheY+OtherY_MI Model
- The two models exploit different linguistic phenomena; see if they are complementary
- Combined analogously to the TheY+WordNet model: OtherY for antecedent candidates, TheY for reranking and backoff
OtherY+WordNet Model …
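The MI normalization can be written down concretely; the counts below are toy values, and the paper's exact normalization may differ in detail from this standard pointwise-mutual-information form.

```python
# Sketch of OtherY_MI: pointwise mutual information downweights (X, Y) pairs
# that co-occur in the "X and other Ys" pattern merely because both nouns
# are frequent overall.
import math

def mi_score(pair_count, x_count, y_count, total):
    """PMI = log( P(x, y) / (P(x) * P(y)) ), with counts over the corpus."""
    if not pair_count:
        return float("-inf")
    return math.log((pair_count * total) / (x_count * y_count))

# A specific pairing beats an equally frequent pairing of two common nouns.
assert mi_score(50, 100, 1000, 100000) > mi_score(50, 10000, 10000, 100000)
```

The contrast in the assertion is the point of the normalization: both pairs have the same raw pattern count (50), but the pair built from rarer nouns gets the higher score.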
15 Evaluation Results
Trained and evaluated on the Gigaword corpus
16 Generation Task
Generate "the drug" for antecedent "pseudoephedrine"
Challenges
- Multiple acceptable choices for the definite anaphor, which complicates evaluation
- Space of potential candidates is unbounded, compared to the reverse task (NP within a window)
Human Experiment
- Extract 103 (true antecedent, definite NP) pairs from the corpus
- Let humans choose a parent class that acts as a good definite NP without context
- 79% agreement with the corpus; can be used as an upper bound for the task
- There appears to be a relatively context-independent concept of a "natural" level in the hypernym hierarchy for generating anaphors
- E.g. <"alkaloid", "organic compound", "compound", "substance", "entity"> are all hypernyms of "pseudoephedrine" in WordNet; "the drug" is the preferred hypernym, the others being too specific or too general
- The natural level is difficult to define by rule
17 Generation Task
Models trained basically in the same manner, with differences:
- Frequency statistics reversed to provide a hypernym given a hyponym
- WordNet model uses the direct hypernym, for lack of a better rule
Combined corpus-based approaches and WordNet
- Corpus-based approach looks up hypernym Y of X; only produces Y if Y also occurs in WordNet as a hypernym
Combination of TheY, OtherY and WordNet
- Use the hypernym from OtherY first; if nothing is found, use TheY; then filter by WordNet
Evaluation
- Exact-match agreement with the corpus
- Agreement with the prediction of a human judge in absence of context
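The combined generation scheme can be sketched end to end; all tables below are toy stand-ins for the real corpus statistics and for WordNet, so only the combination logic (OtherY first, TheY as backoff, WordNet as a filter) reflects the slide.

```python
# Hedged sketch of the combined generation model: look up the most frequent
# hypernym Y for hyponym X from (reversed) corpus counts, keeping only Ys
# that the WordNet stand-in also lists as hypernyms of X.
OTHERY = {"pseudoephedrine": {"drug": 40, "stimulant": 5}}   # toy counts
THEY   = {"pseudoephedrine": {"drug": 25, "substance": 10}}  # toy counts
WORDNET_HYPERNYMS = {"pseudoephedrine": {"drug", "compound", "substance", "entity"}}

def generate_anaphor(x):
    wn = WORDNET_HYPERNYMS.get(x, set())
    for table in (OTHERY, THEY):                 # OtherY first, TheY as backoff
        counts = {y: c for y, c in table.get(x, {}).items() if y in wn}
        if counts:
            return "the " + max(counts, key=counts.get)
    return None

assert generate_anaphor("pseudoephedrine") == "the drug"
```

With these toy tables the OtherY counts already yield "the drug"; the TheY backoff would only fire for a hyponym missing from (or filtered out of) the OtherY table.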