Stanford InfoLab Publication Server

Learning syntactic patterns for automatic hypernym discovery

Snow, Rion and Jurafsky, Daniel and Ng, Andrew Y. (2004) Learning syntactic patterns for automatic hypernym discovery. In: Advances in Neural Information Processing Systems (NIPS 2004), December 13-18, 2004, Vancouver, British Columbia.




We present a new algorithm for learning hypernym (is-a) relations from text, a key problem in machine learning for natural language understanding. This method generalizes earlier work that relied on hand-built lexico-syntactic patterns by introducing a general-purpose formalization of the pattern space based on syntactic dependency paths. We learn these paths automatically by taking hypernym/hyponym word pairs from WordNet, finding sentences containing these words in a large parsed corpus, and automatically extracting these paths. These paths are then used as features in a high-dimensional representation of noun relationships. We use a logistic regression classifier based on these features for the task of corpus-based hypernym pair identification. Our classifier is shown to outperform previous pattern-based methods for identifying hypernym pairs (using WordNet as a gold standard), and is shown to outperform those methods as well as WordNet on an independent test set.
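
The pipeline in the abstract (dependency paths between noun pairs used as features for a logistic regression classifier) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the noun pairs, the path strings, the labels standing in for WordNet lookups, and the plain stochastic gradient training loop with its hyperparameters. It does not use a real parser or WordNet, and it is not the authors' implementation.

```python
import math
from collections import Counter

# Hypothetical toy data: (noun pair, dependency paths observed between the two
# nouns, label). In the paper the paths come from a parsed corpus and the
# labels from WordNet; here both are invented. X is the first noun, Y the second.
TRAINING = [
    (("dog", "animal"), ["Y such_as X", "X and_other Y"], 1),
    (("oak", "tree"), ["X is_a Y", "X and_other Y"], 1),
    (("paris", "city"), ["Y such_as X", "X is_a Y"], 1),
    (("dog", "cat"), ["X and Y", "X or Y"], 0),
    (("oak", "leaf"), ["X of Y", "X and Y"], 0),
    (("paris", "london"), ["X or Y", "X of Y"], 0),
]


def featurize(paths):
    """Map a list of observed paths to sparse count features."""
    return Counter(paths)


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(weights, bias, paths):
    """Return the estimated probability that the pair is a hypernym pair."""
    feats = featurize(paths)
    z = bias + sum(weights.get(path, 0.0) * count for path, count in feats.items())
    return sigmoid(z)


def train(examples, epochs=200, lr=0.5):
    """Fit logistic regression weights with plain stochastic gradient ascent.

    The epochs and lr values are arbitrary choices for this toy data.
    """
    weights, bias = {}, 0.0
    for _ in range(epochs):
        for _pair, paths, label in examples:
            error = label - predict(weights, bias, paths)
            for path, count in featurize(paths).items():
                weights[path] = weights.get(path, 0.0) + lr * error * count
            bias += lr * error
    return weights, bias


weights, bias = train(TRAINING)
p_pos = predict(weights, bias, ["Y such_as X"])  # path seen only with hypernym pairs
p_neg = predict(weights, bias, ["X and Y"])      # path seen only with non-hypernym pairs
print(f"such_as: {p_pos:.2f}  and: {p_neg:.2f}")
```

On this toy data the learned weight for a path such as "Y such_as X" is positive and the weight for "X and Y" is negative, so the classifier scores the first above 0.5 and the second below it. A path never seen in training contributes nothing and the score falls back to the bias term.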

Item Type: Conference or Workshop Item (Paper)
Additional Information: This is a draft version from the NIPS preproceedings; the final version will be published by April 2005.
Subjects: Computer Science > Data Mining
          Computer Science > Semistructured Data
Related URLs: Project Homepage
ID Code: 665
Deposited By: Import Account
Deposited On: 20 Nov 2004 16:00
Last Modified: 23 Dec 2008 09:47
