Wordnets: semantic networks of words and concepts
and their application in natural language processing
Piek Vossen
Contact
Faculteit der Letteren
Vrije Universiteit van Amsterdam
De Boelelaan 1105
1081 HV Amsterdam
Room 11A26
Tel. +31 (0)20 5986457
Fax. +31 (0)20 5986500
piek.vossen@vu.nl
http://www.vossen.info/
Course level
Intermediate
Course description
This course provides background information on wordnets and discusses the main issues involved in building wordnets, comparing them, and using them in NLP applications. Participants will build a small wordnet themselves and compare it with the English WordNet and with more formal models such as ontologies.
The English WordNet was built as an implementation of a model of the mental lexicon. It separates the lexicalization of concepts in a language from the conceptual relations between concepts. It has been used extensively in natural language processing (NLP), even though it was not designed for that purpose. The wordnet model has since been applied to many more languages, and this multilinguality has raised new questions about what a word is, what a concept is, and which lexical ambiguities languages exhibit.
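As a rough illustration of the separation between lexicalization and conceptual relations, the sketch below models a concept as a synset: a set of synonymous word forms plus links to other concepts. All names and data here (`Synset`, `SYNSETS`, `synsets_for`, `hypernym_chain`, and the identifiers such as `dog.n`) are hypothetical and invented for this example; they do not reflect the actual WordNet database format or any library API.

```python
from dataclasses import dataclass, field


@dataclass
class Synset:
    """A concept: synonymous word forms plus links to other concepts."""
    id: str
    lemmas: list[str]                                    # lexicalization in one language
    hypernyms: list[str] = field(default_factory=list)   # conceptual relation (synset ids)


# Toy data, invented for illustration only.
SYNSETS = {
    s.id: s
    for s in [
        Synset("animal.n", ["animal", "beast"]),
        Synset("dog.n", ["dog", "domestic dog"], hypernyms=["animal.n"]),
        Synset("cat.n", ["cat"], hypernyms=["animal.n"]),
    ]
}


def synsets_for(word: str) -> list[Synset]:
    """Return every concept that the given word form can lexicalize."""
    return [s for s in SYNSETS.values() if word in s.lemmas]


def hypernym_chain(synset_id: str) -> list[str]:
    """Follow the first hypernym link upward until a root concept is reached."""
    chain = []
    current = SYNSETS[synset_id]
    while current.hypernyms:
        current = SYNSETS[current.hypernyms[0]]
        chain.append(current.id)
    return chain
```

In this sketch the word forms live only in `lemmas`, while relations hold between synset identifiers, so the same concept network could in principle be given lemma lists for another language without changing the relations.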
Day-to-day program
Monday
The English WordNet: language as a conceptual model
Tuesday
EuroWordNet: multilingual perspective on wordnets
Wednesday
Global WordNet: word and concept
Thursday
Wordnet applications: detecting word meanings in context
Friday
Building your own wordnet: comparing and discussing domain wordnets
Reading materials
Background and preparatory readings
Websites:
English WordNet: http://wordnet.princeton.edu/wordnet/
EuroWordNet: http://www.illc.uva.nl/EuroWordNet/
Global WordNet: http://www.globalwordnet.org/
Cornetto: http://www2.let.vu.nl/oz/cornetto/index.html
Books:
Fellbaum, C. (ed.) 1998 WordNet: An Electronic Lexical Database. MIT Press. ISBN-10: 0-262-06197-X, ISBN-13: 978-0-262-06197-1
Vossen, P (ed.) 1998 EuroWordNet: a multilingual database with lexical semantic networks for European Languages. Kluwer, Dordrecht.
Course readings
Lecture 1:
You can download the literature for the first lecture from: http://wordnetcode.princeton.edu/5papers.pdf
You need to read the first four chapters (61 pages):
George A. Miller, Richard Beckwith, Christiane Fellbaum, Derek Gross, and Katherine Miller, Introduction to WordNet: An On-line Lexical Database, In: 5 papers on WordNet, p. 1-9
George A. Miller, Nouns in WordNet: A Lexical Inheritance System, In: 5 papers on WordNet, p. 10-25
Christiane Fellbaum, Derek Gross & Katherine Miller, Adjectives in WordNet, In: 5 papers on WordNet, p. 26-39
Christiane Fellbaum, English Verbs as a Semantic Net, In: 5 papers on WordNet, p. 40-61
Lecture 2:
Vossen, P. 2002 EuroWordNet General Document. EuroWordNet Project LE2-4003 & LE4-8328 report, University of Amsterdam.
To be downloaded from: http://www.vossen.info/docs/2002/EWNGeneral.pdf
You only need to read pages 1-45
Lecture 3:
Vossen P. "WordNet: principles, developments and applications", chapter in Handbook of Linguistics and Communications (HSK), volume Dictionaries, an International Encyclopedia of Lexicography, Supplementary volume: Recent developments with special focus on computational lexicography. Eds. R.H. Gouws, U. Heid, W. Schweickard, H.E. Wiegand, Mouton de Gruyter, Berlin, Germany. To be downloaded from: http://kyoto.let.vu.nl/~vossen/113VossenWordNet3.pdf
Lecture 4:
Agirre, E. and Edmonds, P. (eds.) (2006): Word Sense Disambiguation: Algorithms and Applications. New York. Only the introductory chapter, pages 1-28. To be downloaded from: http://kyoto.let.vu.nl/~vossen/Chapter1.v16.pdf
Vossen, P., A. Gorog, F. Laan, M. Van Gompel, R. Izquierdo, A. van den Bosch, "DutchSemCor: building a semantically annotated corpus for Dutch", in: Proceedings of Electronic Lexicography in the 21st century: New Applications for new users (eLEX2011), Iztok Kosem, Karmen Kosem (Eds.), Publ. Trojina Institute for Applied Slovene Studies, ISBN 9789619298336, p. 286-296, Bled, Slovenia, November 10-12, 2011. To be downloaded from: http://kyoto.let.vu.nl/~vossen/DutchSemCor.pdf
Lecture 5:
No readings