Multiword Expression Filtering for Building Knowledge Maps

Publication Date

January 2004

Document Type

Article

Publication Title

ACL Workshop on Multiword Expressions: Integrating Processing

Abstract

This paper describes an algorithm for improving the quality of multiword expressions extracted from documents. We measure the quality of a multiword expression by its "usefulness" in helping ontologists build knowledge maps that allow users to search a large document corpus. Our stopword-based algorithm takes n-grams extracted from documents and cleans them up to make them more suitable for building knowledge maps. Running the algorithm on large document corpora has shown that it helps to increase the percentage of useful terms from 40% to 70%, with an eight-fold improvement observed in some cases.
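
The abstract does not spell out the filtering rules, so the following is only an illustrative sketch of one common stopword-based cleanup, not the paper's algorithm: strip stopwords from both ends of each n-gram, drop n-grams that end up empty, and merge duplicates. The stopword list and the function names `clean_ngram` and `filter_ngrams` are assumptions made for this example.

```python
# Hypothetical stopword list for illustration; a real system would use a
# much larger, possibly domain-specific, list.
STOPWORDS = frozenset({
    "a", "an", "the", "of", "in", "on", "for", "to",
    "and", "or", "is", "are", "with",
})


def clean_ngram(ngram, stopwords=STOPWORDS):
    """Strip stopwords from the start and end of a whitespace-tokenized n-gram.

    Stopwords in the interior of the n-gram are left alone.
    Returns the cleaned, lowercased term, or "" if nothing remains.
    """
    tokens = ngram.lower().split()
    start, end = 0, len(tokens)
    # Advance past leading stopwords.
    while start < end and tokens[start] in stopwords:
        start += 1
    # Retreat past trailing stopwords.
    while end > start and tokens[end - 1] in stopwords:
        end -= 1
    return " ".join(tokens[start:end])


def filter_ngrams(ngrams, stopwords=STOPWORDS):
    """Clean each n-gram, drop empty results, and remove duplicates.

    The order of first appearance is preserved.
    """
    seen = set()
    cleaned = []
    for ngram in ngrams:
        term = clean_ngram(ngram, stopwords)
        if term and term not in seen:
            seen.add(term)
            cleaned.append(term)
    return cleaned


if __name__ == "__main__":
    candidates = [
        "of the",
        "the knowledge map",
        "knowledge map of",
        "search in a large corpus",
    ]
    print(filter_ngrams(candidates))
```

In this sketch, "the knowledge map" and "knowledge map of" collapse into the single candidate "knowledge map", while "of the" is discarded entirely. Interior stopwords are kept so that expressions such as "search in a large corpus" stay intact.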

Keywords

multiword expression, knowledge maps
