Common evaluations have become a major component of all ARPA Human Language Technology programs. In the written language community, the largest evaluation program has been the series of Message Understanding Conferences, which began in 1987 [2,3]. These evaluations have focused on the task of analyzing text and automatically filling templates that describe certain classes of events. Thes...