In this paper, we present a system for event recognition and classification in video surveillance sequences. First, local invariant descriptors of video frames are employed to remove background information and segment the video into events. Next, visual word histograms are computed for each video event and used to define a distance measure between events. Finally, machine learning techniques ar...