Update Legal Documents Using Hierarchical Ranking Models and Word Clustering
Abstract
Our research addresses the task of updating legal documents when new information emerges. In this paper, we apply a hierarchical ranking model to this task. Word clustering features are incorporated into the ranking models to exploit semantic relations between words. Experimental results on legal data built from the United States Code show that the hierarchical ranking model with word clustering outperforms baseline methods that use the Vector Space Model, and that word cluster-based features are effective for the task.
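For orientation, the Vector Space Model baseline named in the abstract can be sketched roughly as follows. This is not the authors' hierarchical ranking model and does not use word-cluster features. It only illustrates one common form of the baseline: TF-IDF weighting with cosine similarity, used to rank sections of a document against a piece of new information. The function names (`tokenize`, `tfidf_vectors`, `cosine`, `rank_sections`), the whitespace tokenizer, the smoothed IDF formula, and the toy sections are all assumptions made for illustration.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: naive lowercase whitespace tokenization, for illustration only.
    return text.lower().split()


def tfidf_vectors(docs):
    """Build sparse TF-IDF vectors (dicts) for docs and return them with the IDF table."""
    tokenized = [tokenize(d) for d in docs]
    n = len(tokenized)
    # Document frequency: number of docs that contain each term.
    df = Counter(term for toks in tokenized for term in set(toks))
    # Smoothed IDF (one common variant; other formulas are equally valid).
    idf = {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}
    vectors = [{t: tf * idf[t] for t, tf in Counter(toks).items()} for toks in tokenized]
    return vectors, idf


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank_sections(new_info, sections):
    """Return (section_index, score) pairs sorted by descending similarity to new_info."""
    vectors, idf = tfidf_vectors(sections)
    # Terms unseen in the sections carry no weight in this sketch.
    query = {t: tf * idf[t] for t, tf in Counter(tokenize(new_info)).items() if t in idf}
    scored = [(i, cosine(query, v)) for i, v in enumerate(vectors)]
    return sorted(scored, key=lambda pair: -pair[1])


# Hypothetical toy data, not taken from the United States Code.
sections = [
    "tax on imported goods and customs duties",
    "copyright protection for literary works",
    "bankruptcy procedure for debtors",
]
ranking = rank_sections("new customs duties on imported steel", sections)
```

Here `ranking` places the first section on top because it is the only one that shares terms with the new information. A purely lexical baseline like this scores sections with no shared words at zero, which is the kind of gap that word-cluster features are meant to narrow.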
Related resources
Order-theoretical ranking
Current best-match ranking (BMR) systems perform well but cannot handle word mismatch between a query and a document. The best known alternative ranking method, hierarchical clustering-based ranking (HCR), seems to be more robust than BMR with respect to this problem, but it is hampered by theoretical and practical limitations. We present an approach to document ranking that explicitly addresse...
A hierarchical model for clustering
We propose a new hierarchical generative model for textual data, where words may be generated by topic-specific distributions at any level in the hierarchy. This model is naturally well-suited to clustering documents in preset or automatically generated hierarchies, as well as categorising new documents in an existing hierarchy. Training algorithms are derived for both cases, and illustrated on ...
Clustering multilingual documents by estimating text-to-text semantic relatedness
This thesis is about multilingual document clustering through estimating semantic relatedness between multilingual texts. Specifically, we focus on the task of clustering multilingual documents with very limited or no supervisory information. We present two approaches to address the problem: a comparable-corpora-based approach and a web-search-based approach. Our first approach derives pairwi...
A probabilistic topic model based on local word relations in overlapping windows
A probabilistic topic model assumes that documents are generated through a process involving topics, and then tries to reverse this process to extract the topics from the given documents. A topic is usually assumed to be a distribution over words. LDA is one of the first and most popular topic models introduced so far. In the document generation process assumed by LDA, each document is a distribution o...
Intellectual structure of knowledge in Nanomedicine field (2009 to 2018): A Co-Word Analysis
Introduction: Co-word analysis can identify the intellectual structure of knowledge in a research domain and reveal its subsurface research aspects. Objective: This study examines the intellectual structure of knowledge in the field of nanomedicine during the period 2009 to 2018 using co-word analysis. Materials and Methods: This paper develops a sciento...