Improving Legal Information Retrieval by Distributional Composition with Term Order Probabilities
نویسندگان
چکیده
The current information analysis capabilities of legal professionals are still lagging behind the explosive growth in legal document availability through digital means, driving the need for higher efficiency Legal Information Retrieval (IR) and Question Answering (QA) methods. The IR task in particular has a set of unique challenges that invite the use of semantic motivated NLP techniques. In this work, a two-stage method for Legal Information Retrieval is proposed, combining lexical statistics and distributional sentence representations in the context of Competition on Legal Information Extraction/Entailment (COLIEE). The combination is done by means of disambiguation rules, applied over the lexical rankings when those deemed unreliable for a given query. Competition and experimental results indicate small gains in overall retrieval performance using the proposed approach. Additionally, a analysis of error and improvement cases is presented for a better understanding of the contributions.
منابع مشابه
Efficient Neural-based patent document segmentation with Term Order Probabilities
The internationally growing trend of patent applications puts great pressure on the agents involved in managing this kind of information and creates a demand for efficient and effective patent analysis methods. This work presents a computationally efficient approach for patent document segmentation based on structured ANNs and a simple distributional semantics composition method. The conducted ...
متن کاملImproving distributional thesauri by exploring the graph of neighbors
In this paper, we address the issue of building and improving a distributional thesaurus. We first show that existing tools from the information retrieval domain can be directly used in order to build a thesaurus with state-of-the-art performance. Secondly, we focus more specifically on improving the obtained thesaurus, seen as a graph of k-nearest neighbors. By exploiting information about the...
متن کاملSteganography Scheme Based on Reed-Muller Code with Improving Payload and Ability to Retrieval of Destroyed Data for Digital Images
In this paper, a new steganography scheme with high embedding payload and good visual quality is presented. Before embedding process, secret information is encoded as block using Reed-Muller error correction code. After data encoding and embedding into the low-order bits of host image, modulus function is used to increase visual quality of stego image. Since the proposed method is able to embed...
متن کاملExploring the neighbor graph to improve distributional thesauri (Explorer le graphe de voisinage pour améliorer les thésaurus distributionnels) [in French]
In this paper, we address the issue of building and improving a distributional thesaurus. We first show that existing tools from the information retrieval domain can be directly used in order to build a thesaurus with state-of-the-art performance. Secondly, we focus more specifically on improving the obtained thesaurus, seen as a graph of k-nearest neighbors. By exploiting information about the...
متن کاملDistributional Semantic Representation for Text Classification and Information Retrieval
The objective of this experiment is to validate the performance of the distributional semantic representation of text in the classification (Question Classification) task and the Information Retrieval task. Followed by the distributional representation, first level classification of the questions is performed and relevant tweets with respect to the given queries are retrieved. The distributiona...
متن کامل