Prior Art Search in Chemistry Patents Based On Semantic Concepts and Co-Citation Analysis
Abstract
Prior art search is the task of querying and retrieving patents in order to uncover any knowledge that existed prior to the inventor's question or the invention at hand. To address this task, we present an approach that was evaluated during TREC-CHEM for its ability to adapt to text containing chemistry-based information. The core of the framework is an index of 1.3 million chemistry patents provided as a data set by TREC-CHEM. For the prior art search task, normalized noun phrases and biomedical and chemical entities are added to the full-text index. Altogether, 7 runs were submitted for this task, all based on automatic querying with tokens, noun phrases, and entities. In addition, co-citation information was exploited systematically to generate ranked citation sets from the retrieved documents. Querying with noun phrases and entities, coupled with co-citation-based post-processing, performed best, reaching a MAP score of 0.23.
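The MAP (mean average precision) score reported in the abstract is a standard ranked-retrieval metric. As a minimal sketch of how such a score is typically computed, and not the track's official evaluation tooling, the following Python assumes each query patent has a ranked list of retrieved patent IDs and a set of known relevant citations. The function names and the sample IDs are hypothetical and used only for illustration.

```python
from typing import Iterable, Sequence, Set, Tuple


def average_precision(ranked: Sequence[str], relevant: Set[str]) -> float:
    """Average precision of one ranked list against a set of relevant IDs.

    Assumes the ranked list contains no duplicate IDs.
    """
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, doc_id in enumerate(ranked, start=1):
        if doc_id in relevant:
            hits += 1
            # Precision at this rank, counted only where a relevant item appears.
            precision_sum += hits / rank
    # Relevant items that were never retrieved contribute zero.
    return precision_sum / len(relevant)


def mean_average_precision(
    queries: Iterable[Tuple[Sequence[str], Set[str]]]
) -> float:
    """Mean of the per-query average precision values."""
    scores = [average_precision(ranked, relevant) for ranked, relevant in queries]
    return sum(scores) / len(scores) if scores else 0.0


# Hypothetical example: two query patents with made-up document IDs.
example_queries = [
    (["p1", "p2", "p3", "p4"], {"p1", "p3"}),
    (["a", "b"], {"b", "c"}),
]
example_map = mean_average_precision(example_queries)
```

Because unretrieved relevant documents lower the score, a MAP of 0.23 reflects both how many cited patents a run finds and how early in the ranking they appear.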
Similar resources
Patent Retrieval in Chemistry Based on Semantically Tagged Named Entities
This paper reports on the work conducted by Fraunhofer SCAI for the TREC Chemistry (TREC-CHEM) track 2009. The Fraunhofer SCAI team participated in two tasks, namely Technology Survey and Prior Art Search. The core of the framework is an index of 1.2 million chemical patents provided as a data set by TREC. For the technology survey, three runs were submitted based on semantic dicti...
Exploring Keyphrase Extraction and IPC Classification Vectors for Prior Art Search
In this paper we describe experiments conducted for the CLEF-IP 2011 Prior Art Retrieval track. We examined the impact of 1) using key phrase extraction to generate queries from the input patent and 2) using the citation network and the International Patent Classification (IPC) class vector in ranking patents. Variations of a popular key phrase extraction technique were explored for extracting and scoring ...
Formulating Simple Structured Queries Using Temporal and Distributional Cues in Patents
Patent prior art retrieval aims to find related publications, especially patents, which may invalidate the patent. The task exhibits its own characteristic because of the possible use of a whole patent as a query. This work focuses on the use of date fields and content fields of the query patent to formulate effective structured queries. Retrieval is performed on the collection of patents which...
Query Enhancement for Patent Prior-Art-Search Based on Keyterm Dependency Relations and Semantic Tags
Prior art search is one of the most common forms of patent search; its goal is to find patent documents that constitute prior art for a given patent being examined. Current patent search systems are mostly keyword-based, and because of the unique characteristics of patents and their usage, such as embedded structure and the length of patent documents, there is room for further improvement. In ...
Developing Semantic Search for the Patent Domain
The patent domain is a very important source of scientific information that is currently not used to its full potential. Issues such as high numbers of patents, complicated language style and inconsistently used vocabulary make the task of searching for relevant patents extremely complex. While this is already a problem for patent professionals who have to invest a lot of time and effort into t...