Information Extraction for Call for Paper

نویسندگان

  • Laurent Issertial
  • Hiroshi Tsuji
چکیده

This paper proposes a system called CFP Manager specialized on IT field and designed to ease the process of searching conference suitable to one’s need. At present, the handling of CFP faces two problems: for emails, the huge quantity of CFP received can be easily skimmed through. For websites, the reviewing of some of the main CFP aggregators available online points out the lack of usable criteria. This system proposes to answer to these problems via its architecture consisting of three components: firstly an Information Extraction module extracting relevant information (as date, location, etc...) from CFP using rule based text mining algorithm. The second component enriches the now extracted data with external one from ontology models. Finally the last one displays the said data and allows the end user to perform complex queries on the CFP dataset and thus allow him to only access to CFP suitable for him. In order to validate the authors’ proposal, they eventually process the well-known precision / recall metric on our information extraction component with an average of 0.95 for precision and 0.91 for recall on three different 100 CFP dataset. This paper finally discusses the validity of our approach by confronting our system for different queries with two systems already available online (WikiCFP and IEEE Conference Search) and basic text searching approach standing for searching in an email box. On a 100 CFP dataset with the wide variety of usable data and the possibility to perform complex queries we surpass basic text searching method and WikiCFP by not returning the false positive usually returned by them and find a result close to the IEEE system. Information Extraction for Call for Paper

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Data Extraction using Content-Based Handles

In this paper, we present an approach and a visual tool, called HWrap (Handle Based Wrapper), for creating web wrappers to extract data records from web pages. In our approach, we mainly rely on the visible page content to identify data regions on a web page. In our extraction algorithm, we inspired by the way a human user scans the page content for specific data. In particular, we use text fea...

متن کامل

An Operator Assisted Call Routing System

A system to assist call routing task for telephone operators at the Directorate General of Telecommunications (DGT) in Taiwan is reported in this paper. The system was developed based on DGT organization profile with description of its six divisions instead of a corpus of recorded and transcribed call-routing dialogs. An acoustic module and an information retrieval module were built specificall...

متن کامل

Presenting a method for extracting structured domain-dependent information from Farsi Web pages

Extracting structured information about entities from web texts is an important task in web mining, natural language processing, and information extraction. Information extraction is useful in many applications including search engines, question-answering systems, recommender systems, machine translation, etc. An information extraction system aims to identify the entities from the text and extr...

متن کامل

A New Method for Improving Computational Cost of Open Information Extraction Systems Using Log-Linear Model

Information extraction (IE) is a process of automatically providing a structured representation from an unstructured or semi-structured text. It is a long-standing challenge in natural language processing (NLP) which has been intensified by the increased volume of information and heterogeneity, and non-structured form of it. One of the core information extraction tasks is relation extraction wh...

متن کامل

A Two-Threshold Guard Channel Scheme for Minimizing Blocking Probability in Communication Networks

In this paper, we consider the call admission problem in cellular network with two classes of voice users. In the first part of paper, we introduce a two-threshold guard channel policy and study its limiting behavior under the stationary traffic. Then we give an algorithm for finding the optimal number of guard channels. In the second part of this paper, we give an algorithm, which minimizes th...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:
  • IJKSS

دوره 6  شماره 

صفحات  -

تاریخ انتشار 2015