A new model for linguistic summarization of heterogeneous data: an application to tourism web data sources

Authors

  • Ramón Alberto Carrasco
  • Pedro Villar
Abstract

This paper addresses the problem of aggregating heterogeneous data from various websites that collect opinions about high-end hotels into a single database. We present a fuzzy model based on semantic translation as a tool for obtaining a linguistic summarization. The characteristics of this model, all necessary to solve the problem, are not found together in any existing linguistic model: the management of heterogeneous input data (natural language included); the production of linguistic results with high precision and good interpretability; and the use of unbalanced linguistic term sets, described by trapezoidal membership functions, to define the initial linguistic terms. We apply the model to aggregate data from certain high-end hotel websites and present a case study covering the high-end hotels in Granada (Spain) listed on those websites over one year. With this aggregated information, a data analyst can carry out several analyses with the benefit of easy linguistic interpretability and high precision. The proposed solution can be applied to similar aggregation problems.
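
To make the last of these characteristics concrete, the following is a minimal Python sketch of a trapezoidal membership function and an unbalanced linguistic term set. The 0-10 rating scale, the four term labels, and their breakpoints are hypothetical values chosen for illustration; they are not taken from the paper, and the sketch does not implement the paper's semantic-translation model.

```python
def trapezoid(x, a, b, c, d):
    """Membership degree of x in a trapezoidal fuzzy set with a <= b <= c <= d.

    The degree rises linearly on [a, b], equals 1 on [b, c],
    and falls linearly on [c, d].
    """
    if x < a or x > d:
        return 0.0
    if b <= x <= c:
        return 1.0
    if x < b:
        return (x - a) / (b - a)
    return (d - x) / (d - c)


# Hypothetical unbalanced term set on a 0-10 rating scale:
# the terms are narrower (finer-grained) toward the top of the scale.
TERMS = {
    "poor": (0, 0, 3, 5),
    "good": (3, 5, 6, 7.5),
    "very good": (6, 7.5, 8, 9),
    "excellent": (8, 9, 10, 10),
}


def best_term(x):
    """Return the label whose trapezoid gives x the highest membership."""
    return max(TERMS, key=lambda label: trapezoid(x, *TERMS[label]))


if __name__ == "__main__":
    for rating in (2, 7, 8.75):
        print(rating, best_term(rating))
```

In this sketch a rating of 7 belongs partly to "good" and partly to "very good"; `best_term` simply reports the label with the larger degree, which is a simplification of how a linguistic model would represent the result.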

Similar resources

Homogeneous EDI between Heterogeneous Web-Based Tourism Information Systems

In recent years the tourism industry has realized the potential of Web-based tourism information systems (TIS) to increase competitiveness by providing individual and specialized information about tourism objects. This has led to a broad spectrum of tourism information systems distributed over various Web sites. But the described situation is not really satisfying for the users of such sys...

A Persian Text Summarization System Based on Linguistic Features and Regression

Considering the vast amount of existing written information and the shortage of time, optimal summarization of books, articles, news reports, etc. on the Web is a major concern of researchers. In this paper, we propose a new approach for Persian single-document summarization based on several linguistic features of text. In our approach, after extracting the linguistic features for each sentence,...

A survey on Automatic Text Summarization

Text summarization endeavors to produce a summary version of a text while maintaining the original ideas. The textual content on the web, in particular, is growing at an exponential rate. The ability to sift through such a massive amount of data in order to extract the useful information is a major undertaking and requires an automatic mechanism to aid with the extant repository of informa...

EXTRACTION-BASED TEXT SUMMARIZATION USING FUZZY ANALYSIS

Due to the explosive growth of the world-wide web, automatic text summarization has become an essential tool for web users. In this paper we present a novel approach for creating text summaries. Using fuzzy logic and word-net, our model extracts the most relevant sentences from an original document. The approach utilizes fuzzy measures and inference on the extracted textual information from the docu...

Anomaly-based Web Attack Detection: The Application of Deep Neural Network Seq2Seq With Attention Mechanism

Today, the use of the Internet and Internet sites has become an integral part of people’s lives, and most activities and important data reside on Internet websites. Thus, attempts to intrude into these websites have grown exponentially. Intrusion detection systems (IDS) for web attacks are one approach to protecting users. However, these systems suffer from drawbacks such as low accuracy in ...

Journal:
  • Soft Comput.

Volume 16  Issue 

Pages  -

Publication date: 2012