Hints from the Crowd: A Novel NoSQL Database

Authors

  • Paolo Fosci
  • Giuseppe Psaila
  • Marcello Di Stefano
Abstract

The crowd can be an incredible source of information. This is particularly true of product reviews, which customers freely provide through specialized web sites. Such reviews are social knowledge that other customers can exploit. The Hints From the Crowd (HFC) prototype, presented in this paper, is a NoSQL database system for large collections of product reviews. The database is queried with a natural language sentence, and the result is a list of products ranked by the relevance of their reviews to that sentence. The best-ranked products in the result list can be seen as the best hints for the user, based on crowd opinions (the reviews). In this paper, we mainly describe the query engine, and we show that our prototype obtains good performance in terms of execution time, demonstrating that our approach is feasible. Performance is evaluated on the IMDb dataset, which includes more than 2 million reviews for more than 100,000 movies.
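The query model described in the abstract (a natural language sentence in, a ranked list of products out) can be illustrated with a minimal sketch. The abstract does not state how HFC measures relevance, so everything below is an assumption made for illustration: the function names `tokenize`, `build_index`, and `rank_products` are hypothetical, the sample reviews are invented, and the score is a plain count of query-term matches over a product's reviews rather than the authors' actual ranking function.

```python
from collections import defaultdict
import re


def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(reviews):
    """Map each term to the set of (product, review_id) pairs containing it.

    `reviews` is a list of (product, review_text) pairs (hypothetical layout).
    """
    index = defaultdict(set)
    for review_id, (product, text) in enumerate(reviews):
        for term in set(tokenize(text)):
            index[term].add((product, review_id))
    return index


def rank_products(index, query):
    """Rank products by the number of (query term, review) matches.

    Assumed scoring, NOT the HFC relevance measure: each review that
    contains a query term adds one point to its product.
    """
    scores = defaultdict(int)
    for term in set(tokenize(query)):
        for product, _review_id in index.get(term, ()):
            scores[product] += 1
    # Highest score first; ties broken alphabetically for determinism.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Invented sample data, for illustration only.
reviews = [
    ("Movie A", "A funny and touching family comedy"),
    ("Movie A", "Funny script, great cast"),
    ("Movie B", "Dark thriller with a slow start"),
    ("Movie C", "A comedy that is not very funny"),
]
index = build_index(reviews)
ranking = rank_products(index, "funny comedy")
print(ranking)
```

Products with no matching review are simply absent from the result. A real engine would likely weight terms and handle negation ("not very funny"), which this sketch deliberately ignores.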

Similar resources

Querying NoSQL-based Crowdsourcing Systems Efficiently

In this paper, we provide a novel approach for effectively and efficiently supporting query processing tasks in novel NoSQL crowdsourcing systems. The idea of our method is to exploit the social knowledge available from reviews about products of any kind, freely provided by customers through specialized web sites. We thus define a NoSQL database system for large collections of product reviews, whe...

PAX: Partition-Aware Autoscaling for the Cassandra NoSQL Database

Apache Cassandra has emerged as one of the most widely adopted NoSQL databases. However, there is still a limited understanding of how to optimally operate Cassandra in the cloud using autoscaling methods, by which resources can be scaled up or down to reduce operational costs and meet service-level objectives (SLOs). To address this limitation, we present PAX, a partition-aware elastic resource...

Database Design for NoSQL Systems

The popularity of NoSQL database systems is rapidly increasing, especially to support next-generation web applications. However, given the high heterogeneity existing in this world, where more than fifty systems are available, database design is usually based on best practices and guidelines which are strictly related to the selected system. We propose a database design methodology for NoSQL sys...

NOSQL Design for Analytical Workloads: Variability Matters

Big Data has recently gained popularity and has strongly questioned relational databases as universal storage systems, especially in the presence of analytical workloads. As a result, co-relational alternatives, commonly known as NOSQL (Not Only SQL) databases, are extensively used for Big Data. As the primary focus of NOSQL is on performance, NOSQL databases are directly designed at the physical...

Renormalization of NoSQL Database Schemas

NoSQL applications often use denormalized databases in order to meet performance goals, but this introduces complications. In particular, application evolution may demand changes in the underlying database, which may in turn require further application revisions. The NoSQL DBMS itself can do little to aid in this process, as it has no understanding of application-level denormalization. In this ...

Publication date: 2013