Community based question answer detection
Abstract
Each day, millions of people ask questions and search for answers on the World Wide Web. As a result, the Internet has grown into a worldwide database of questions and answers, accessible to almost everyone. Because this database is so large, it is hard to find out whether a question has been answered or even asked before. As a consequence, users ask the same questions again and again, producing a vicious circle of new content that hides the important information. Web forums, also known as discussion boards, are one platform for questions and answers. They present discussions as item streams in which each item contains the contribution of one author. These contributions contain questions and answers in human-readable form. People use search engines to search for information on such platforms. However, current search engines are optimized neither to highlight individual questions and answers nor to show which questions are asked often and which are already answered. To close this gap, this thesis introduces the Effingo system. The Effingo system is intended to extract forums from around the Web and find question and answer items. It also needs to link equal questions and aggregate the associated answers. That way it is possible to find out whether a question has been asked before and whether it has already been answered. Based on this information, it is possible to derive the most urgent questions from the system and to determine which ones are new and which ones are discussed and answered frequently. As a result, users are prevented from creating useless discussions, which reduces the server load and the information overload for further searches. The first research area explored by this thesis is forum data extraction. The results from this area are intended to be used to create as large a database of forum posts as possible.
Furthermore, it uses question-answer detection to find out which forum items are questions and which ones are answers and, finally, topic detection to aggregate questions on the same topic as well as to discover duplicate answers. These areas are either extended by Effingo, using forum-specific features such as the user graph, forum item relations and forum link structure, or adapted to cope with the specific problems created by user-generated content. Such problems arise from poorly written and very short texts as well as from hidden or distributed information.
Similar resources
Non-factoid Question Answering Experiments at NTCIR-6: Towards Answer Type Detection for Realworld Questions
In this paper, we investigate the answer type detection methods for realizing the Universal Question Answering (UQA), which returns an answer for any given question. For this purpose, the questions collected from a WWW question portal community site were analyzed to see how many kinds of questions were submitted in the real world. Then, we introduce the approach for UQA and proposed two methods...
The patterns and behaviors of researchers’ knowledge sharing in scientific social networks: A Case Study of Research Gate’ Question And Answer System
Aim: Scientific social networks were shaped as part of a set of social software and a platform for international interactions sharing the tangible and intangible knowledge of researchers. The purpose is to investigate the patterns and behaviors of knowledge sharing of researchers in Research Gate. Based on this, the question and answer system of this scientific social network was analyzed and r...
Community Detection Based on Social Network Analysis in Question and Answer Systems
A community-based question and answer (CQA) system is an integrated Internet platform for users to share knowledge, and it has become very popular in recent years. In this paper, we study social network structures of CQA systems based on the dataset collected from “Baidu Knows”, which is the largest CQA system in China. By exploiting the question-answer interactions among the users, we construct t...
An Unsupervised Approach for Low-Quality Answer Detection in Community Question-Answering
Community Question Answering (CQA) sites such as Yahoo! Answers provide rich knowledge for people to access. However, the quality of answers posted to CQA sites often varies a lot from precise and useful ones to irrelevant and useless ones. Hence, automatic detection of low-quality answers will help the site managers efficiently organize the accumulated knowledge and provide high-quality conten...
Detecting high-quality posts in community question answering sites
Community question answering (CQA) has become a new paradigm for seeking and sharing information. In CQA sites, users can ask and answer questions, and provide feedback (e.g., by voting or commenting) to these questions/answers. In this article, we propose the early detection of high-quality CQA questions/answers. Such detection can help discover a high-impact question that would be widely reco...
Towards Predicting the Best Answers in Community-based Question-Answering Services
Community-based question-answering (CQA) services contribute to solving many difficult questions we have. For each question in such services, one best answer can be designated, among all answers, often by the asker. However, many questions on typical CQA sites are left without a best answer even when good candidates are available. In this paper, we attempt to address the problem of predictin...