Search results for: wikipedia mining
Number of results: 92181
This work proposes and evaluates a novel approach to determine interesting categories for ranked lists using ν-SVM. We identify three characteristics (features), entropy, unlikability, and peculiarity, and show how to train a classifier on these features using a set of Wikipedia tables. The learned model is evaluated by relevance assessments obtained through a user study, reflecting the correctnes...
The reuse of clinical data in the research environment is becoming one of the important tasks in medical informatics. The automatic assignment of medical codes to pre-identified concepts is turning into a Sisyphean task. For the MedNLP task in NTCIR-12, a new approach to automatically enrich the dictionary using online data is proposed. We have developed a text-mining system able to tre...
Extracting the semantic relatedness of terms is an important topic in several areas, including data mining, information retrieval and web recommendation. This paper presents an approach for computing the semantic relatedness of terms in RDF graphs based on the notion of proximity. It proposes a formal definition of proximity in terms of the set of paths connecting two concept nodes, and an algorit...
This paper addresses the problem of the unsupervised classification of text-centric XML documents. In the context of the INEX mining track 2006, we present methods to exploit the inherent structural information of XML documents in the document clustering process. Using the k-means algorithm, we have experimented with a couple of feature sets, to discover that a promising direction is to use str...
This paper introduces a neural model for concept-to-text generation that scales to large, rich domains. It generates biographical sentences from fact tables on a new dataset of biographies from Wikipedia. This set is an order of magnitude larger than existing resources with over 700k samples and a 400k vocabulary. Our model builds on conditional neural language models for text generation. To de...
Community-generated text corpora can be a valuable resource to extract consumer health vocabulary (CHV) and link them to professional terminologies and alternative variants. In this research, we propose a pattern-based text-mining approach to identify pairs of CHV and professional terms from Wikipedia, a large text corpus created and maintained by the community. A novel measure, leveraging the ...
BACKGROUND: With the advent of Web 2.0 technologies, user-edited online resources such as Wikipedia are increasingly tapped for information. However, there is little research on the quality of health information found in Wikipedia. OBJECTIVE: To compare the scope, completeness, and accuracy of drug information in Wikipedia with that of a free, online, traditionally edited database (Medscape Dru...
Today’s education has been shaped by the rapid development of digital technologies and easy accessibility to a large number of electronic sources. This instigated a genuine need to change current teaching attitudes and practices. Wikipedia, as a multilingual online platform open to all, is a source of information that is used on a daily basis and therefore cannot be ignored in the process. The aim of this study is to explore the possibilitie...