نتایج جستجو برای: similarity score
تعداد نتایج: 325828 فیلتر نتایج به سال:
In text mining area, popular methods use the bagof-words models, which represent a document as a vector. These methods ignored the word sequence information, and the good clustering result limited to some special domains. This paper proposes a new similarity measure based on suffix tree model of text documents. It analyzes the word sequence information, and then computes the similarity between ...
Curriculum for school is generated based on the academic year. Because students have to study several subjects each and every year, the relative topics are put into curricula in discrete. In this study, we propose a method to construct a dynamic learning path which enables us to learn the relative topics continuously. In this process, we define two kinds of similarity score, inheritance score a...
In this paper, we present a novel document similarity measure based on the definition of a graph kernel between pairs of documents. The proposed measure takes into account both the terms contained in the documents and the relationships between them. By representing each document as a graph-of-words, we are able to model these relationships and then determine how similar two documents are by usi...
In this paper we propose an incremental hierarchical clustering algorithm for on-line event detection. This algorithm is applied to a set of newspaper articles in order to discover the structure of topics and events that they describe. In the first level, articles with a high temporal-semantic similarity are clustered together into events. In the next levels of the hierarchy, these events are s...
We propose a detection method for orthographic variants caused by transliteration in a large corpus. The method employs two similarities. One is string similarity based on edit distance. The other is contextual similarity by a vector space model. Experimental results show that the method performed a 0.889 F-measure in an open test.
Ontology is applied to various fields of computer as a conceptual modeling tool, and is used to organize information and manage knowledge. Ontology extension is used to add the new concepts and relationship into the existing ontology, which is a more complex task. In this paper, we propose a hybrid approach for ontology extension from text using semantic relatedness between words, which exploit...
Every day 645 million Twitter users generate approximately 58 million tweets. This motivates the question if it is possible to generate a summary of events from this rich set of tweets only. Key challenges in post summarization from microblog posts include circumnavigating spam and conversational posts. In this study, we present a novel technique called lexi-temporal clustering (LTC), which ide...
This paper makes a simple increment to state-ofthe-art in sarcasm detection research. Existing approaches are unable to capture subtle forms of context incongruity which lies at the heart of sarcasm. We explore if prior work can be enhanced using semantic similarity/discordance between word embeddings. We augment word embedding-based features to four feature sets reported in the past. We also e...
In this paper, we present the methodology and the results obtained by our teams, dubbed Blue Man Group, in the ASSIN (from the Portuguese Avaliação de Similaridade Semântica e Inferência Textual) competition, held at PROPOR 2016. Our team’s strategy consisted of evaluating methods based on semantic word vectors, following two distinct directions: 1) to make use of low-dimensional, compact, feat...
In this paper we describe our solution to the WSDM Cup 2017 Triple Scoring task. Our approach generates a relevance score based on the textual description of the triple’s subject and value (Object). It measures how similar (related) the text description of the subject is to the text description of its values. The generated similarity score can then be used to rank the multiple values associated...
نمودار تعداد نتایج جستجو در هر سال
با کلیک روی نمودار نتایج را به سال انتشار فیلتر کنید