Search results for: n grams
Number of results: 982486
This report presents the work carried out at NLE Lab for the QA@CLEF-2009 competition. We used the JIRS passage retrieval system, which is based on redundancy, with the assumption that it is possible to find the response to a question in a large enough document collection. The retrieved passages are ranked depending on the number, length and position of the question n-gram structures found in ...
When messages may be intercepted because they contain certain words, terrorists and criminals may replace such words by other words or locutions. If the replacement words have different frequencies from the original words, techniques to detect the substitution are known. In this paper, we consider ways to detect replacements that have similar frequencies to the original words. We consider the f...
Automatic methods for MT evaluation are often based on the assumption that MT quality is related to some kind of distance between the evaluated text and a professional human translation (e.g., an edit distance or the precision of matched N-grams). However, independently produced human translations are necessarily different, conveying the same content by dissimilar means. Such legitimate transla...
We present a model to perform authorship attribution of tweets using Convolutional Neural Networks (CNNs) over character n-grams. We also present a strategy that improves model interpretability by estimating the importance of input text fragments in the predicted classification. The experimental evaluation shows that text CNNs perform competitively and are able to outperform previous methods.
This article addresses the problem of generating good example contexts to help children learn vocabulary. We describe VEGEMATIC, a system that constructs such contexts by concatenating overlapping five-grams from Google's N-gram corpus. We propose and operationalize a set of constraints to identify good contexts. VEGEMATIC uses these constraints to filter, cluster, score, and select example con...
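The idea of concatenating overlapping five-grams can be illustrated with a minimal Python sketch. It assumes that two five-grams are joined when the last four words of the current context match the first four words of the next five-gram; the abstract does not state the amount of overlap, and `extend_context` is a hypothetical helper, not part of VEGEMATIC.

```python
def extend_context(context, five_grams, overlap=4):
    """Return every context obtained by appending one overlapping five-gram.

    Assumption: a five-gram may be appended when its first `overlap` words
    equal the last `overlap` words of the current context.
    """
    tail = tuple(context[-overlap:])
    return [
        list(context) + list(gram[overlap:])
        for gram in five_grams
        if tuple(gram[:overlap]) == tail
    ]


# Toy data standing in for entries of an n-gram corpus.
toy_five_grams = [
    ("the", "cat", "sat", "on", "the"),
    ("cat", "sat", "on", "the", "mat"),
    ("cat", "sat", "on", "the", "sofa"),
    ("a", "dog", "ran", "to", "me"),
]
candidates = extend_context(toy_five_grams[0], toy_five_grams)
```

Each candidate is one word longer than the starting context; repeating the step grows longer contexts, which would then be filtered and scored.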
Social tagging systems allow people to classify resources by using a set of freely chosen terms named tags. However, by shifting the classification task from a set of experts to a larger, untrained set of people, the results of the classification are not accurate. The lack of control and guidelines generates noisy tags (i.e. tags without a clear semantics) which deteriorate the precision o...
We give an algorithm for disambiguating generic versus referential uses of second-person pronouns in restaurant reviews in Chinese. Reviews in this domain use the ‘you’ pronoun 你 either generically or to refer to shopkeepers, readers, or for self-reference in reported conversation. We first show that linguistic features of the local context (drawn from prior literature) help in disambiguation. We ...
This paper describes our approach to the Author Identification task in the PAN 2013 evaluation lab. We use a profile-based approach with the common n-grams (CNG) method, which employs a normalized distance measure for short and unbalanced text introduced by Stamatatos [6]. We achieved 9th place with an overall F1 score of 0.6.
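A profile-based comparison of this kind can be sketched in Python as follows. The sketch uses the classic CNG relative-difference dissimilarity over character n-gram profiles; the abstract does not give the exact variant used by the authors, so the formula, the profile size, and the function names `char_ngram_profile` and `cng_dissimilarity` are illustrative assumptions.

```python
from collections import Counter


def char_ngram_profile(text, n=3, size=500):
    """Map the `size` most frequent character n-grams to relative frequencies."""
    grams = [text[i:i + n] for i in range(len(text) - n + 1)]
    if not grams:
        return {}
    counts = Counter(grams)
    total = len(grams)
    return {gram: count / total for gram, count in counts.most_common(size)}


def cng_dissimilarity(profile_a, profile_b):
    """Sum of squared relative frequency differences over the union of profiles.

    Assumption: this is the classic CNG formulation; variants designed for
    short or unbalanced texts restrict or reweight the sum differently.
    """
    distance = 0.0
    for gram in set(profile_a) | set(profile_b):
        fa = profile_a.get(gram, 0.0)
        fb = profile_b.get(gram, 0.0)
        distance += ((fa - fb) / ((fa + fb) / 2)) ** 2
    return distance


known = char_ngram_profile("the quick brown fox jumps over the lazy dog")
unknown = char_ngram_profile("the quick brown fox naps under the lazy dog")
score = cng_dissimilarity(known, unknown)
```

An unknown text would be attributed to the candidate author whose profile yields the smallest dissimilarity.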
Therapist language plays a critical role in influencing the overall quality of psychotherapy. Notably, it is a major contributor to the perceived level of empathy expressed by therapists, a primary measure for judging their efficacy. We explore psycholinguistics inspired features for predicting therapist empathy. These features model language which conveys information about affective and cognit...
Replacement and removal filtering with help of subject-matter experts; contiguous word n-gram generation; normalization of numeric values; calculation of the likelihood ratios of the n-grams that appear in the test case; deciding on the class based on the informative n-grams.
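The steps listed in this result can be illustrated with a minimal Python sketch for two classes. Everything beyond the step names is an assumption: whitespace tokenization, the `<num>` placeholder for numeric values, add-one smoothing, the threshold that defines an "informative" n-gram, and all function names. The expert-driven replacement and removal filtering step is not modeled.

```python
import math
import re
from collections import Counter

NUM = "<num>"  # hypothetical placeholder for normalized numeric values


def tokenize(text):
    """Lowercase, split on whitespace, and normalize numeric tokens."""
    tokens = text.lower().split()
    return [NUM if re.fullmatch(r"\d+([.,]\d+)*", t) else t for t in tokens]


def word_ngrams(tokens, n=2):
    """Generate contiguous word n-grams."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def train(docs_by_class, n=2):
    """Count n-grams per class."""
    counts = {label: Counter() for label in docs_by_class}
    for label, docs in docs_by_class.items():
        for doc in docs:
            counts[label].update(word_ngrams(tokenize(doc), n))
    return counts


def classify(text, counts, pos, neg, n=2, min_abs_llr=0.5):
    """Sum log likelihood ratios of informative n-grams found in the test case.

    Assumption: an n-gram is informative when |log ratio| >= min_abs_llr.
    Ties (a total of zero) fall back to `neg`.
    """
    vocab_size = len(set(counts[pos]) | set(counts[neg]))
    total_pos = sum(counts[pos].values())
    total_neg = sum(counts[neg].values())
    score = 0.0
    for gram in word_ngrams(tokenize(text), n):
        # Add-one smoothing keeps unseen n-grams from producing zero probabilities.
        p_pos = (counts[pos][gram] + 1) / (total_pos + vocab_size)
        p_neg = (counts[neg][gram] + 1) / (total_neg + vocab_size)
        llr = math.log(p_pos / p_neg)
        if abs(llr) >= min_abs_llr:
            score += llr
    return pos if score > 0 else neg


# Toy training data, invented for illustration only.
toy_docs = {
    "spam": ["win 100 dollars now", "win 5 dollars now"],
    "ham": ["see you at lunch", "see you at 5"],
}
toy_counts = train(toy_docs)
label = classify("win 20 dollars now", toy_counts, pos="spam", neg="ham")
```

Normalizing numbers before n-gram generation lets "win 100 dollars" and "win 20 dollars" share the same n-grams, which is why the unseen amount in the test case still matches the training counts.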