نتایج جستجو برای: evaluation metrics
تعداد نتایج: 878773 فیلتر نتایج به سال:
This article outlines the evaluation protocol and provides the main results of the French Evaluation Campaign for Machine Translation Systems, CESTA. Following the initial objectives and evaluation plans, the evaluation metrics are briefly described: along with fluency and adequacy assessed by human judges, a number of recently proposed automated metrics are used. Two evaluation campaigns were ...
We present a comparative study on Machine Translation Evaluation according to two different criteria: Human Likeness and Human Acceptability. We provide empirical evidence that there is a relationship between these two kinds of evaluation: Human Likeness implies Human Acceptability but the reverse is not true. From the point of view of automatic evaluation this implies that metrics based on Hum...
String-based metrics of automatic machine translation (MT) evaluation are widely applied in MT research. Meanwhile, some linguistic motivated metrics have been suggested to improve the string-based metrics in sentencelevel evaluation. In this work, we attempt to change their original calculation units (granularities) of string-based metrics to generate new features. We then propose a powerful s...
A number of approaches to Automatic MT Evaluation based on deep linguistic knowledge have been suggested. However, n-gram based metrics are still today the dominant approach. The main reason is that the advantages of employing deeper linguistic information have not been clarified yet. In this work, we propose a novel approach for meta-evaluation of MT evaluation metrics, since correlation coffi...
In this paper an evaluation model for object-oriented (OO) metrics is proposed. We have evaluated the existing evaluation criteria for OO metrics, and based on the observations, a model is proposed which tries to cover most of the features for the evaluation of OO metrics. The model is validated by applying it to existing OO metrics. In contrast to the other existing criteria, the proposed mode...
We present two new metrics for evaluating generative models in the class-conditional image generation setting. These are obtained by generalizing most popular unconditional metrics: Inception Score (IS) and Fre'chet Distance (FID). A theoretical analysis shows motivation behind each proposed metric links novel to their counterparts. The link takes form of a product case IS or an upper bound FID...
نمودار تعداد نتایج جستجو در هر سال
با کلیک روی نمودار نتایج را به سال انتشار فیلتر کنید