Empirical Comparison of Evaluation Methods for Unsupervised Learning of Morphology
Authors
Abstract
Unsupervised and semi-supervised learning of morphology provide practical solutions for processing morphologically rich languages with less human labor than traditional rule-based analyzers require. Direct evaluation of the learning methods against linguistic reference analyses is important for their development, because evaluation through the final applications is often time-consuming. However, even linguistic evaluation is not straightforward for full morphological analysis, because the morpheme labels generated by the learning method can be arbitrary. We review the previous evaluation methods for the learning tasks and propose new variations. To compare the methods, we perform an extensive meta-evaluation using the large collection of results from the Morpho Challenge competitions.
Journal: TAL
Volume: 52
Publication date: 2011