Automatic Analysis of Hungarian Texts and Linguistic Data

Author

  • Ferenc Papp
Abstract

1. First of all I would like to give an account of the practical experience gained in the course of processing the roughly 60,000 entries of a Hungarian unilingual (explanatory) dictionary (A magyar nyelv értelmező szótára, vols. I-VII, 1959-1962). In this case by "text" we mean this non-natural corpus, that is, the sum total of the entries of the dictionary; and by linguistic data the information given below.


Similar resources

magyarlanc: A Tool for Morphological and Dependency Parsing of Hungarian

Hungarian is the stereotype of morphologically rich and free word order languages. Here, we introduce magyarlanc, a natural language toolkit developed for the linguistic preprocessing – segmentation, morphological analysis, POS-tagging and dependency parsing – of Hungarian texts. We hope that the free availability of the toolkit fosters the research not just on the Hungarian language but on all...


Universal Morphology for Old Hungarian

This paper provides a description of the automatic conversion of the morphologically annotated part of the Old Hungarian Corpus. These texts are in the format of the Humor analyzer, which does not follow any international standards. Since standardization always facilitates future research, even for researchers who do not know the Old Hungarian language, we opted for mapping the Humor formalism ...


Referential Cohesion in Hungarian: A Developmental Study

Discursive functions are shared across all languages, but each language uses different linguistic means to appropriately establish referential cohesion. Children’s mastery of this cohesion in narrative texts develops gradually and is influenced by development in syntax. Consequently, speakers can employ different strategies, and among the various structural configurations of arguments, some are...


Applying Multi-Dimensional Analysis to a Russian Webcorpus: Searching for Evidence of Genres

The paper presents an application of Multidimensional (MD) analysis initially developed for the analysis of register variation in English (Biber, 1988) to the investigation of a genre diverse corpus, which was built from modern texts of the Russian Web. The analysis is based on the idea that each linguistic feature has different frequencies in different registers, and statistically stable co-oc...


Automatic Word Clustering in Studying Semantic Structure of Texts

The purpose of the study is to prove that the results of automatic word clustering (AWC) may contribute much to investigating the semantic structure of texts and to evaluating plot complexity. Experiments were carried out on Russian texts, mainly stories and short novels. Data obtained in the course of the study made it possible to formulate and verify several linguistic hypotheses.




Publication date: 1973