morphosyntactic features

Search results for: morphosyntactic features

Number of results: 524361

FrameBank: A Database of Russian Lexical Constructions

2015

Olga Lyashevskaya, Egor Kashkin

Russian FrameBank is a bank of annotated samples from the Russian National Corpus which documents the use of lexical constructions (e.g. argument constructions of verbs and nouns). FrameBank belongs to FrameNet-oriented resources, but unlike Berkeley FrameNet it focuses more on the morphosyntactic and semantic features of individual lexemes rather than the generalized frames, following the theor...

The Universal Dependencies Treebank for Slovenian

2017

Kaja Dobrovoljc, Tomaz Erjavec, Simon Krek

This paper introduces the Universal Dependencies Treebank for Slovenian. We overview the existing dependency treebanks for Slovenian and then detail the conversion of the ssj200k treebank to the framework of Universal Dependencies version 2. We explain the mapping of part-of-speech categories, morphosyntactic features, and the dependency relations, focusing on the more problematic language-spec...

Application of Different Techniques to Dependency Parsing of Basque

2010

Kepa Bengoetxea, Koldo Gojenola

We present a set of experiments on dependency parsing of the Basque Dependency Treebank (BDT). The present work has examined several directions that try to explore the rich set of morphosyntactic features in the BDT: i) experimenting with the impact of morphological features, ii) application of dependency tree transformations, iii) application of a two-stage parsing scheme (stacking), and iv) combin...

On lemmatization in Arabic,

2001

Joseph Dichy

This work is a ‘prospective extension’ of the lexical work achieved in the DIINAR-MBC Euro-Mediterranean project. It aims at contributing to the crucial issue in the field of Arabic NLP of the operations involved in lemmatization, which are necessarily based on a definition of the Arabic entries of a monolingual or multilingual lexical database. As shown in previous work, lexical entries can be...

Incorporating Information Status into Generation Ranking

2009

Aoife Cahill, Arndt Riester

We investigate the influence of information status (IS) on constituent order in German, and integrate our findings into a loglinear surface realisation ranking model. We show that the distribution of pairs of IS categories is strongly asymmetric. Moreover, each category is correlated with morphosyntactic features, which can be automatically detected. We build a loglinear model that incorporates...

Annotating Complex Linguistic Features in Bilingual Corpora: The Case of MULTINOT

2017

Julia Lavid

In spite of the current need in the computational community for digital corpora in different languages with complex linguistic annotations going beyond morphosyntactic features, there is not much work within the Digital Humanities community dedicated to this task. In this paper I describe recent work on the development of a bilingual (English-Spanish) corpus consisting of original comparable an...

Private or Corporate? Predicting User Types on Twitter

2016

Nikola Ljubesic, Darja Fiser

In this paper we present a series of experiments on discriminating between private and corporate accounts on Twitter. We define features based on Twitter metadata, morphosyntactic tags and surface forms, showing that the simple bag-of-words model achieves single best results that can, however, be improved by building a weighted soft ensemble of classifiers based on each feature type. Investigat...

Harnessing the CRF Complexity with Domain-Specific Constraints. The Case of Morphosyntactic Tagging of a Highly Inflected Language

2012

Jakub Waszczuk

We describe a domain-specific method of adapting conditional random fields (CRFs) to morphosyntactic tagging of highly-inflectional languages. The solution involves extending CRFs with additional, position-wise restrictions on the output domain, which are used to impose consistency between the modeled label sequences and morphosyntactic analysis results both at the level of decoding and, more i...

An Open Source Tool for Partial Parsing and Morphosyntactic Disambiguation

2007

Adam Przepiórkowski, Aleksander Buczyński

This article presents a formalism and an open source implementation of a new tool for simultaneous partial parsing and morphosyntactic disambiguation and correction. We argue that, contrary to the common pipeline approach, where morphosyntactic tagging is fully accomplished before shallow or partial parsing, both tasks are best approached in parallel. This has been suggested before, and formali...

MULTEXT-East Version 4: Multilingual Morphosyntactic Specifications, Lexicons and Corpora

2010

Tomaz Erjavec

The paper presents the fourth, “Mondilex” edition of the MULTEXT-East language resources, a multilingual dataset for language engineering research and development, focused on the morphosyntactic level of linguistic description. This standardised and linked set of resources covers a large number of mainly Central and Eastern European languages and includes the EAGLES-based morphosyntactic specif...
