Search results for: linguistic corpus
Number of results: 113027
A semi-automatic procedure of linguistic knowledge acquisition is proposed, which combines corpus-based techniques with the conventional rule-based approach. The rule-based component generates all the possible hypotheses of defects which the existing linguistic knowledge might contain, when it fails to parse a sentence. The rule-based component does not try to identify the defects, but generate...
In this paper, the process for turning a dependency-based corpus into a constituent-based one is explained. For this purpose, first both the Dependency and the Constituent formalisms are analyzed, and then the corresponding equivalences of linguistic phenomena are treated. This process has had different phases in which the linguistic equivalences have been improved. Finally, the evaluation process is...
• frequencies of occurrence of linguistic elements, which can be studied from two different perspectives:
  o how frequent are morphemes or words or patterns/constructions in (parts of) a corpus? This information can be provided in various different forms of frequency lists;
  o how evenly are morphemes or words or patterns/constructions distributed across (parts of) a corpus? This information can ...
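The two perspectives in the excerpt above (how frequent an element is, and how evenly it is spread across corpus parts) can be illustrated with a minimal Python sketch. The excerpt is truncated and does not name a dispersion measure, so the use of Gries's deviation of proportions (DP) here is an assumption, as are the function names `frequency_list` and `deviation_of_proportions` and the toy corpus. DP is 0 when a word is spread across the parts in proportion to their sizes and approaches 1 when it is concentrated in a small part.

```python
from collections import Counter


def frequency_list(parts):
    """Return (word, count) pairs over all corpus parts, most frequent first."""
    counts = Counter(token for part in parts for token in part)
    return counts.most_common()


def deviation_of_proportions(word, parts):
    """Gries's DP: half the summed absolute difference between the share of
    the word's occurrences found in each part and that part's share of the
    whole corpus. (Chosen for illustration; not named in the excerpt.)"""
    total_tokens = sum(len(part) for part in parts)
    word_total = sum(part.count(word) for part in parts)
    if total_tokens == 0 or word_total == 0:
        raise ValueError("word does not occur in the corpus")
    return 0.5 * sum(
        abs(part.count(word) / word_total - len(part) / total_tokens)
        for part in parts
    )


# Hypothetical toy corpus: three parts, already tokenized.
parts = [
    "the cat sat on the mat".split(),
    "the dog sat on the log".split(),
    "a cat and a dog".split(),
]

for word, count in frequency_list(parts)[:3]:
    print(word, count, round(deviation_of_proportions(word, parts), 3))
```

A frequency list alone would rank two words with equal counts identically; the dispersion value separates a word that occurs throughout the corpus from one that clusters in a single part.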
In this paper, we discuss our creation of a web corpus of spoken Hindi (COSH), one of the Indo-Aryan languages spoken mainly in the Indian subcontinent. We also point out notable problems we’ve encountered in the web corpus and the special concordancer. After observing the kind of technical problems we encountered, especially regarding annotation tagged by Shiva Reddy’s tagger, we argue how the...
In this paper, we propose a new method for extracting bilingual collocations from a parallel corpus to provide phrasal translation memories. The method integrates statistical and linguistic information to achieve effective extraction of bilingual collocations. The linguistic information includes parts of speech, chunks, and clauses. The method involves first obtaining an extended list of Englis...
Affects are expressed at different levels of speech: metalinguistic (expressiveness), linguistic (attitudes), both anchored in the “linguistic time”, and para-linguistic (emotional expressions), which is anchored in the timing of the emotional causes. In an experimental approach, corpora are the basis of analysis. Most emotional corpora have been produced by acting/eliciting speakers on one sid...
In this paper, we evaluate a set of linguistic rules for pronunciation variations in Singapore English. We collect and annotate a speech corpus for Singapore English and label it with IPA narrow transcriptions. Data-driven pronunciation rules are derived using American English (Buckeye corpus) as a reference. We compare the data-driven rules with linguistic rules proposed by phoneticians, and f...
We discuss the treatment of ellipsis in a spoken language route planning enquiry service which uses the Core Language Engine (CLE) as its linguistic processor. We show how use of the CLE allows us to separate the interpretation of ellipsis in a dialogue context from the more general issue of dialogue management in a dialogue context and, especially, to factor out the linguistic influences on suc...
The Web is a rich and diversified source of information. In this article, we propose to benefit from this richness to collect and analyze documents, with the aim of a relational indexation based on noun phrases. The proposed data processing chain includes a spider collecting data to build textual corpora, and a linguistic module analyzing text to extract information. Comparison of obtained corpus with ...
This paper presents a new linguistic resource for the study and computational processing of Portuguese. CINTIL DependencyBank PREMIUM is a corpus of Portuguese news text, accurately and manually annotated with a wide range of linguistic information (morpho-syntax, named entities, syntactic function and semantic roles), making it an invaluable resource, especially for the development and evaluation of...