Search results for: speech tagging
Number of results: 128613
Aspects of Chinese syntax result in a distinctive mix of parsing challenges. However, the contribution of individual sources of error to overall difficulty is not well understood. We conduct a comprehensive automatic analysis of error types made by Chinese parsers, covering a broad range of error types for large sets of sentences, enabling the first empirical ranking of Chinese error types by t...
We introduce a constituency parser based on a bi-LSTM encoder adapted from recent work (Cross and Huang, 2016b; Kiperwasser and Goldberg, 2016), which can incorporate a lower level character biLSTM (Ballesteros et al., 2015; Plank et al., 2016). We model two important interfaces of constituency parsing with auxiliary tasks supervised at the word level: (i) part-of-speech (POS) and morphological...
The Stuttgart-Tübingen Tag Set (STTS) (Schiller et al., 1995) has long been established as a quasi-standard for part-of-speech (POS) tagging of German. It has been used, with minor modifications, for the annotation of three German newspaper treebanks, the NEGRA treebank (Skut et al., 1997), the TiGer treebank (Brants et al., 2002) and the TüBa-D/Z (Telljohann et al., 2004). One major drawback, ...
Brill tagging is a classic rule-based algorithm for part-of-speech tagging within Natural Language Processing. However, implementation of the tagger is inherently slow on conventional Von Neumann architectures. In this paper, we accelerate the second stage of Brill tagging on the Micron Automata Processor, a new computing architecture that can perform massive pattern matching in parallel. The d...
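As a rough illustration of the two-stage structure mentioned in the abstract above, the following minimal sketch assumes the standard formulation of Brill tagging: a first stage assigns each word its most likely tag from a lexicon, and a second stage applies an ordered list of contextual transformation rules. The names `Rule`, `initial_tags`, `apply_rules`, the toy lexicon, and the single rule are all hypothetical, chosen for illustration only; this is plain sequential Python and is not the Automata Processor implementation the paper describes.

```python
from typing import NamedTuple


class Rule(NamedTuple):
    """Hypothetical contextual rule: rewrite from_tag to to_tag
    when the previous word is tagged prev_tag."""
    from_tag: str
    to_tag: str
    prev_tag: str


def initial_tags(words, lexicon, default="NN"):
    # Stage 1: look up each word's tag in the lexicon,
    # falling back to a default tag for unknown words.
    return [lexicon.get(w.lower(), default) for w in words]


def apply_rules(tags, rules):
    # Stage 2: apply each rule in order, scanning left to right.
    # In this sketch a change takes effect immediately, so later
    # positions see the updated tags.
    tags = list(tags)
    for rule in rules:
        for i in range(1, len(tags)):
            if tags[i] == rule.from_tag and tags[i - 1] == rule.prev_tag:
                tags[i] = rule.to_tag
    return tags


# Toy data, assumed for illustration only.
LEXICON = {"she": "PRP", "wants": "VBZ", "to": "TO", "race": "NN"}
RULES = [Rule(from_tag="NN", to_tag="VB", prev_tag="TO")]

words = "She wants to race".split()
stage1 = initial_tags(words, LEXICON)
stage2 = apply_rules(stage1, RULES)
print(stage1)
print(stage2)
```

The inner loop tests every rule against every position. That repeated pattern matching is the kind of work a parallel pattern-matching architecture could plausibly take over.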
Recent scientific publications on data-driven part-of-speech tagging of Sub-Saharan African languages have reported encouraging accuracy scores, using off-the-shelf tools and often fairly limited amounts of training data. Unfortunately, no research efforts exist that explore which types of linguistic features contribute to accurate part-of-speech tagging for the languages under investigation. Th...
Bidirectional Long Short-Term Memory Recurrent Neural Network (BLSTM-RNN) has been shown to be very effective for tagging sequential data, e.g. speech utterances or handwritten documents, while word embedding has been demonstrated to be a powerful representation for characterizing the statistical properties of natural language. In this study, we propose to use BLSTM-RNN with word embedding for part-of-sp...
A pilot study on inducing rules for part-of-speech tagging of unrestricted Swedish text is reported. Using the Progol machine-learning system, Constraint Grammar-inspired rules were learnt from the part-of-speech-tagged Stockholm-Umeå Corpus. Several thousand disambiguation rules discarding faulty readings of ambiguously tagged words were induced. When tested on unseen data, 97% of the words r...
This paper describes a method of detecting speech repairs that uses a part-of-speech tagger. The tagger is given knowledge about category transitions for speech repairs, and so is able to mark a transition either as a likely repair or as fluent speech. Other contextual clues, such as editing terms, word fragments, and word matchings, are also factored in by modifying the transition probabilities.
This paper describes a project tagging a spontaneous speech corpus with morphological information such as word segmentation and parts-of-speech. We use a morphological analysis system based on a maximum entropy model, which is independent of the domain of corpora. In this paper we show the tagging accuracy achieved by using the model and discuss problems in tagging the spontaneous speech corpus....