Search results for: word formation (word building)

Number of results: 440903

Journal: Procesamiento del Lenguaje Natural 2013
Andres Duque Fernandez Lourdes Araujo Juan Martinez-Romo

In this paper we present preliminary results obtained by the application of a new technique for building semantic graphs to the task of cross-lingual word sense disambiguation. Through the use of this unsupervised technique, we induce the senses associated with the translations of the ambiguous word in the target language. For this purpose, we use the translation of the words in the context of ...

Journal: CoRR 2017
Syed Sarfaraz Akhtar Arihant Gupta Avijit Vajpayee Arjit Srivastava Manish Shrivastava

We present a language-independent, unsupervised method for building word embeddings using morphological expansion of text. Our model handles the problem of data sparsity and yields improved word embeddings by training them on artificially generated sentences. We evaluate our method using small-sized training sets on eleven test sets for the word similarity task across seve...

2017
Zhe Zhao Tao Liu Shen Li Bofang Li Xiaoyong Du

The existing word representation methods mostly limit their information source to word co-occurrence statistics. In this paper, we introduce ngrams into four representation methods: SGNS, GloVe, PPMI matrix, and its SVD factorization. Comprehensive experiments are conducted on word analogy and similarity tasks. The results show that improved word representations are learned from ngram cooccurre...

Journal: CoRR 2013
Eduardo G. Altmann Zakary L. Whichard Adilson E. Motter

The word-stock of a language is a complex dynamical system in which words can be created, evolve, and become extinct. Even more dynamic are the short-term fluctuations in word usage by individuals in a population. Building on the recent demonstration that word niche is a strong determinant of future rise or fall in word frequency, here we introduce a model that allows us to distinguish persiste...

2001
Cornelis Hoede L. Zhang

This is the third paper in a series on natural language processing in terms of knowledge graphs. A word is a basic unit in natural language processing, which is why we study word graphs. Word graphs were already built for prepositions and adwords (including adjectives, adverbs and Chinese quantity words) in two other papers [2], [3]. In this paper, we propose the concept of the logic word and cla...

2004

In word sense disambiguation, a system attempts to determine the sense of a word from contextual features. Major barriers to building a high-performing word sense disambiguation system include the difficulty of labeling data for this task and of predicting fine-grained sense distinctions. In contrast, we can use parallel language corpora as a large supply of potential data. In this paper we presen...

2014
Xiaodong Zeng Lidia S. Chao Derek F. Wong Isabel Trancoso Liang Tian

This study investigates building a better Chinese word segmentation model for statistical machine translation. It aims to leverage word boundary information, automatically learned from bilingual character-based alignments, to induce a preferable segmentation model. We propose treating the induced word boundaries as soft constraints to bias the continuous learning of a supervised CRFs mod...

2013
Amir Hazem Emmanuel Morin

Methods for bilingual lexicon extraction from comparable corpora are often based on observing word co-occurrence and are inherently more effective when using large corpora. In most cases, specialized comparable corpora are small, and this has a direct impact on bilingual terminology extraction results. In order to overcome insufficient data coverage and to make ...

2011
Shirley N. Dita Rachel E. O. Roxas

This paper presents the work done so far on building an online corpus for Philippine languages. The Philippine Languages Online Corpora (PLOC) currently comprises a 250,000-word written corpus of the eight major languages in the archipelago. Some of the issues confronting corpus building and future directions for the project are also discussed.

2010
Baiju Mahananda C. M. S. Raju Ramalinga Reddy Patil Narayana Jha Shrinivasa Varakhedi Prahallad Kishore

This paper describes the work done in building a prototype text-to-speech system for Sanskrit. A basic prototype text-to-speech system is built using a simplified Sanskrit phone set and a unit selection technique, in which prerecorded sub-word units are concatenated to synthesize a sentence. We also discuss the issues involved in building a full-fledged text-to-speech system for Sanskrit.
