Top a Splitter: Using Distributional Semantics for Improving Compound Splitting
Authors
Abstract
We present a flexible method that rearranges the ranked output of compound splitters (i.e., decomposers of one-word compounds such as the German Kinderlied ‘children’s song’) using a distributional semantics model. In an experiment, we show that our re-ranker improves the quality of various compound splitters.
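The re-ranking idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy vectors, the function names (`cosine`, `semantic_score`, `rerank`), the averaging of constituent similarities, and the zero score for out-of-vocabulary parts are all assumptions made for illustration. It assumes only that a splitter returns a ranked list of candidate splits and that a distributional model maps words to vectors.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

def semantic_score(compound, parts, vectors):
    """Average similarity between the compound and each constituent.

    Assumption: a split containing a part unknown to the model scores 0.0.
    """
    if compound not in vectors or any(p not in vectors for p in parts):
        return 0.0
    return sum(cosine(vectors[compound], vectors[p]) for p in parts) / len(parts)

def rerank(compound, ranked_splits, vectors):
    """Reorder a splitter's ranked candidates by semantic score.

    sorted() is stable, so candidates with equal scores keep the
    splitter's original order.
    """
    return sorted(
        ranked_splits,
        key=lambda parts: -semantic_score(compound, parts, vectors),
    )

# Toy vectors (hypothetical values, not taken from any trained model).
vectors = {
    "kinderlied": [0.9, 0.8, 0.1],
    "kinder": [0.8, 0.3, 0.0],
    "lied": [0.3, 0.9, 0.1],
    "kind": [0.7, 0.2, 0.1],
}

# Hypothetical splitter output, best candidate first.
candidates = [("kind", "erlied"), ("kinder", "lied")]
reranked = rerank("kinderlied", candidates, vectors)
print(reranked[0])
```

In this toy setup the spurious split containing the non-word "erlied" drops below the split into "kinder" and "lied", because only the latter has constituents the model knows and finds similar to the compound. A real system would likely combine the semantic score with the splitter's own score rather than replace it.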
Similar resources
Unsupervised Compound Splitting With Distributional Semantics Rivals Supervised Methods
In this paper we present a word decompounding method that is based on distributional semantics. Our method does not require any linguistic knowledge and is initialized using a large monolingual corpus. The core idea of our approach is that parts of compounds (like “candle” and “stick”) are semantically similar to the entire compound, which helps to exclude spurious splits (like “candles” and “t...
Splitting Compounds by Semantic Analogy
Compounding is a highly productive word-formation process in some languages that is often problematic for natural language processing applications. In this paper, we investigate whether distributional semantics in the form of word embeddings can enable a deeper, i.e., more knowledge-rich, processing of compounds than the standard string-based methods. We present an unsupervised approach that ex...
Improving search engine retrieval using a compound splitter for Swedish
In this paper we have investigated 128 high-frequency Swedish compound queries (6.2 per thousand) with no search results among 1.6 million searches carried out at nine public web sites containing altogether 100,000 web pages in Swedish. To these compound queries we added a compound splitter as a pre-processor and we found that after decompounding these queries they gave relevant results in 64 ...
All-Optical Reconfigurable-Tunable 1×N Power Splitter Using Soliton Breakup
In this paper, we numerically simulated a glass-based all-optical 1×N power splitter with eleven different configurations using soliton breakup in a nonlinear medium. It is shown that, in addition to the reconfigurability of the proposed splitter, its power splitting ratio is also tunable to some extent. Nonlinear semivectorial iterative finite difference beam propagation method (IFD-...
Semantic transparency: challenges for distributional semantics
Using data from Reddy et al. (2011), we present a series of regression models of semantic transparency in compound nouns. The results indicate that the frequencies of the compound constituents, the semantic relation between the constituents, and metaphorical shift of a constituent or of the compound as a whole, all contribute to the overall perceived level of transparency. While not proposing a...