Neural word embeddings with multiplicative feature interactions for tensor-based compositions
Authors
Abstract
Categorical compositional distributional models unify compositional formal semantic models and distributional models by composing phrases from vector representations with tensor-based methods. For such tensor-based compositions, Milajevs et al. (2014) showed that word vectors obtained from the continuous bag-of-words (CBOW) model are competitive with those from co-occurrence-based models. However, because CBOW word vectors are trained under the assumption of additive interactions between context words, the composition used during training does not match the tensor-based methods used to evaluate the actual compositions, such as pointwise multiplication and the tensor product of context vectors. In this work, we investigate whether word embeddings from extended CBOW models that use multiplication or the tensor product between context words, mirroring the actual composition methods, outperform those from the baseline CBOW model on composition tasks based on multiplication or tensor-based methods.
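For intuition, the sketch below contrasts three ways a CBOW-style model can aggregate its context window before predicting the target word: the standard additive (averaging) interaction, and the multiplicative and tensor-product interactions considered here. It is a minimal NumPy illustration with toy random vectors, not the authors' implementation; the vocabulary, dimensionality, and the omission of the output/softmax layer and training loop are simplifying assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
vocab = ["small", "cats", "chase", "mice"]
dim = 4
# Toy context (input) vectors standing in for the CBOW input embedding matrix.
V = {w: rng.normal(scale=0.5, size=dim) for w in vocab}

def aggregate(context, mode):
    """Combine the context word vectors into the hidden representation
    that a CBOW-style model would feed to its output (softmax) layer."""
    vecs = [V[w] for w in context]
    if mode == "additive":
        # Standard CBOW: average of the context vectors.
        return np.mean(vecs, axis=0)
    if mode == "multiplicative":
        # Extended variant: pointwise (Hadamard) product of the context vectors.
        h = np.ones(dim)
        for v in vecs:
            h = h * v
        return h
    if mode == "tensor":
        # Extended variant: tensor (outer) product of the context vectors.
        # The dimensionality grows as dim ** len(context), so the output layer
        # would need a matching parameterisation (omitted in this sketch).
        h = np.array([1.0])
        for v in vecs:
            h = np.outer(h, v).ravel()
        return h
    raise ValueError(f"unknown mode: {mode}")

context = ["small", "cats"]
for mode in ("additive", "multiplicative", "tensor"):
    print(f"{mode:15s} hidden shape = {aggregate(context, mode).shape}")
```

Note that the tensor-product aggregation changes the dimensionality of the hidden representation (here from 4 to 16), which is why any extended model of this kind would need a correspondingly adapted output layer.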
Similar resources
Evaluating Neural Word Representations in Tensor-Based Compositional Settings
We provide a comparative study between neural word representations and traditional vector spaces based on co-occurrence counts, in a number of compositional tasks. We use three different semantic spaces and implement seven tensor-based compositional models, which we then test (together with simpler additive and multiplicative approaches) in tasks involving verb disambiguation and sentence simila...
Max-Margin Tensor Neural Network for Chinese Word Segmentation
Recently, neural network models for natural language processing tasks have received increasing attention for their ability to alleviate the burden of manual feature engineering. In this paper, we propose a novel neural network model for Chinese word segmentation called the Max-Margin Tensor Neural Network (MMTNN). By exploiting tag embeddings and tensor-based transformation, MMTNN has the ability to ...
Convolutional Neural Network with Word Embeddings for Chinese Word Segmentation
The character-based sequence labeling framework is flexible and efficient for Chinese word segmentation (CWS). Recently, many character-based neural models have been applied to CWS. While they obtain good performance, they have two obvious weaknesses. The first is that they rely heavily on manually designed bigram features, i.e. they are not good at capturing n-gram features automatically. The secon...
Word Embeddings via Tensor Factorization
Most popular word embedding techniques involve implicit or explicit factorization of a word co-occurrence-based matrix into low-rank factors. In this paper, we aim to generalize this trend by using numerical methods to factor higher-order word co-occurrence-based arrays, or tensors. We present four word embeddings using tensor factorization and analyze their advantages and disadvantages. One of...
Clinical Abbreviation Disambiguation Using Neural Word Embeddings
This study examined the use of neural word embeddings for clinical abbreviation disambiguation, a special case of word sense disambiguation (WSD). We investigated three different methods for deriving word embeddings from a large unlabeled clinical corpus: one existing method called Surrounding based embedding feature (SBE), and two newly developed methods: Left-Right surrounding based embedding...