Unifying Bayesian Inference and Vector Space Models for Improved Decipherment
نویسندگان
چکیده
We introduce into Bayesian decipherment a base distribution derived from similarities of word embeddings. We use Dirichlet multinomial regression (Mimno and McCallum, 2012) to learn a mapping between ciphertext and plaintext word embeddings from non-parallel data. Experimental results show that the base distribution is highly beneficial to decipherment, improving state-of-the-art decipherment accuracy from 45.8% to 67.4% for Spanish/English, and from 5.1% to 11.2% for Malagasy/English.
منابع مشابه
Bayesian Inference for Spatial Beta Generalized Linear Mixed Models
In some applications, the response variable assumes values in the unit interval. The standard linear regression model is not appropriate for modelling this type of data because the normality assumption is not met. Alternatively, the beta regression model has been introduced to analyze such observations. A beta distribution represents a flexible density family on (0, 1) interval that covers symm...
متن کاملLocation Reparameterization and Default Priors for Statistical Analysis
This paper develops default priors for Bayesian analysis that reproduce familiar frequentist and Bayesian analyses for models that are exponential or location. For the vector parameter case there is an information adjustment that avoids the Bayesian marginalization paradoxes and properly targets the prior on the parameter of interest thus adjusting for any complicating nonlinearity the details ...
متن کاملCost Analysis of Acceptance Sampling Models Using Dynamic Programming and Bayesian Inference Considering Inspection Errors
Acceptance Sampling models have been widely applied in companies for the inspection and testing the raw material as well as the final products. A number of lots of the items are produced in a day in the industries so it may be impossible to inspect/test each item in a lot. The acceptance sampling models only provide the guarantee for the producer and consumer that the items in the lots are acco...
متن کاملBayesian Inference for Zodiac and Other Homophonic Ciphers
We introduce a novel Bayesian approach for deciphering complex substitution ciphers. Our method uses a decipherment model which combines information from letter n-gram language models as well as word dictionaries. Bayesian inference is performed on our model using an efficient sampling technique. We evaluate the quality of the Bayesian decipherment output on simple and homophonic letter substit...
متن کاملExpectation Propagation in Gaussian Process Dynamical Systems
Rich and complex time-series data, such as those generated from engineering systems, financial markets, videos, or neural recordings are now a common feature of modern data analysis. Explaining the phenomena underlying these diverse data sets requires flexible and accurate models. In this paper, we promote Gaussian process dynamical systems as a rich model class that is appropriate for such an ...
متن کامل