Probabilistic Language Modelling
Abstract
Language models assign probabilities to strings of symbols. Their interpretation is reviewed and applied to text classification. A language recogniser is constructed from Bayes' theorem and a simple bigram model. The recogniser gives near-perfect results on sentences of text, which motivates a mixture language model. Hidden Markov models (HMMs) are reviewed as a method of capturing order over different length scales and are used to construct a mixture model. This allows segmentation of text into unknown languages and the extraction of foreign words in known languages from English text. Future directions are discussed.
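As a rough illustration of the recogniser described in the abstract, the sketch below combines per-language bigram likelihoods with Bayes' theorem, P(language | text) ∝ P(text | language) P(language). It is not the paper's implementation: the character-level bigrams, the add-alpha smoothing, the uniform priors, the tiny training strings, and every name (`BigramModel`, `posterior`, `UNK`) are assumptions made for this example.

```python
import math
from collections import Counter

UNK = "\0"  # hypothetical placeholder for characters unseen in training

# Toy training text (assumed for illustration only; far too small for real use).
CORPORA = {
    "en": "the quick brown fox jumps over the lazy dog and the cat "
          "sits on the mat with the other animals",
    "fr": "le renard brun saute par dessus le chien paresseux et le chat "
          "est assis sur le tapis avec les autres animaux",
}
VOCAB = set("".join(CORPORA.values())) | {" ", UNK}


def normalise(text):
    """Lower-case, map out-of-vocabulary characters to UNK, pad with a space."""
    return " " + "".join(c if c in VOCAB else UNK for c in text.lower())


class BigramModel:
    """Character bigram model with add-alpha smoothing (an assumed choice)."""

    def __init__(self, text, alpha=1.0):
        padded = normalise(text)
        self.alpha = alpha
        self.pair_counts = Counter(zip(padded, padded[1:]))
        self.context_counts = Counter(padded[:-1])

    def log_prob(self, text):
        """log P(text | language) as a sum of smoothed bigram log-probabilities."""
        padded = normalise(text)
        total = 0.0
        for prev, cur in zip(padded, padded[1:]):
            num = self.pair_counts[(prev, cur)] + self.alpha
            den = self.context_counts[prev] + self.alpha * len(VOCAB)
            total += math.log(num / den)
        return total


def posterior(text, models, priors):
    """Bayes' theorem in log space, normalised with a log-sum-exp."""
    log_joint = {lang: math.log(priors[lang]) + model.log_prob(text)
                 for lang, model in models.items()}
    peak = max(log_joint.values())
    log_evidence = peak + math.log(
        sum(math.exp(v - peak) for v in log_joint.values()))
    return {lang: math.exp(v - log_evidence) for lang, v in log_joint.items()}


models = {lang: BigramModel(text) for lang, text in CORPORA.items()}
priors = {lang: 1.0 / len(models) for lang in models}  # uniform prior (assumed)

result = posterior("the mother and the other brother", models, priors)
print(max(result, key=result.get), result)
```

Working in log space keeps long strings from underflowing, and the smoothing keeps a single unseen bigram from forcing a language's posterior to zero.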
Similar resources
Context-dependent probabilistic hierarchical sublexical modelling using finite state transducers
This paper describes a unified architecture for integrating sub-lexical models with speech recognition, and a layered framework for context-dependent probabilistic hierarchical sublexical modelling. Previous work [1, 2, 3] has demonstrated the effectiveness of sub-lexical modelling using a core context-free grammar (CFG) augmented with context-dependent probabilistic models. Our major motivatio...
A Probabilistic Approach to Modelling Spatial Language with Its Application To Sensor Models
We examine why a probabilistic approach to modelling the various components of spatial language is the most practical for spatial algorithms in which they can be employed, and examine such models for prepositions such as 'between' and 'by'. We provide an example of such a probabilistic treatment by exploring a novel application of spatial models to the induction of the occupancy of an object in...
A hierarchical Dirichlet language model
We discuss a hierarchical probabilistic model whose predictions are similar to those of the popular language modelling procedure known as 'smoothing'. A number of interesting differences from smoothing emerge. The insights gained from a probabilistic view of this problem point towards new directions for language modelling. The ideas of this paper are also applicable to other problems such as th...
Modelling Probabilistic Inference Networks and Classification in Probabilistic Datalog
Probabilistic Graphical Models (PGMs) are a well-established approach for modelling uncertain knowledge and reasoning. Since we focus on inference, this paper explores Probabilistic Inference Networks (PINs), which are a special case of PGMs. PINs, commonly referred to as Bayesian Networks, are used in Information Retrieval to model tasks such as classification and ad-hoc retrieval. Intuitively, a ...
Experiences with Modelling Issues in Building Probabilistic Networks
Building a probabilistic network for a real-life application is a difficult and time-consuming task. Methodologies for building such a network, however, are still lacking. Also, literature on network-specific modelling issues is quite scarce. As we have developed a large probabilistic network for a complex medical domain, we have encountered and resolved numerous non-trivial modelling issues. S...
Acronym: QUASIMODO. Deliverable no.: D1.1. Title of Deliverable: Modelling Quantitative System Aspects
This deliverable describes the results of the QUASIMODO project on modelling quantitative system aspects. Keyword list: AADL, Arcade, architectural dependability evaluation, cost-bounded reachability, priced/weighted timed automata, probabilistic timed automata, probabilistic hybrid systems. ICT-FP7-STREP-214755 / QUASIMODO