Identifiability and Unmixing of Latent Parse Trees
Abstract
This paper explores unsupervised learning of parsing models along two directions. First, which models are identifiable from infinite data? We use a general technique for numerically checking identifiability based on the rank of a Jacobian matrix, and apply it to several standard constituency and dependency parsing models. Second, for identifiable models, how do we estimate the parameters efficiently? EM suffers from local optima, while recent work using spectral methods [1] cannot be directly applied since the topology of the parse tree varies across sentences. We develop a strategy, unmixing, which deals with this additional complexity for restricted classes of parsing models.
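The Jacobian-rank idea mentioned in the abstract can be illustrated with a minimal sketch. The toy model below (a two-component mixture of Bernoulli views with shared parameters), the function names, the evaluation point, and the tolerance are all assumptions chosen for illustration; none of them is one of the paper's parsing models or its actual procedure. The sketch maps parameters to the distribution over observations, approximates the Jacobian by central finite differences, and compares its rank with the number of parameters. Full column rank at a point indicates local identifiability there, up to discrete symmetries such as swapping the two components.

```python
import itertools
import numpy as np

NUM_PARAMS = 3  # theta = (pi, p0, p1) in the hypothetical toy model


def observed_distribution(theta, num_views):
    """Joint distribution over num_views binary observations.

    Toy model (an assumption, not from the paper): a hidden bit h picks
    component 0 with probability pi, and each view is then an independent
    Bernoulli(p_h) draw with parameters shared across views.
    """
    pi, p0, p1 = theta
    probs = []
    for x in itertools.product([0, 1], repeat=num_views):
        x = np.array(x)
        lik0 = np.prod(np.where(x == 1, p0, 1 - p0))
        lik1 = np.prod(np.where(x == 1, p1, 1 - p1))
        probs.append(pi * lik0 + (1 - pi) * lik1)
    return np.array(probs)


def numerical_jacobian(f, theta, eps=1e-6):
    """Central finite-difference Jacobian of f at theta (outputs x params)."""
    theta = np.asarray(theta, dtype=float)
    cols = []
    for i in range(theta.size):
        step = np.zeros_like(theta)
        step[i] = eps
        cols.append((f(theta + step) - f(theta - step)) / (2 * eps))
    return np.stack(cols, axis=1)


def jacobian_rank(num_views, theta=(0.3, 0.2, 0.7), tol=1e-6):
    """Numerical rank of the parameters-to-observations map at theta."""
    J = numerical_jacobian(
        lambda t: observed_distribution(t, num_views), theta
    )
    # Singular values below tol are treated as zero (finite-difference noise).
    return np.linalg.matrix_rank(J, tol=tol)


if __name__ == "__main__":
    for n in (2, 3):
        rank = jacobian_rank(n)
        verdict = "full rank" if rank == NUM_PARAMS else "rank-deficient"
        print(f"{n} views: rank {rank} of {NUM_PARAMS} parameters ({verdict})")
```

In this toy setting two views yield a rank-deficient Jacobian, so the three parameters cannot be recovered even from infinite data, while three views yield full column rank. A check of this kind is typically repeated at several randomly drawn parameter points rather than at a single fixed one.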
Similar resources
Parameter Identifiability Issues in a Latent Markov Model for Misclassified Binary Responses
Medical researchers may be interested in disease processes that are not directly observable. Imperfect diagnostic tests may be used repeatedly to monitor the condition of a patient in the absence of a gold standard. We consider parameter identifiability and estimability in a Markov model for alternating binary longitudinal responses that may be misclassified. Exactly ...
Latent SVMs for Semantic Role Labeling using LTAG Derivation Trees
A phrase structure parse tree for a sentence can be generated by many different Lexicalized Tree-Adjoining Grammar (LTAG) derivation trees. In this paper, we use multiple LTAG derivations as latent features for semantic role labeling (SRL). We hypothesize that positive and negative examples of individual semantic roles can be reliably distinguished by possibly different latent LTAG-based featur...
Two Languages are Better than One (for Syntactic Parsing)
We show that jointly parsing a bitext can substantially improve parse quality on both sides. In a maximum entropy bitext parsing model, we define a distribution over source trees, target trees, and node-to-node alignments between them. Features include monolingual parse scores and various measures of syntactic divergence. Using the translated portion of the Chinese treebank, our model is traine...
Do latent tree learning models identify meaningful structure in sentences?
Recent work on the problem of latent tree learning has made it possible to train neural networks that learn to both parse a sentence and use the resulting parse to interpret the sentence, all without exposure to ground-truth parse trees at training time. Surprisingly, these models often perform better at sentence understanding tasks than models that use parse trees from conventional parsers. This...
Hyperspectral Unmixing with Endmember Variability using Semi-supervised Partial Membership Latent Dirichlet Allocation
A semi-supervised Partial Membership Latent Dirichlet Allocation approach is developed for hyperspectral unmixing and endmember estimation while accounting for spectral variability and spatial information. Partial Membership Latent Dirichlet Allocation is an effective approach for spectral unmixing while representing spectral variability and leveraging spatial information. In this work, we exte...