A Unified Gradient-Descent/Clustering Architecture for Finite State Machine Induction
Authors
Abstract
Although recurrent neural nets have been moderately successful in learning to emulate finite-state machines (FSMs), the continuous internal state dynamics of a neural net are not well matched to the discrete behavior of an FSM. We describe an architecture, called DOLCE, that allows discrete states to evolve in a net as learning progresses. DOLCE consists of a standard recurrent neural net trained by gradient descent and an adaptive clustering technique that quantizes the state space. DOLCE is based on the assumption that a finite set of discrete internal states is required for the task, and that the actual network state belongs to this set but has been corrupted by noise due to inaccuracy in the weights. DOLCE learns to recover the discrete state with maximum a posteriori probability from the noisy state. Simulations show that DOLCE leads to a significant improvement in generalization performance over earlier neural net approaches to FSM induction.
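As a concrete illustration of the state-quantization step described above, the following minimal Python sketch shows how a noisy recurrent-net hidden state could be snapped to the discrete state with maximum a posteriori probability under an isotropic Gaussian noise model. This is not the authors' implementation: the cluster centres, priors, and noise level below are hypothetical stand-ins for quantities that DOLCE estimates adaptively during training.

import numpy as np

def map_quantize(h, centres, priors, sigma):
    """Return the discrete state (cluster centre) with maximum a posteriori
    probability given a noisy hidden state h, assuming h = mu_k + Gaussian noise."""
    # Log-posterior of state k: log pi_k - ||h - mu_k||^2 / (2 sigma^2),
    # dropping terms that are constant across states.
    sq_dists = np.sum((centres - h) ** 2, axis=1)
    log_post = np.log(priors) - sq_dists / (2.0 * sigma ** 2)
    k = int(np.argmax(log_post))
    return centres[k], k

# Hypothetical example: two discrete states in a 2-D hidden-state space.
centres = np.array([[0.9, 0.1],
                    [0.1, 0.9]])
priors = np.array([0.5, 0.5])
h_noisy = np.array([0.8, 0.2])          # hidden state corrupted by weight noise
h_clean, state = map_quantize(h_noisy, centres, priors, sigma=0.25)
print(state, h_clean)                   # -> 0 [0.9 0.1]

In the architecture sketched in the abstract, a quantization of this kind would presumably be applied to the hidden state between time steps, so that the recurrent dynamics are driven by discrete states while gradient descent continues to adjust the weights.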
Similar Papers
Dynamic On-line Clustering and State Extraction: An Approach to Symbolic Learning
Although recurrent neural nets have been moderately successful in learning to emulate finite-state machines (FSMs), the continuous internal state dynamics of a neural net are not well matched to the discrete behavior of an FSM. We describe an architecture, called DOLCE, that allows discrete states to evolve in a net as learning progresses. DOLCE consists of a standard recurrent neural net train...
Unifying the Stochastic Spectral Descent for Restricted Boltzmann Machines with Bernoulli or Gaussian Inputs
Stochastic gradient descent-based algorithms are typically used as the general optimization tools for most deep learning models. A Restricted Boltzmann Machine (RBM) is a probabilistic generative model that can be stacked to construct deep architectures. For RBMs with Bernoulli inputs, non-Euclidean algorithms such as stochastic spectral descent (SSD) have been specifically designed to speed up th...
Learning Finite-State Controllers for Partially Observable Environments
Reactive (memoryless) policies are sufficient in completely observable Markov decision processes (MDPs), but some kind of memory is usually necessary for optimal control of a partially observable MDP. Policies with finite memory can be represented as finite-state automata. In this paper, we extend Baird and Moore’s VAPS algorithm to the problem of learning general finite-state automata. Because...
Stochastic Approximation and Least-Squares Regression, with Applications to Machine Learning. (Approximation Stochastique et Régression par Moindres Carrés : Applications en Apprentissage Automatique)
Many problems in machine learning are naturally cast as the minimization of a smooth function defined on a Euclidean space. For supervised learning, this includes least-squares regression and logistic regression. While small-scale problems with few input features may be solved efficiently by many optimization algorithms (e.g., Newton’s method), large-scale problems with many high-dimensional fe...