Sequential Decision Problems
Abstract
To this point in the class, we have studied stateless games in which the payoff matrix is known. We have explored these games both as stage (single-shot) games and as repeated games. In this portion of the class, we will continue to study repeated games, but we will assume as little about the problems as possible, and we will look at a class of algorithms that allow us to represent states. To be precise, we will assume that we do not know the game matrix, we do not know anything about the strategy of the other agent, and we do not know what strategy we should apply to do well in the repeated game. Within these constraints, we will explore a class of reinforcement learning algorithms that attempt to learn various parameters of the repeated game. The purpose of this tutorial is to introduce the concepts needed to understand reinforcement learning algorithms.
Similar resources
Convergence in a sequential two stages decision making process
We analyze a sequential decision making process, in which at each step the decision is made in two stages. In the first stage a partially optimal action is chosen, which allows the decision maker to learn how to improve it under the new environment. We show how inertia (cost of changing) may lead the process to converge to a routine where no further changes are made. We illustrate our scheme with some...
A Review of Representation Issues and Modeling Challenges with Influence Diagrams
Since their introduction in the mid 1970s, influence diagrams have become a de facto standard for representing Bayesian decision problems. The need to represent complex problems has led to extensions of the influence diagram methodology designed to increase the ability to represent complex problems. In this paper, we review the representation issues and modeling challenges associated with influ...
Solving Sequential Decision-making Problems under Virtual Reality Simulation System
A large class of problems of sequential decision-making can be modeled as Markov or Semi-Markov Decision Problems, which can be solved by classical methods of dynamic programming. However, the computational complexity of the classical MDP algorithms, such as value iteration and policy iteration, is prohibitive and will grow intractably with the size of problems. Furthermore, they require for ea...
Sequential Valuation Networks: A New Graphical Technique for Asymmetric Decision Problems
This paper deals with representation and solution of asymmetric decision problems. We describe a new graphical representation called sequential valuation networks, which is a hybrid of Covaliu and Oliver’s sequential decision diagrams and Shenoy’s asymmetric valuation networks. Sequential valuation networks inherit many of the strengths of sequential decision diagrams and asymmetric valuation n...
Sequential decision problems, dependently-typed solutions
We propose a dependently typed formalization for a simple class of sequential decision problems. For this class of problems, we implement a generic version of Bellman’s backwards induction algorithm [2] and a machine checkable proof that the proposed implementation is correct. The formalization is generic. It is presented in Idris, but it can be easily translated to other dependently-typed prog...
Adaptive Concentration Inequalities for Sequential Decision Problems
A key challenge in sequential decision problems is to determine how many samples are needed for an agent to make reliable decisions with good probabilistic guarantees. We introduce Hoeffding-like concentration inequalities that hold for a random, adaptively chosen number of samples. Our inequalities are tight under natural assumptions and can greatly simplify the analysis of common sequential d...