Embedding a state space model into a Markov decision process
Authors
Abstract
In agriculture, Markov decision processes (MDPs) with finite state and action spaces are often used to model sequential decision making over time. For instance, states in the process represent possible levels of traits of the animal, and transition probabilities are based on biological models estimated from data collected from the animal or herd. State space models (SSMs) are a general tool for modelling repeated measurements over time in which the model parameters can evolve dynamically. In this paper we consider methods for embedding a linear normal SSM into an MDP with finite state and action space. Different ways of discretizing an SSM are discussed, and methods for reducing the state space of the MDP are presented. An example from dairy production is given.
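To make the discretization step concrete, below is a minimal sketch in Python of how a scalar linear Gaussian state equation x' = a*x + b*u + w, with w ~ N(0, sigma^2), can be turned into per-action MDP transition matrices over a grid of intervals. The function name, grid, and parameter values are illustrative assumptions, not the paper's actual model or algorithm.

import numpy as np
from scipy.stats import norm

def discretize_ssm(a, b, sigma, edges, actions):
    # States are the intervals defined by `edges`; each interval is
    # represented by its midpoint.
    mids = 0.5 * (edges[:-1] + edges[1:])
    n = len(mids)
    P = np.zeros((len(actions), n, n))
    for k, u in enumerate(actions):
        for i, x in enumerate(mids):
            mean = a * x + b * u
            cdf = norm.cdf(edges, loc=mean, scale=sigma)
            probs = np.diff(cdf)
            # put probability mass falling outside the grid into the
            # boundary states so each row sums to one
            probs[0] += cdf[0]
            probs[-1] += 1.0 - cdf[-1]
            P[k, i, :] = probs
    return mids, P

# usage: 10 equally spaced state classes and two illustrative actions
edges = np.linspace(-3.0, 3.0, 11)
mids, P = discretize_ssm(a=0.9, b=0.5, sigma=0.4, edges=edges, actions=[0.0, 1.0])
assert np.allclose(P.sum(axis=2), 1.0)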
Similar papers
Accelerated decomposition techniques for large discounted Markov decision processes
Many hierarchical techniques to solve large Markov decision processes (MDPs) are based on the partition of the state space into strongly connected components (SCCs) that can be classified into some levels. In each level, smaller problems named restricted MDPs are solved, and then these partial solutions are combined to obtain the global solution. In this paper, we first propose a novel algorith...
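As a rough illustration of the level structure described in that snippet, the sketch below groups an MDP's states into strongly connected components and orders the components so that each can be solved before the components that feed into it. It assumes networkx is available; the helper name and toy reachability structure are illustrative, not the cited paper's algorithm.

import networkx as nx

def scc_levels(transitions):
    # `transitions` maps state -> iterable of successor states reachable
    # under any action.
    G = nx.DiGraph()
    for s, succs in transitions.items():
        G.add_node(s)
        for t in succs:
            G.add_edge(s, t)
    # condensation collapses each SCC into one node; the result is a DAG
    C = nx.condensation(G)
    order = list(nx.topological_sort(C))
    # process components in reverse topological order: successors first
    return [sorted(C.nodes[c]["members"]) for c in reversed(order)]

# toy example: states 0-1 form one SCC that feeds into the SCC 2-3
levels = scc_levels({0: [1, 2], 1: [0], 2: [3], 3: [2]})
print(levels)  # [[2, 3], [0, 1]]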
Transfer Learning Across Patient Variations with Hidden Parameter Markov Decision Processes
Due to physiological variation, patients diagnosed with the same condition may exhibit divergent, but related, responses to the same treatments. Hidden Parameter Markov Decision Processes (HiP-MDPs) tackle this transfer-learning problem by embedding these tasks into a low-dimensional space. However, the original formulation of HiP-MDP had a critical flaw: the embedding uncertainty was modelled ...
A Model-Checking Approach to Decision-Theoretic Planning with Non-Markovian Rewards
A popular approach to solving a decision process with non-Markovian rewards (NMRDP) is to exploit a compact representation of the reward function to automatically translate the NMRDP into an equivalent Markov decision process (MDP) amenable to our favorite MDP solution method. The contribution of this paper is a representation of non-Markovian reward functions and a translation into MDP aimed a...
Model Based Method for Determining the Minimum Embedding Dimension from Solar Activity Chaotic Time Series
Predicting future behavior of chaotic time series system is a challenging area in the literature of nonlinear systems. The prediction's accuracy of chaotic time series is extremely dependent on the model and the learning algorithm. On the other hand the cyclic solar activity as one of the natural chaotic systems has significant effects on earth, climate, satellites and space missions. Several m...
Call Admission Control in Wireless DS-CDMA Systems Using Reinforcement Learning
PITIPONG CHANLOHA: CALL ADMISSION CONTROL IN WIRELESS DS-CDMA SYSTEMS USING REINFORCEMENT LEARNING. Thesis advisor: Asst. Prof. Wipawee Hattagam, Ph.D. School of Telecommunication Engineering, academic year 2549, 95 pp. Keywords: direct-sequential code division multiple access (DS-CDMA) / call admission control / reinforcement learning / actor-critic reinfo...
Journal title: Annals OR
Volume: 190, Issue: -
Pages: -
Publication date: 2011