Solving Uncertain MDPs with Objectives that Are Separable over Instantiations of Model Uncertainty
Authors
Abstract
Markov Decision Problems (MDPs) offer an effective mechanism for planning under uncertainty. However, because uncertainty over models is unavoidable, it is difficult to obtain an exact specification of an MDP. We are interested in solving MDPs whose transition and reward functions are not exactly specified. Existing research has primarily focused on computing infinite-horizon stationary policies when optimizing robustness, regret, and percentile-based objectives. We focus specifically on finite-horizon problems, with a special emphasis on objectives that are separable over individual instantiations of model uncertainty (i.e., objectives that can be expressed as a sum over instantiations of model uncertainty): (a) First, we identify two separable objectives for uncertain MDPs: Average Value Maximization (AVM) and Confidence Probability Maximization (CPM). (b) Second, we provide optimization-based solutions to compute policies for uncertain MDPs with such objectives. In particular, we exploit the separability of the AVM and CPM objectives by employing Lagrangian dual decomposition (LDD). (c) Finally, we demonstrate the utility of the LDD approach on a benchmark problem from the literature.
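To make "separable over instantiations" concrete, the following is a minimal sketch, not the authors' method. It assumes model uncertainty is represented by a finite list of sampled models, that a fixed deterministic, time-dependent policy is being evaluated, and that CPM can be read as the fraction of sampled models in which the policy's value reaches a threshold. That reading of CPM, the function names (`policy_value`, `avm_objective`, `cpm_objective`), and the toy numbers are all assumptions made for illustration. The sketch only evaluates the two objectives; it does not perform Lagrangian dual decomposition.

```python
def policy_value(P, R, policy, horizon, start):
    """Finite-horizon value of a fixed policy in ONE sampled model.

    P[s][a] is a list of next-state probabilities, R[s][a] is the reward,
    and policy[t][s] is the action taken in state s at time step t.
    (All of these layouts are illustrative assumptions.)
    """
    n = len(P)
    v = [0.0] * n  # value after the final step
    for t in reversed(range(horizon)):
        v = [
            R[s][policy[t][s]]
            + sum(p * v[s2] for s2, p in enumerate(P[s][policy[t][s]]))
            for s in range(n)
        ]
    return v[start]


def avm_objective(models, policy, horizon, start):
    """Average value over sampled models: a sum of per-model terms."""
    values = [policy_value(P, R, policy, horizon, start) for P, R in models]
    return sum(values) / len(values)


def cpm_objective(models, policy, horizon, start, threshold):
    """Assumed reading of CPM: share of models whose value meets the threshold."""
    hits = sum(
        1 for P, R in models
        if policy_value(P, R, policy, horizon, start) >= threshold
    )
    return hits / len(models)


# Toy data (hypothetical): two states, two actions, two sampled models that
# share transitions but differ in the reward for (state 0, action 0).
P = [[[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [0.0, 1.0]]]
R_a = [[1.0, 0.0], [2.0, 2.0]]
R_b = [[0.0, 0.0], [2.0, 2.0]]
models = [(P, R_a), (P, R_b)]
policy = [[0, 0], [0, 0]]  # always take action 0, at both time steps

avm = avm_objective(models, policy, horizon=2, start=0)
cpm = cpm_objective(models, policy, horizon=2, start=0, threshold=1.5)
print(avm, cpm)
```

Both functions reduce to a sum of terms that each depend on a single sampled model, which is the structural property a decomposition approach can exploit.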
Similar resources
Sampling Based Approaches for Minimizing Regret in Uncertain Markov Decision Processes (MDPs)
Markov Decision Processes (MDPs) are an effective model to represent decision processes in the presence of transitional uncertainty and reward tradeoffs. However, due to the difficulty in exactly specifying the transition and reward functions in MDPs, researchers have proposed uncertain MDP models and robustness objectives in solving those models. Most approaches for computing robust policies h...
Solving Uncertain MDPs by Reusing State Information and Plans
While MDPs are powerful tools for modeling sequential decision making problems under uncertainty, they are sensitive to the accuracy of their parameters. MDPs with uncertainty in their parameters are called Uncertain MDPs. In this paper, we introduce a general framework that allows off-the-shelf MDP algorithms to solve Uncertain MDPs by planning based on currently available information and repla...
Robust, Risk-Sensitive, and Data-driven Control
Markov Decision Processes (MDPs) model problems of sequential decision-making under uncertainty. They have been studied and applied extensively. Nonetheless, there are two major barriers that still hinder the applicability of MDPs to many more practical decision-making problems: * The decision maker often lacks a reliable MDP model. Since the results obtained by dynamic programming are se...
A chance-constrained multi-objective model for final assembly scheduling in ATO systems with uncertain sub-assembly availability
A chance-constrained multi-objective model under uncertainty in the availability of subassemblies is proposed for scheduling in ATO systems. On-time delivery of customer orders and reduction of the company's costs are both crucial; therefore, a three-objective model is proposed, including the minimization of 1) overtime, idle-time, change-over, and setup costs, 2) total dispersion of items’ deliver...
Robust Markov Decision Processes for Medical Treatment Decisions
Medical treatment decisions involve complex tradeoffs between the risks and benefits of various treatment options. The diversity of treatment options that patients can choose over time, together with uncertainties in future health outcomes, results in a difficult sequential decision making problem. Markov decision processes (MDPs) are commonly used to study medical treatment decisions; however, optimal pol...