Search results for: policy space
Number of results: 747131
In Italy, several policy measures have been implemented to increase energy efficiency and reduce carbon emissions, especially in the household sector. However, to design and implement these policy measures efficiently, it is necessary to better understand the factors influencing household energy behavior. In this paper, by using disaggregated data from the 2009...
Consider the problem of approximating the optimal policy of a Markov decision process (MDP) by sampling state transitions. In contrast to existing reinforcement learning methods that are based on successive approximations to the nonlinear Bellman equation, we propose a Primal-Dual π Learning method in light of the linear duality between the value and policy. The π learning method is model-free ...
Turkey's foreign policy since the Islamists came to power in 2002 has prompted considerable debate in academic and political circles. Turkey's active policy in the Middle East in recent years has led some analysts to attribute this change in Turkey's attitude toward regional issues to the Justice and Development Party's historical and religious tendencies and its move away from Kemalism's west-oriented nationalist...
We study the regulation of one-way station-based vehicle sharing systems through parking reservation policies. We measure the performance of these systems in terms of the total excess travel time of all users caused as a result of vehicle or parking space shortages. We devise mathematical programming based bounds on the total excess travel time of vehicle sharing systems under any passive regul...
We consider the problem of finding an optimal policy in a Markov decision process that maximises the expected discounted sum of rewards over an infinite time horizon. Since the explicit iterative dynamic programming scheme does not scale as the dimension of the state space increases, a number of approximate methods have been developed. These are typically based on value or policy iteration...
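For orientation, the exact dynamic programming scheme that this abstract contrasts with approximate methods can be sketched as tabular value iteration. This is a minimal illustration, not the method of the listed paper: the function name `value_iteration`, the two-state toy MDP, and the discount factor 0.9 are all assumptions chosen for the example. Each sweep touches every state, which is why the approach stops scaling as the state space grows.

```python
def value_iteration(P, R, gamma=0.9, tol=1e-8, max_iter=10_000):
    """Tabular value iteration for a finite discounted MDP.

    P[a][s][t] is the probability of moving from state s to state t under
    action a; R[s][a] is the expected immediate reward. Both layouts are
    assumptions made for this sketch.
    """
    n_states = len(R)
    n_actions = len(R[0])
    V = [0.0] * n_states

    def q_value(s, a, values):
        # One-step lookahead: immediate reward plus discounted expected value.
        expected_next = sum(P[a][s][t] * values[t] for t in range(n_states))
        return R[s][a] + gamma * expected_next

    for _ in range(max_iter):
        # Bellman optimality backup over every state (the costly full sweep).
        V_new = [max(q_value(s, a, V) for a in range(n_actions))
                 for s in range(n_states)]
        delta = max(abs(x - y) for x, y in zip(V_new, V))
        V = V_new
        if delta < tol:
            break

    # Greedy policy with respect to the converged value estimate.
    policy = [max(range(n_actions), key=lambda a: q_value(s, a, V))
              for s in range(n_states)]
    return V, policy


# Hypothetical toy MDP: action 0 stays in place, action 1 switches state.
# Only staying in state 1 pays a reward of 1.
P = [
    [[1.0, 0.0], [0.0, 1.0]],  # action 0: stay
    [[0.0, 1.0], [1.0, 0.0]],  # action 1: switch
]
R = [
    [0.0, 0.0],  # state 0
    [1.0, 0.0],  # state 1
]

V, policy = value_iteration(P, R)
```

In this toy problem the greedy policy switches out of state 0 and then stays in state 1, so the values approach 9 and 10 respectively under the assumed discount of 0.9.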
Extraction of Reward-Related Feature Space Using Correlation-Based and Reward-Based Learning Methods
The purpose of this article is to present a novel learning paradigm that extracts a reward-related low-dimensional state space by combining correlation-based learning like Input Correlation Learning (ICO learning) and reward-based learning like Reinforcement Learning (RL). Since ICO learning can quickly find a correlation between a state and an unwanted condition (e.g., failure), we use it to ext...
We study two regularization-based approximate policy iteration algorithms, namely REG-LSPI and REG-BRM, to solve reinforcement learning and planning problems in discounted Markov Decision Processes with large state and finite action spaces. The core of these algorithms is the regularized extensions of the Least-Squares Temporal Difference (LSTD) learning and Bellman Residual Minimization (BRM),...