Search results for: policy option
Number of results: 333995
We present new results on learning temporally extended actions for continuous tasks, using the options framework (Sutton et al. [1999b], Precup [2000]). In order to achieve this goal we work with the option-critic architecture (Bacon et al. [2017]) using a deliberation cost and train it with proximal policy optimization (Schulman et al. [2017]) instead of vanilla policy gradient. Results on Muj...
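For context, proximal policy optimization (Schulman et al. [2017]) maximizes the clipped surrogate objective below; the notation is the standard one from that paper, not taken from the snippet itself:

$$
L^{\mathrm{CLIP}}(\theta) = \hat{\mathbb{E}}_t\!\left[\min\!\big(r_t(\theta)\,\hat{A}_t,\ \mathrm{clip}(r_t(\theta),\,1-\epsilon,\,1+\epsilon)\,\hat{A}_t\big)\right],
\qquad
r_t(\theta) = \frac{\pi_\theta(a_t \mid s_t)}{\pi_{\theta_{\mathrm{old}}}(a_t \mid s_t)},
$$

where $\hat{A}_t$ is an advantage estimate and $\epsilon$ is a small clipping parameter; clipping the probability ratio is what makes the update "proximal" compared with a vanilla policy-gradient step.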
Forest and colleagues have persuasively made the case that policy capacity is a fundamental prerequisite to health reform. They offer a comprehensive life-cycle definition of policy capacity and stress that it involves much more than problem identification and option development. I would like to offer a Canadian perspective. If we define health reform as re-orienting the health system from acut...
Temporal abstraction is key to scaling up learning and planning in reinforcement learning. While planning with temporally extended actions is well understood, creating such abstractions autonomously from data has remained challenging. We tackle this problem in the framework of options [Sutton, Precup & Singh, 1999; Precup, 2000]. We derive policy gradient theorems for options and propose a new ...
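For background on the "policy gradient theorems for options" mentioned above: they generalize the standard policy gradient theorem for primitive actions, which in the usual notation (not taken from the snippet) reads

$$
\nabla_\theta J(\theta) = \mathbb{E}_{s \sim d^{\pi_\theta},\, a \sim \pi_\theta}\!\left[\nabla_\theta \log \pi_\theta(a \mid s)\, Q^{\pi_\theta}(s, a)\right].
$$

Bacon, Harb and Precup (2017) derive analogous expressions for the intra-option policies and for the option termination conditions, which is what lets the option-critic architecture learn both end-to-end from the task reward.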
This paper incorporates an option value into deforestation policy analysis. Similar to an option value in finance, the option value here reflects the advantage of delaying irreversible species extinction until more information about the uncertain value of species is known. The return from species is modeled as a stochastic flow of benefits which ceases if policy makers choose to deforest. Defor...
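A stylized two-period illustration of this option value of delay (hypothetical numbers, not the model in the paper): suppose deforesting yields a certain payoff $D = 100$ in either period, while preserving yields a species benefit whose present value $V$ is revealed only in period 2, with $V = 150$ or $V = 30$, each with probability $1/2$ (ignoring discounting). Then

$$
\mathbb{E}[V] = \tfrac{1}{2}(150) + \tfrac{1}{2}(30) = 90 < D = 100 \quad\text{(naive comparison: deforest now)},
$$
$$
\text{wait, learn } V,\ \text{then choose: } \tfrac{1}{2}\max(150,100) + \tfrac{1}{2}\max(30,100) = 125,
$$

so the option value of delaying the irreversible choice is $125 - 100 = 25$, and waiting is optimal even though the expected value of preservation falls short of the deforestation payoff.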
Temporally extended actions (or macro-actions) have proven useful for speeding up planning and learning, adding robustness, and building prior knowledge into AI systems. The options framework, as introduced in Sutton, Precup and Singh (1999), provides a natural way to incorporate macro-actions into reinforcement learning. In the subgoals approach, learning is divided into two phases, first lear...
A new family of gradient temporal-difference learning algorithms has recently been introduced by Sutton, Maei and others in which function approximation is much more straightforward. In this paper, we introduce the GQ(λ) algorithm, which can be seen as an extension of that work to a more general setting including eligibility traces and off-policy learning of temporally abstract predictions. These ...
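To make the gradient temporal-difference family concrete, here is a minimal sketch of TDC (TD with gradient correction, Sutton, Maei et al., 2009), the simplest member of that family with linear function approximation; GQ(λ) extends this direction with eligibility traces and off-policy prediction. The function name and the toy usage are hypothetical, and this is not the exact GQ(λ) update.

```python
import numpy as np

def tdc_update(theta, w, phi, phi_next, reward, gamma=0.99, alpha=0.01, beta=0.05):
    """One TDC (gradient-TD) update for linear value prediction.

    theta    -- primary weights (value estimate is theta @ phi)
    w        -- auxiliary weights estimating the expected TD error per feature
    phi      -- feature vector of the current state
    phi_next -- feature vector of the next state
    """
    # Standard one-step TD error under linear function approximation.
    delta = reward + gamma * theta @ phi_next - theta @ phi

    # Primary update: TD step plus a gradient-correction term that keeps
    # the update stable with function approximation.
    theta = theta + alpha * (delta * phi - gamma * phi_next * (phi @ w))

    # Auxiliary update: tracks a least-squares estimate of E[delta | phi].
    w = w + beta * (delta - phi @ w) * phi
    return theta, w

# Toy usage on random features (illustration only).
rng = np.random.default_rng(0)
theta, w = np.zeros(4), np.zeros(4)
for _ in range(100):
    phi, phi_next = rng.random(4), rng.random(4)
    theta, w = tdc_update(theta, w, phi, phi_next, reward=rng.random())
```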
The “conservative central banker” has come under attack recently. Explicitly modeling the interaction of a trade union with monetary policy, it has been argued that the standard solution to the inflationary bias in monetary policy might actually be welfare-reducing if the trade union has an exogenous preference against inflation. We reframe this discussion in a standard trade union model. We sh...
OBJECTIVES Retaining residual newborn screening (NBS) bloodspots for medical research remains contentious. To inform this debate, we sought to understand public preferences for, and reasons for preferring, alternative policy options. METHODS We assessed preferences among 4 policy options for research use of residual bloodspots through a bilingual national Internet survey of a representative s...
A temporally abstract action, or an option, is specified by a policy and a termination condition: the policy guides option behavior, and the termination condition roughly determines its length. Generally, learning with longer options (like learning with multi-step returns) is known to be more efficient. However, if the option set for the task is not ideal, and cannot express the primitive optim...
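As a minimal sketch of this definition (the class, the `run_option` helper, and the `env.step` interface are hypothetical, not taken from the papers above): an option couples an intra-option policy with a stochastic termination condition, and executing it means following that policy until termination fires or a step limit is reached.

```python
import random
from dataclasses import dataclass
from typing import Any, Callable

State, Action = Any, Any

@dataclass
class Option:
    policy: Callable[[State], Action]       # pi(s): which primitive action to take
    termination: Callable[[State], float]   # beta(s): probability of stopping in s

def run_option(env, state, option, max_steps=100):
    """Follow the option's policy until its termination condition fires.

    `env.step(state, action)` is assumed to return (next_state, reward);
    this environment interface is illustrative only.
    """
    total_reward, steps = 0.0, 0
    while steps < max_steps:
        action = option.policy(state)
        state, reward = env.step(state, action)
        total_reward += reward
        steps += 1
        if random.random() < option.termination(state):
            break  # option terminates; control returns to the policy over options
    return state, total_reward, steps
```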
Increased market access from Free Trade Agreements (FTAs) promised by policy makers is often diluted by preferential rules of origin (ROO). This paper discusses two policy options, one direct and one indirect, with regard to limiting the impact of NAFTA ROO on trade, and illustrates the impact on GDP and welfare of these options using a computable general equilibrium methodology. The first (di...