Search results for: dyna

Number of results: 855

2015
Teng Liu Yuan Zou Fengchun Sun Joeri Van Mierlo Ming Cheng Omar Hegazy

This paper presents a reinforcement learning (RL)–based energy management strategy for a hybrid electric tracked vehicle. A control-oriented model of the powertrain and vehicle dynamics is first established. From sampled data of the experimental driving schedule, statistical characteristics at various velocities are determined by extracting the transition probability matrix of...
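
The transition-probability extraction described here reduces to counting: discretize the velocity trace into bins and normalize the transition counts between bins. A minimal sketch, assuming a 1 Hz velocity trace and uniform bins (function and variable names are illustrative, not from the paper):

```python
# Hypothetical sketch: estimating a velocity transition probability matrix
# from a sampled driving schedule. Bin edges, array names, and the 1 Hz
# sampling assumption are illustrative only.
import numpy as np

def transition_matrix(velocities, n_bins=20, v_max=20.0):
    """Count transitions between discretized velocity states and
    normalize each row into a probability distribution."""
    bins = np.linspace(0.0, v_max, n_bins + 1)
    states = np.clip(np.digitize(velocities, bins) - 1, 0, n_bins - 1)
    counts = np.zeros((n_bins, n_bins))
    for s, s_next in zip(states[:-1], states[1:]):
        counts[s, s_next] += 1
    row_sums = counts.sum(axis=1, keepdims=True)
    return np.divide(counts, row_sums,
                     out=np.zeros_like(counts), where=row_sums > 0)

# Example: a toy driving trace sampled at 1 Hz (m/s)
trace = np.array([0.0, 1.2, 2.5, 4.0, 3.8, 5.1, 6.0, 5.5, 4.9])
P = transition_matrix(trace)
```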

1990
Richard S. Sutton

This paper extends previous work with Dyna, a class of architectures for intelligent systems based on approximating dynamic programming methods. Dyna architectures integrate trial-and-error reinforcement learning and execution-time planning into a single process operating alternately on the world and on a learned model of the world. In this paper I present and show results for two Dyna architect...
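
Dyna's learning-plus-planning loop is compact enough to sketch; below is a minimal tabular Dyna-Q variant in the spirit of this architecture. The env.reset/env.step/env.actions interface and the deterministic last-observed model are assumptions for illustration, not Sutton's exact formulation:

```python
# Minimal tabular Dyna-Q sketch: each real step is followed by n planning
# updates drawn from a learned (here, deterministic last-observed) model.
# The environment interface is assumed: env.reset() -> state,
# env.step(a) -> (next_state, reward, done), env.actions -> action list.
import random
from collections import defaultdict

def dyna_q(env, episodes=100, n_planning=10, alpha=0.1, gamma=0.95, eps=0.1):
    Q = defaultdict(float)   # Q[(state, action)]
    model = {}               # model[(state, action)] = (reward, next_state)
    seen = []                # (state, action) pairs observed so far
    for _ in range(episodes):
        s, done = env.reset(), False
        while not done:
            # epsilon-greedy action selection
            if random.random() < eps:
                a = random.choice(env.actions)
            else:
                a = max(env.actions, key=lambda a_: Q[(s, a_)])
            s2, r, done = env.step(a)
            # direct RL update from real experience
            best_next = 0.0 if done else max(Q[(s2, a_)] for a_ in env.actions)
            Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])
            # model learning: remember the last observed outcome
            if (s, a) not in model:
                seen.append((s, a))
            model[(s, a)] = (r, s2)
            # planning: replay simulated experience from the model
            for _ in range(n_planning):
                ps, pa = random.choice(seen)
                pr, ps2 = model[(ps, pa)]
                pbest = max(Q[(ps2, a_)] for a_ in env.actions)
                Q[(ps, pa)] += alpha * (pr + gamma * pbest - Q[(ps, pa)])
            s = s2
    return Q
```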

2001
Gianluca Baldassarre

Topics: planning with neural networks; time limits of discounted reinforcement learning; planning, taskability, and Dyna-PI architectures; focussing, forward and backward planning, acting and (re)planning. Tested with... Ideas from problem solving and...

1991
Richard S. Sutton

Dyna is an AI architecture that integrates learning, planning, and reactive execution. Learning methods are used in Dyna both for compiling planning results and for updating a model of the effects of the agent's actions on the world. Planning is incremental and can use the probabilistic and ofttimes incorrect world models generated by learning processes. Execution is fully reactive in the sense...
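
The learned world models this abstract refers to, probabilistic and sometimes incorrect, can be as simple as empirical transition counts; a hedged sketch (class and method names are illustrative):

```python
# Hedged sketch of a learned, probabilistic world model: transition counts
# define empirical probabilities, and planning samples from them. The
# Dyna-Q sketch above used a deterministic model; this is the stochastic
# analogue.
import random
from collections import defaultdict

class CountModel:
    """Empirical world model: outcome frequencies define probabilities."""

    def __init__(self):
        # counts[(s, a)][(reward, next_state)] = times this outcome was seen
        self.counts = defaultdict(lambda: defaultdict(int))

    def update(self, s, a, r, s2):
        self.counts[(s, a)][(r, s2)] += 1

    def sample(self, s, a):
        """Sample an outcome in proportion to how often it was observed.
        Assumes (s, a) has been visited at least once."""
        outcomes = self.counts[(s, a)]
        pick = random.randrange(sum(outcomes.values()))
        for (r, s2), n in outcomes.items():
            pick -= n
            if pick < 0:
                return r, s2
```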

Journal: The Journal of Pharmacology and Experimental Therapeutics, 1998
L Chen L Y Huang

We examined the non-opioid actions of various forms of dynorphin A (DynA) on N-methyl-D-aspartate (NMDA) receptor channels in isolated rat trigeminal neurons using the whole-cell patch recording technique. All the dynorphins tested blocked NMDA-activated currents. The blocking actions were voltage-independent. The IC50 was 0.26 microM for DynA(1-32), 6.6 microM for DynA(1-17), and 7.4 microM for Dyn...
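
As a rough guide to what these IC50 values imply, the standard single-site dose-response relation gives the expected fractional block at a given concentration; a hedged sketch (the unit Hill coefficient is an assumption, not a value reported in the abstract):

```python
# Hedged illustration of what an IC50 implies under the standard
# single-site dose-response model; the Hill coefficient n = 1 is an
# assumption, not reported in the abstract.
def fractional_block(conc_uM, ic50_uM, hill_n=1.0):
    """Fraction of NMDA current blocked at a given dynorphin concentration."""
    return 1.0 / (1.0 + (ic50_uM / conc_uM) ** hill_n)

# At 1 uM, DynA(1-32) (IC50 = 0.26 uM) would block ~79% of the current,
# while DynA(1-17) (IC50 = 6.6 uM) would block ~13%.
print(fractional_block(1.0, 0.26), fractional_block(1.0, 6.6))
```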

1992
Satinder P. Singh

Reinforcement learning (RL) algorithms have traditionally been thought of as trial-and-error learning methods that use actual control experience to incrementally improve a control policy. Sutton's DYNA architecture demonstrated that RL algorithms can work as well using simulated experience from an environment model, and that the resulting computation was similar to doing one-step lookahead plan...
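
The equivalence noted here can be written out: in expectation over the learned model, a Dyna planning update performs the one-step lookahead backup of dynamic programming (notation assumed for illustration):

```latex
% One-step lookahead (full backup) over a learned model (\hat{P}, \hat{R}):
Q(s,a) \leftarrow \hat{R}(s,a)
  + \gamma \sum_{s'} \hat{P}(s' \mid s, a)\, \max_{a'} Q(s', a')
```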

2012
Kenshiro Kondo

The LS-DYNA nonlinear finite element analysis software package developed for structural analysis by the Livermore Software Technology Corporation (LSTC) is widely used by the automobile, aerospace, construction, military, manufacturing, and bioengineering industries. Fujitsu has been a partner with LSTC since 1996, supporting customers in Japan. A common application of LS-DYNA is car crash simu...

2015
Yicheng Zhou Quan Liu Qi-ming Fu Zongzhang Zhang

Traditional online learning algorithms often suffer from slow convergence and limited accuracy. The Dyna-2 framework, which combines learning with search methods, provides a way of alleviating the problem. The main idea behind it is to execute a simulation-based search that helps the learning process select better actions. The search process relies on a simulated model of the environment t...
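
A much-simplified sketch of the Dyna-2 idea: action values combine a permanent memory, learned from real experience, with a transient memory that is refined by simulation-based search from the current state and then discarded. The simulator interface and tabular memories below are assumptions for illustration:

```python
# Simplified, hypothetical sketch of the Dyna-2 idea: a transient memory
# of value corrections is built by simulated rollouts from the current
# state, added on top of a permanent memory, and used to pick an action.
# simulator.step(s, a) -> (next_state, reward, done) is assumed;
# permanent_q is a defaultdict(float) keyed by (state, action).
import random
from collections import defaultdict

def search_then_act(state, actions, simulator, permanent_q,
                    n_sims=50, depth=10, alpha=0.2, gamma=0.95):
    transient_q = defaultdict(float)  # corrections on top of permanent memory

    def q(s, a):
        return permanent_q[(s, a)] + transient_q[(s, a)]

    for _ in range(n_sims):
        s = state
        for _ in range(depth):
            # greedy w.r.t. combined values (exploration omitted for brevity)
            a = max(actions, key=lambda a_: q(s, a_))
            s2, r, done = simulator.step(s, a)
            target = r if done else r + gamma * max(q(s2, a_) for a_ in actions)
            transient_q[(s, a)] += alpha * (target - q(s, a))
            if done:
                break
            s = s2

    # act greedily with respect to permanent + transient values
    return max(actions, key=lambda a_: q(state, a_))
```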

Journal: J. Inf. Sci. Eng., 2014
Yuan-Pao Hsu Wei-Cheng Jiang

In this paper, we present a rapid learning algorithm called Dyna-QPC. The proposed algorithm requires considerably less training time than the Q-learning and table-based Dyna-Q algorithms, making it applicable to real-world control tasks. The Dyna-QPC algorithm is a combination of existing learning techniques: CMAC, Q-learning, and prioritized sweeping. In a practical experiment, the Dyna-QPC algori...
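
Of the three components, prioritized sweeping benefits most from a concrete sketch: planning backups are drawn from a priority queue ordered by the magnitude of each pair's last TD error, so large value changes propagate backward first. A hedged sketch (names illustrative; Dyna-QPC's CMAC approximator is omitted):

```python
# Hedged sketch of the prioritized-sweeping component the abstract names.
# Q is a defaultdict(float); model[(s, a)] = (reward, next_state) from a
# learned deterministic model; predecessors[s] is the set of (s_prev,
# a_prev) pairs known to lead to s; pqueue holds (-priority, tie, s, a).
import heapq
import itertools

_tie = itertools.count()  # tie-breaker so the heap never compares states

def prioritized_sweep(Q, model, predecessors, pqueue, actions,
                      n_updates=20, alpha=0.2, gamma=0.95, theta=1e-3):
    """Pop the highest-priority pairs and back their values up."""
    for _ in range(n_updates):
        if not pqueue:
            break
        _, _, s, a = heapq.heappop(pqueue)
        r, s2 = model[(s, a)]
        best = max(Q[(s2, a_)] for a_ in actions)
        Q[(s, a)] += alpha * (r + gamma * best - Q[(s, a)])
        # re-queue predecessors whose backup targets just changed
        for sp, ap in predecessors.get(s, ()):
            rp, _s = model[(sp, ap)]
            err = abs(rp + gamma * max(Q[(s, a_)] for a_ in actions)
                      - Q[(sp, ap)])
            if err > theta:
                heapq.heappush(pqueue, (-err, next(_tie), sp, ap))
    return Q
```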

2006
István Szita

Reinforcement learning (RL) involves sequential decision making in uncertain environments. The aim of the decision-making agent is to maximize the benefit of acting in its environment over an extended period of time. Finding an optimal policy in RL may be very slow. To speed up learning, one often-used solution is the integration of planning, for example Sutton's Dyna algorithm, or various oth...

[Chart: number of search results per year]