Search results for: fuzzy sarsa algorithm

Number of results: 112094

2009
Nassima Aissani Bouziane Beldjilali

Petroleum industry production systems are highly automated. In this industry, all functions (e.g., planning, scheduling and maintenance) are automated, and in order to remain competitive researchers attempt to design an adaptive control system which optimizes the process but is also able to adapt to rapidly evolving demands at a fixed cost. In this paper, we present a multi-agent approach for th...

2012
Justin Johnson Mike Roberts Matt Fisher

Our goal in this project is to implement a machine learning system which learns to play simple 2D video games. More specifically, we focus on the problem of building a system that is capable of learning to play a variety of different games well, rather than trying to build a system that can play a single game perfectly. We begin by encoding individual video frames using features that capture th...

Journal: Adaptive Behaviour 2005
Peter Stone Richard S. Sutton Gregory Kuhlmann

RoboCup simulated soccer presents many challenges to reinforcement learning methods, including a large state space, hidden and uncertain state, multiple independent agents learning simultaneously, and long and variable delays in the effects of actions. We describe our application of episodic SMDP Sarsa(λ) with linear tile-coding function approximation and variable λ to learning higher-level dec...

2001
Sachiyo Arai Katia Sycara

In this paper, we introduce FirstVisit ProfitSharing (FVPS) as a credit assignment procedure, an important issue in classifier systems and reinforcement learning frameworks. FVPS reinforces effective rules to make an agent acquire stochastic policies that cause it to behave very robustly within uncertain domains, without pre-defined knowledge or subgoals. We use an internal episodic memory, not onl...

2012
Reinaldo A. C. Bianchi Carlos H. C. Ribeiro Anna Helena Reali Costa

Since finding control policies using Reinforcement Learning (RL) can be very time consuming, in recent years several authors have investigated how to speed up RL algorithms by making improved action selections based on heuristics. In this work we present new theoretical results – convergence and a superior limit for value estimation errors – for the class that encompasses all heuristics-based al...

Thesis: Ministry of Science, Research and Technology - Shiraz University 1390

In this research, the performance of solid oxide fuel cells (SOFC) and proton exchange membrane fuel cells (PEMFC) under grid voltage disturbances is improved using the proposed fuzzy-PI and fuzzy gain scheduling PI controllers. Intelligent modeling of these fuel cells is also carried out using the CVR and LOLIMOT methods. First, using the algorithm proposed in this thesis, intelligent modeling of these fuel cells...

2023

Fusing data from different sensors can yield information of higher accuracy and quality and improve the detection of nuclear threats. In this research, tracking of a moving source using the fusion of data from a radiation detector system and a surveillance camera was studied. To this end, an algorithm was designed to correlate the received images with the detector counts, so that the path of the moving object with the highest recorded detection as ...

2012
Matthew Adams Robert Loftin Matthew E. Taylor Michael Littman David Roberts

We present an empirical survey of reinforcement learning techniques and relate these techniques to concepts from behaviorism, a field of psychology concerned with the learning process. Specifically, we examine two standard RL algorithms, model-free SARSA, and model-based R-MAX, when used with various shaping techniques. We consider multiple techniques for incorporating shaping into these algori...

2004
Fernando Lozano Jaime Lozano Mario García

In this paper, we employ techniques from artificial intelligence such as reinforcement learning and agent based modeling as building blocks of a computational model for an economy based on conventions. First we model the interaction among firms in the private sector. These firms behave in an information environment based on conventions, meaning that a firm is likely to behave as its neighbors i...

2013
Fabrice Lauri Nicolas Gaud Stéphane Galland Vincent Hilaire

This article presents an overview of Ipseity, an open-source rich-client platform developed in C++ with the Qt framework. Ipseity facilitates the synthesis of artificial cognitive systems in multi-agent systems. The current version of the platform includes a set of plugins based on classical reinforcement learning techniques like Q-Learning and Sarsa. Ipseity is targeted at a broad range of...
