universal approximator

MICROVASCULAR ANASTOMOSIS: A LABORATORY DEVICE FOR HOLDING STAY SUTURES AND A NEW APPROXIMATOR CLAMP

Journal: Volume 3, Issue 2, Apr-Jun 2018

What about Inputting Policy in Value Function: Policy Representation and Policy-Extended Value Function Approximator

Journal: Proceedings of the ... AAAI Conference on Artificial Intelligence, 2022

We study the Policy-extended Value Function Approximator (PeVFA) in Reinforcement Learning (RL), which extends the conventional value function approximator (VFA) to take as input not only the state (and action) but also an explicit policy representation. Such an extension enables PeVFA to preserve the values of multiple policies at the same time and brings an appealing characteristic, i.e., generalization among policie...


Computer Science Technical Report Decision Tree Function Approximation in Reinforcement Learning

1998

Larry D. Pyeatt Adele E. Howe

We present a decision-tree-based approach to function approximation in reinforcement learning. We compare our approach with table lookup and a neural network function approximator on three problems: the well-known mountain car and pole balance problems, as well as a simulated automobile race car. We find that the decision tree can provide better learning performance than the neural network funct...


Internet-based teleoperation: A case study - toward delay approximation and speed limit module

2007

Shengtong Zhong Philippe Le Parc Jean Vareille

This paper presents Internet-based remote control of a mobile robot. To cope with unpredictable Internet delays and possible connection loss, a direct teleoperation architecture with a “Speed Limit Module” (SLM) and a “Delay Approximator” (DA) is proposed. This direct control architecture guarantees that the path error of the robot motion stays within the path error tolerance of the application....


Input Selection and Regression using the SOM

2005

Francesco Corona Amaury Lendasse

This paper presents a global methodology for building a nonlinear regression model when the number of available samples is small compared to the number of inputs. The task is divided into two parts: selection of the best inputs and construction of the approximator. A first SOM is used to compute clean correlations between the inputs and the output. A second SOM is built to link the output to the selected i...


Least-squares temporal difference learning based on extreme learning machine

2013

Pablo Escandell-Montero José María Martínez-Martínez José David Martín-Guerrero Emilio Soria-Olivas Juan Gómez-Sanchís

This paper proposes a least-squares temporal difference (LSTD) algorithm based on extreme learning machine that uses a single-hidden-layer feedforward network to approximate the value function. While LSTD is typically combined with local function approximators, the proposed approach uses a global approximator that allows better scalability properties. The results of the experiments carried out o...


Pruned lazy learning models for time series prediction

2005

Antti Sorjamaa Amaury Lendasse Michel Verleysen

This paper presents two improvements of Lazy Learning. Both methods include input selection and are applied to long-term prediction of time series. The first method is based on iterative pruning of the inputs, and the second performs a brute-force search over the possible set of inputs using a k-NN approximator. Two benchmarks are used to illustrate the efficiency of these two methods: the...


Internet-based Teleoperation: a Case Study

2007

Shengtong Zhong Philippe Le Parc Jean Vareille

This paper presents Internet-based remote control of a mobile robot. To cope with unpredictable Internet delays and possible connection loss, a direct teleoperation architecture with a “Speed Limit Module” (SLM) and a “Delay Approximator” (DA) is proposed. This direct control architecture guarantees that the path error of the robot motion stays within the path error tolerance of the application....


Control of Inverted Double Pendulum using Reinforcement Learning

2016

Fredrik Gustafsson

In this project, we apply reinforcement learning techniques to control an inverted double pendulum on a cart. We successfully learn a controller for balancing in a simulation environment using Q-learning with a linear function approximator, without any prior knowledge of the system at hand. We do, however, fail to learn a controller for the swing-up maneuver, which leads to a discussion on what mig...


Direct Estimation of Wrist Joint Angular Velocities from Surface EMGs by Using an SDNN Function Approximator

2016

Kazumasa Horie Atsuo Suemitsu Tomohiro Tanno Masahiko Morita

The present paper proposes a method for estimating joint angular velocities from multi-channel surface electromyogram (sEMG) signals. This method uses a selective desensitization neural network (SDNN) as a function approximator that learns the relation between integrated sEMG signals and instantaneous joint angular velocities. A comparison experiment with a Kalman filter model shows that this m...
