Application of the LSPI reinforcement learning technique to co-located network negotiation
Abstract
Optimizing multiple co-located networks, each with a variable number of network functionalities that influence each other, is a complex problem that has received little attention in the research community. However, because independent co-located networks increasingly influence each other, optimization solutions can no longer afford to consider the performance of a single network in isolation. To this end, we propose a multi-tiered solution based on Least-Squares Policy Iteration (LSPI), a reinforcement learning technique.
Similar resources
A reinforcement learning based solution for cognitive network cooperation between co-located, heterogeneous wireless sensor networks
Due to a drastic increase of the number of wireless communication devices, these devices are forced to interfere or interact with each other. This raises the issue of possible effects this coexistence might have on the performance of each of the networks. Negative effects are a consequence of contention for network resources (such as free wireless communication frequencies) between different de...
Policy Iteration for Learning an Exercise Policy for American Options
Options are important financial instruments, whose prices are usually determined by computational methods. Computational finance is a compelling application area for reinforcement learning research, where hard sequential decision making problems abound and have great practical significance. In this paper, we investigate reinforcement learning methods, in particular, least squares policy iterati...
Dynamic Obstacle Avoidance by Distributed Algorithm based on Reinforcement Learning (RESEARCH NOTE)
In this paper we focus on the application of reinforcement learning to obstacle avoidance in dynamic environments in wireless sensor networks. A distributed algorithm based on reinforcement learning is developed for sensor networks to guide a mobile robot through the dynamic obstacles. The sensor network models the danger of the area under coverage as obstacles, and has the property of adoption o...
On Suitability of the Reinforcement Learning Methodology in Dynamic, Heterogeneous, Self-optimizing Networks
An ever-growing number of deployed wireless networks dictates the tempo at which inter-network cooperation techniques are being developed. Cooperation, in this sense, can go far beyond simple activation of interference avoidance techniques. This paper describes and evaluates the performance of a reinforcement learning based reasoning engine, used in a self-learning, cognitively controll...
Least-Squares Policy Iteration
We propose a new approach to reinforcement learning for control problems which combines value-function approximation with linear architectures and approximate policy iteration. This new approach is motivated by the least-squares temporal-difference learning algorithm (LSTD) for prediction problems, which is known for its efficient use of sample experiences compared to pure temporal-difference a...
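As a rough illustration of the idea in the abstract above (a linear value-function approximation fitted by a least-squares, LSTD-style solve inside an approximate policy iteration loop), the following is a minimal sketch. It is not taken from any of the listed papers. The four-state chain, its reward, the one-hot features, and all names (`step`, `phi`, `lstdq`, `lspi`) are assumptions made purely for illustration.

```python
GAMMA = 0.9
N_STATES, N_ACTIONS = 4, 2  # hypothetical toy chain; action 0 = left, 1 = right

def step(s, a):
    """Assumed deterministic dynamics: reward 1 for arriving in the last state."""
    s2 = max(0, s - 1) if a == 0 else min(N_STATES - 1, s + 1)
    return s2, (1.0 if s2 == N_STATES - 1 else 0.0)

def phi(s, a):
    """One-hot features over (state, action): a very simple linear architecture."""
    f = [0.0] * (N_STATES * N_ACTIONS)
    f[s * N_ACTIONS + a] = 1.0
    return f

def q_value(w, s, a):
    """Approximate Q(s, a) as the dot product of weights and features."""
    return sum(wi * fi for wi, fi in zip(w, phi(s, a)))

def greedy(w, s):
    """Greedy action under the current weights (ties go to the lower index)."""
    return max(range(N_ACTIONS), key=lambda a: q_value(w, s, a))

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x

def lstdq(samples, w):
    """Least-squares evaluation of the policy that is greedy with respect to w."""
    k = N_STATES * N_ACTIONS
    A = [[0.0] * k for _ in range(k)]
    b = [0.0] * k
    for s, a, r, s2 in samples:
        f, f2 = phi(s, a), phi(s2, greedy(w, s2))
        for i in range(k):
            b[i] += f[i] * r
            for j in range(k):
                A[i][j] += f[i] * (f[j] - GAMMA * f2[j])
    return solve(A, b)

def lspi(samples, max_iter=20, tol=1e-9):
    """Alternate evaluation and greedy improvement until the weights settle."""
    w = [0.0] * (N_STATES * N_ACTIONS)
    for _ in range(max_iter):
        w_new = lstdq(samples, w)
        done = max(abs(x - y) for x, y in zip(w, w_new)) < tol
        w = w_new
        if done:
            break
    return w

# One sample per (state, action) pair, reused in every iteration.
samples = []
for s in range(N_STATES):
    for a in range(N_ACTIONS):
        s2, r = step(s, a)
        samples.append((s, a, r, s2))

w = lspi(samples)
policy = [greedy(w, s) for s in range(N_STATES)]
print(policy)
```

Note that the same batch of samples is reused in every iteration, which reflects the sample efficiency the abstract attributes to least-squares methods. With one-hot features this sketch reduces to exact policy iteration; a practical application would use a compact feature set and a much larger sample batch.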