A Reinforcement Learning Approach for Product Delivery by Multiple Vehicles
Abstract
Real-time delivery of products under stochastic demand with multiple vehicles is a difficult problem because it requires solving inventory control and vehicle routing jointly. We model this problem in the framework of Average-reward Reinforcement Learning (ARL) and present experimental results on a model-based ARL algorithm called H-Learning with piecewise linear function approximation. We also show that a version of hill climbing yields at least as good performance as exhaustive search with an order of magnitude speedup in execution time.
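To make the comparison between hill climbing and exhaustive search concrete, the following is a minimal sketch and not the paper's implementation. It assumes each vehicle picks one action from a small shared set and that a scoring function for joint actions is available. The names `exhaustive_search`, `hill_climb`, and `toy_score`, and the toy scoring rule itself, are hypothetical and chosen only for illustration. Exhaustive search evaluates every joint action, so its cost grows exponentially with the number of vehicles, whereas hill climbing changes one vehicle's action at a time and stops when no single change improves the score.

```python
import itertools
from typing import Callable, Sequence, Tuple

JointAction = Tuple[int, ...]


def exhaustive_search(
    num_vehicles: int,
    actions: Sequence[int],
    score: Callable[[JointAction], float],
) -> JointAction:
    """Evaluate every joint action: len(actions) ** num_vehicles candidates."""
    return max(itertools.product(actions, repeat=num_vehicles), key=score)


def hill_climb(
    num_vehicles: int,
    actions: Sequence[int],
    score: Callable[[JointAction], float],
) -> JointAction:
    """Greedy coordinate-wise search over joint actions.

    Starts from an arbitrary joint action and repeatedly changes a single
    vehicle's action whenever that strictly improves the score. It can stop
    at a local optimum, so it is not guaranteed to match exhaustive search.
    """
    current: JointAction = tuple(actions[0] for _ in range(num_vehicles))
    best = score(current)
    improved = True
    while improved:
        improved = False
        for v in range(num_vehicles):
            for a in actions:
                candidate = current[:v] + (a,) + current[v + 1:]
                value = score(candidate)
                if value > best:
                    current, best = candidate, value
                    improved = True
    return current


def toy_score(joint: JointAction) -> float:
    """Hypothetical stand-in for a learned value estimate of a joint action."""
    targets = (2, 0, 3)  # assumed "ideal" action per vehicle, for illustration
    return -float(sum(abs(a - t) for a, t in zip(joint, targets)))


if __name__ == "__main__":
    acts = [0, 1, 2, 3]
    print(exhaustive_search(3, acts, toy_score))
    print(hill_climb(3, acts, toy_score))
```

On this toy score both searches agree because the score is separable across vehicles. With a score that couples the vehicles' actions, hill climbing may return a different, locally optimal joint action.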
Similar resources
A particle swarm optimization method for periodic vehicle routing problem with pickup and delivery in transportation
In this article, multiple-product PVRP with pickup and delivery that is used widely in goods distribution or other service companies, especially by railways, was introduced. A mathematical formulation was provided for this problem. Each product had a set of vehicles which could carry the product and pickup and delivery could simultaneously occur. To solve the problem, two meta-heuristic methods...
A multi-product vehicle routing scheduling model with time window constraints for cross docking system under uncertainty: A fuzzy possibilistic-stochastic programming
Mathematical modeling of supply chain operations has proven to be one of the most complex tasks in the field of operations management and operations research. Despite the abundance of modeling proposals in the literature, for the vast majority of them no effective universal application is conceived. This issue renders the proposed mathematical models inapplicable due largely to the fact th...
Low-Area/Low-Power CMOS Op-Amps Design Based on Total Optimality Index Using Reinforcement Learning Approach
This paper presents the application of reinforcement learning in automatic analog IC design. In this work, the Multi-Objective approach by Learning Automata is evaluated for accommodating required functionalities and performance specifications considering optimal minimizing of MOSFETs area and power consumption for two famous CMOS op-amps. The results show the ability of the proposed method to ...
Aerial Suspended Cargo Delivery through Reinforcement Learning
Cargo-bearing Unmanned aerial vehicles (UAVs) have tremendous potential to assist humans in food, medicine, and supply deliveries. For time-critical cargo delivery tasks, UAVs need to be able to navigate their environments and deliver suspended payloads with bounded load displacement. As a constraint balancing task for joint UAV-suspended load system dynamics, this task poses a challenge. This ...
An Electronic Marketplace Based on Reputation and Learning
In this paper, we propose a market model which is based on reputation and reinforcement learning algorithms for buying and selling agents. Three important factors: quality, price and delivery-time are considered in the model. We take into account the fact that buying agents can have different priorities on quality, price and delivery-time of their goods and selling agents adjust their bids acco...