A reinforcement learning—Variable neighborhood search method for the capacitated Vehicle Routing Problem

Kalatzantonakis, Panagiotis; Sifaleras, Angelo; Samaras, Nikolaos

Please use this identifier to cite or link to this item: https://ruomo.lib.uom.gr/handle/7000/1279

Full metadata record

DC Field	Value	Language
dc.contributor.author	Kalatzantonakis, Panagiotis	-
dc.contributor.author	Sifaleras, Angelo	-
dc.contributor.author	Samaras, Nikolaos	-
dc.date.accessioned	2022-09-19T07:20:32Z	-
dc.date.available	2022-09-19T07:20:32Z	-
dc.date.issued	2022	-
dc.identifier	10.1016/j.eswa.2022.118812	en_US
dc.identifier.issn	0957-4174	en_US
dc.identifier.uri	https://doi.org/10.1016/j.eswa.2022.118812	en_US
dc.identifier.uri	https://ruomo.lib.uom.gr/handle/7000/1279	-
dc.description.abstract	Finding the best sequence of local search operators that yields the optimal performance of Variable Neighborhood Search is an important open research question in the field of metaheuristics. This paper proposes a Reinforcement Learning method to address this question. We introduce a new hyperheuristic scheme, termed Bandit VNS, inspired by the Multi-armed Bandit, a particular type of single-state reinforcement learning problem. In Bandit VNS, we utilize the General Variable Neighborhood Search metaheuristic and enhance it with a hyperheuristic strategy. We examine several variations of the Upper Confidence Bound algorithm to create a reliable strategy for adaptive neighborhood selection. Furthermore, we utilize Adaptive Windowing, a state-of-the-art algorithm for estimating and detecting changes in a data stream. Bandit VNS is designed for effective parallelization and encourages cooperation between agents to produce the best solution quality. We demonstrate this concept's advantages in accuracy and speed through extensive experimentation on the Capacitated Vehicle Routing Problem. We compare the novel scheme's performance against the conventional General Variable Neighborhood Search metaheuristic in terms of CPU time and solution quality. The Bandit VNS method shows excellent results and reaches significantly higher performance metrics when applied to well-known benchmark instances. Our experiments show that our approach achieves an improvement of more than 25% in solution quality when compared to the General Variable Neighborhood Search method using standard library instances of medium and large size.	en_US
dc.language.iso	en	en_US
dc.publisher	Elsevier	en_US
dc.source	Expert Systems with Applications	en_US
dc.subject	FRASCATI::Natural sciences::Mathematics::Applied Mathematics	en_US
dc.subject	FRASCATI::Natural sciences::Computer and information sciences	en_US
dc.subject.other	Reinforcement Learning	en_US
dc.subject.other	Multi-Armed Bandits	en_US
dc.subject.other	Intelligent Optimization	en_US
dc.subject.other	Bandit Learning	en_US
dc.subject.other	Metaheuristics	en_US
dc.subject.other	Variable Neighborhood Search	en_US
dc.subject.other	Vehicle Routing Problem	en_US
dc.title	A reinforcement learning—Variable neighborhood search method for the capacitated Vehicle Routing Problem	en_US
dc.type	Article	en_US
dc.contributor.department	Department of Applied Informatics	en_US
local.identifier.firstpage	118812	en_US
Appears in Collections:	Department of Applied Informatics
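
The abstract mentions Upper Confidence Bound strategies for adaptive neighborhood selection. As a rough illustration of that general idea only, the sketch below implements plain UCB1 over a set of arms, each standing in for one local search operator. It is not the authors' Bandit VNS: the class name `UCB1Selector`, the function `simulate`, the binary reward (1 when an operator improves the solution, 0 otherwise), and the fixed success probabilities are all assumptions made for this example. Adaptive Windowing and parallel agents are not modeled.

```python
import math
import random


class UCB1Selector:
    """Plain UCB1 over a fixed set of arms (hypothetical stand-ins for operators)."""

    def __init__(self, n_arms, c=math.sqrt(2)):
        self.counts = [0] * n_arms    # how often each arm was played
        self.means = [0.0] * n_arms   # running mean reward per arm
        self.c = c                    # exploration weight (sqrt(2) gives UCB1)

    def select(self):
        # Play every arm once before relying on the confidence bound.
        for arm, n in enumerate(self.counts):
            if n == 0:
                return arm
        total = sum(self.counts)
        scores = [
            mean + self.c * math.sqrt(math.log(total) / n)
            for mean, n in zip(self.means, self.counts)
        ]
        return max(range(len(scores)), key=scores.__getitem__)

    def update(self, arm, reward):
        # Incremental update of the running mean for the played arm.
        self.counts[arm] += 1
        self.means[arm] += (reward - self.means[arm]) / self.counts[arm]


def simulate(success_probs, steps=2000, seed=0):
    """Toy loop: each 'operator' improves the solution with a fixed probability.

    The probabilities are assumed for illustration; in a real search the
    reward would come from applying the operator to the current solution.
    """
    rng = random.Random(seed)
    selector = UCB1Selector(len(success_probs))
    for _ in range(steps):
        arm = selector.select()
        reward = 1.0 if rng.random() < success_probs[arm] else 0.0
        selector.update(arm, reward)
    return selector.counts


if __name__ == "__main__":
    # Arm 1 has the highest assumed success rate, so it should be chosen most.
    print(simulate([0.1, 0.8, 0.2]))
```

In this toy setting the selector still tries every arm occasionally, but it concentrates most of its plays on the arm with the highest observed success rate.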

Files in This Item:

File	Description	Size	Format
A_Reinforcement_Learning_-_VNS_Method_for_the_CVRP.pdf Until 2024-09-16		1.36 MB	Adobe PDF

Items in the Repository are protected by copyright, unless otherwise indicated.

Institutional Repository of Academic Research, University of Macedonia
