Improving Data Reduction by Merging Prototypes

Ponos, Pavlos; Ougiaroglou, Stefanos; Evangelidis, Georgios

Please use this identifier to cite or link to this item: https://ruomo.lib.uom.gr/handle/7000/1169

Title:	Improving Data Reduction by Merging Prototypes
Authors:	Ponos, Pavlos; Ougiaroglou, Stefanos; Evangelidis, Georgios
Type:	Conference Paper
Subjects:	FRASCATI::Natural sciences::Computer and information sciences
Keywords:	k-NN classification; Data reduction; Prototype merging; Data streams; Clustering
Issue Date:	13-Aug-2019
Volume:	11695
First Page:	20
Last Page:	32
Volume Title:	Advances in Databases and Information Systems
Part of Series:	Lecture Notes in Computer Science
Abstract:	A well-known and adaptable classifier is the k-Nearest Neighbor (kNN) that requires a training set of relatively small size in order to perform adequately. Training sets can be reduced in size by using conventional data reduction techniques. Unfortunately, these techniques are inappropriate in streaming environments or when executed in devices with limited resources. dRHC is a prototype generation algorithm that works in streaming environments by maintaining a condensed training set that can be updated by continuously arriving training data segments. Prototypes in dRHC carry an appropriate weight to indicate the number of instances of the same class that they represent. dRHC2 is an improvement over dRHC since it can handle fixed size condensing sets by removing the least important prototypes whenever the condensing set exceeds a predefined size. In this paper, we exploit the idea that dRHC or dRHC2 prototypes could be merged whenever they are close enough and represent the same class. Hence, we propose two new prototype merging algorithms. The first algorithm performs a single pass over a newly updated condensing set and merges all prototype pairs of the same class under the condition that each prototype is the nearest neighbor of the other. The second algorithm performs repetitive merging passes until there are no prototypes to be merged. The proposed algorithms are tested against several datasets and the experimental results reveal that the single pass variation performs better for both dRHC and dRHC2 taking into account the trade-off between preprocessing cost, reduction rate and accuracy. In addition, the merging appears to be more appropriate for the static version of the algorithm (dRHC) since it offers higher data reduction without sacrificing accuracy.
URI:	https://doi.org/10.1007/978-3-030-28730-6_2 https://ruomo.lib.uom.gr/handle/7000/1169
ISBN:	978-3-030-28729-0 978-3-030-28730-6
ISSN:	0302-9743 1611-3349
Other Identifiers:	10.1007/978-3-030-28730-6_2
Appears in Collections:	Department of Applied Informatics
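
The two merging strategies described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes that prototypes are numeric vectors, that "nearest neighbor" means the Euclidean nearest prototype in the whole condensing set, and that two merged prototypes are replaced by their weighted mean with the weights summed. The abstract states none of these details. The function names `merge_pass` and `merge_until_stable` are hypothetical.

```python
import numpy as np

def merge_pass(points, labels, weights):
    """One pass: merge same-class prototype pairs that are mutual nearest neighbors.

    Assumption (not from the source): a merged prototype is the weighted mean
    of the pair, and its weight is the sum of the two weights.
    Returns (points, labels, weights, number_of_merges).
    """
    points = np.asarray(points, dtype=float)
    labels = np.asarray(labels)
    weights = np.asarray(weights, dtype=float)
    n = len(points)
    if n < 2:
        return points, labels, weights, 0

    # Pairwise squared Euclidean distances; exclude self-matches.
    diff = points[:, None, :] - points[None, :, :]
    dist = (diff ** 2).sum(axis=2)
    np.fill_diagonal(dist, np.inf)
    nn = dist.argmin(axis=1)  # index of each prototype's nearest neighbor

    new_p, new_l, new_w = [], [], []
    merges = 0
    for i in range(n):
        j = nn[i]
        mutual = nn[j] == i and labels[i] == labels[j]
        if not mutual:
            # Prototype stays in the condensing set unchanged.
            new_p.append(points[i])
            new_l.append(labels[i])
            new_w.append(weights[i])
        elif i < j:
            # Merge each mutual pair once, when visiting its lower index.
            w = weights[i] + weights[j]
            new_p.append((weights[i] * points[i] + weights[j] * points[j]) / w)
            new_l.append(labels[i])
            new_w.append(w)
            merges += 1
    return np.array(new_p), np.array(new_l), np.array(new_w), merges

def merge_until_stable(points, labels, weights):
    """Repeat merging passes until a pass performs no merge."""
    while True:
        points, labels, weights, merges = merge_pass(points, labels, weights)
        if merges == 0:
            return points, labels, weights

# Three same-class prototypes on a line, each representing one instance.
pts, lbl, wts = [[0.0], [1.0], [2.5]], [0, 0, 0], [1, 1, 1]
single = merge_pass(pts, lbl, wts)
repeated = merge_until_stable(pts, lbl, wts)
print(len(single[0]), len(repeated[0]))
```

In this toy example the single pass merges only the first two prototypes, because the third is not the nearest neighbor of either of them. The repetitive variant then merges the result with the remaining prototype in a second pass.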

Files in This Item:

File	Description	Size	Format
2019_ADBIS_POE.pdf		295.74 kB	Adobe PDF

This item is protected by a Creative Commons License

Institutional Repository of Academic Research, University of Macedonia
