Please use this identifier to cite or link to this item:
https://ruomo.lib.uom.gr/handle/7000/1582
Title: | Very fast variations of training set size reduction algorithms for instance-based classification |
Authors: | Ougiaroglou, Stefanos; Evangelidis, Georgios |
Type: | Conference Paper |
Subjects: | FRASCATI::Natural sciences::Computer and information sciences |
Keywords: | data reduction; prototype generation; RHC; homogeneous clusters; k-NN Classification |
Issue Date: | May-2023 |
Publisher: | Association for Computing Machinery |
First Page: | 64 |
Last Page: | 70 |
Volume Title: | International Database Engineered Applications Symposium Conference |
Abstract: | Reduction through Homogeneous Clustering (RHC) and its editing variant (ERHC) are effective data reduction techniques for the k-NN classifier. They are based on an iterative k-means clustering task that discovers homogeneous clusters. The centers of the resulting homogeneous clusters constitute the instances of the reduced training set. Although RHC and ERHC are quite fast compared to several well-known data reduction techniques, the iterative execution of k-means clustering renders both of them inappropriate for data reduction tasks that need to be performed quickly, especially when run over large training datasets. The present paper proposes simple and very fast variations of the algorithms, which are appropriate for such environments. The variations are called RHC2 and ERHC2 and replace the complete execution of k-means clustering with a fast task that assigns instances to the class centers. The experimental study, based on fourteen datasets, and the corresponding statistical tests show that the proposed RHC2 and ERHC2 variations are very fast and, at the cost of a small penalty on classification accuracy, achieve higher reduction rates than their predecessors and two other well-known data reduction techniques. They are good candidates when fast reduction on large datasets is required. |
URI: | https://doi.org/10.1145/3589462.3589493 https://ruomo.lib.uom.gr/handle/7000/1582 |
ISBN: | 9798400707445 |
Other Identifiers: | 10.1145/3589462.3589493 |
Appears in Collections: | Department of Applied Informatics |
Files in This Item:
File | Description | Size | Format | |
---|---|---|---|---|
IDEAS2023_RHC2.pdf | | 168.42 kB | Adobe PDF | View/Open |
This item is licensed under a Creative Commons License
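
The procedure summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract gives no pseudocode, so the queue-based control flow, the function names (`reduce_homogeneous`, `assign`, `split_by_label`), the `full_kmeans` switch, and the fallback for clusters that cannot be split by distance are all assumptions made for illustration. With `full_kmeans=True` each non-homogeneous cluster is split by k-means seeded with its class centers (an RHC-like reading); with `full_kmeans=False` the k-means run is replaced by a single pass that assigns instances to the nearest class center (an RHC2-like reading). The editing step of ERHC and ERHC2 is not shown.

```python
from collections import deque
from math import dist


def mean(points):
    """Component-wise mean of a non-empty list of equal-length tuples."""
    n = len(points)
    return tuple(sum(col) / n for col in zip(*points))


def split_by_label(cluster):
    """Group the (vector, label) pairs of a cluster by class label."""
    by_label = {}
    for x, y in cluster:
        by_label.setdefault(y, []).append((x, y))
    return list(by_label.values())


def assign(cluster, centers):
    """Assign every instance to its nearest center; drop empty groups."""
    groups = [[] for _ in centers]
    for x, y in cluster:
        nearest = min(range(len(centers)), key=lambda i: dist(x, centers[i]))
        groups[nearest].append((x, y))
    return [g for g in groups if g]


def reduce_homogeneous(data, full_kmeans=False, max_iters=100):
    """Return one (center, label) prototype per homogeneous cluster.

    Hypothetical sketch: full_kmeans=False does a single assignment pass
    to the class centers; full_kmeans=True iterates k-means from them.
    """
    queue = deque([list(data)])
    prototypes = []
    while queue:
        cluster = queue.popleft()
        if len({y for _, y in cluster}) == 1:
            # Homogeneous cluster: its center joins the reduced set.
            prototypes.append((mean([x for x, _ in cluster]), cluster[0][1]))
            continue
        # One initial center per class present in the cluster.
        centers = [mean([x for x, _ in g]) for g in split_by_label(cluster)]
        groups = assign(cluster, centers)
        if full_kmeans:
            for _ in range(max_iters):
                new_centers = [mean([x for x, _ in g]) for g in groups]
                new_groups = assign(cluster, new_centers)
                if len(new_groups) < 2 or new_groups == groups:
                    break
                groups = new_groups
        if len(groups) < 2:
            # Assumed fallback (e.g. identical vectors with different
            # labels): split by label so the loop always terminates.
            groups = split_by_label(cluster)
        queue.extend(groups)
    return prototypes


if __name__ == "__main__":
    toy = [((0.0, 0.0), "a"), ((0.0, 1.0), "a"),
           ((5.0, 5.0), "b"), ((5.0, 6.0), "b")]
    print(reduce_homogeneous(toy))
    # [((0.0, 0.5), 'a'), ((5.0, 5.5), 'b')]
```

On the toy data the four instances collapse to two prototypes, one per class. The sketch only shows where the single assignment pass replaces the iterative k-means loop; the speed, reduction-rate, and accuracy figures reported in the abstract come from the paper's experiments, not from this code.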