Please use this identifier to cite or link to this item:
https://ruomo.lib.uom.gr/handle/7000/1171
Title: | The Effect of Parallelism on Data Reduction |
Authors: | Ponos, Pavlos; Ougiaroglou, Stefanos; Evangelidis, Georgios |
Type: | Conference Paper |
Subjects: | FRASCATI::Natural sciences::Computer and information sciences |
Keywords: | k-NN Classification; Data Reduction; Prototype Merging; Parallel Implementation; Clustering |
Issue Date: | 26-Sep-2019 |
First Page: | 1 |
Last Page: | 4 |
Volume Title: | Proceedings of the 9th Balkan Conference on Informatics |
Abstract: | In this paper, we investigate the effect of parallelism on two data reduction algorithms that use k-Means clustering to find homogeneous clusters in the training set. By homogeneous, we mean clusters in which all instances belong to the same class. Our approach divides the training set into subsets and applies the data reduction algorithm to each subset in parallel. The reduced subsets are then merged into the final reduced set. In our experimental study, we split the datasets into 8, 16, 32 and 64 subsets. The results reveal that parallelism can achieve very low preprocessing costs. Moreover, when the number of subsets is high, on some datasets the accuracy of k-NN classification is almost equal to (if not better than) that achieved by the standard execution of the reduction algorithms, with a small loss in reduction rate. |
URI: | https://doi.org/10.1145/3351556.3351584 https://ruomo.lib.uom.gr/handle/7000/1171 |
ISBN: | 9781450371933 |
Other Identifiers: | 10.1145/3351556.3351584 |
Appears in Collections: | Department of Applied Informatics |
Files in This Item:
File | Description | Size | Format | |
---|---|---|---|---|
2019_BCI_POE.pdf | | 594,39 kB | Adobe PDF | View/Open |
This item is licensed under a Creative Commons License
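
The split, reduce, and merge scheme described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation. The function names (`reduce_homogeneous`, `parallel_reduce`), the stride-based split, and the reduction routine itself are assumptions made for illustration: the routine is a simplified stand-in that uses a single k-Means-style assignment step seeded with class means and replaces each homogeneous cluster with its mean. A thread pool is used for portability; real CPU parallelism in CPython would need a process pool.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def mean(points):
    """Component-wise mean of a non-empty list of equal-length tuples."""
    n = len(points)
    return tuple(sum(c) / n for c in zip(*points))


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def reduce_homogeneous(data):
    """Simplified stand-in for a homogeneous-cluster reduction (assumption).

    data: list of (point, label). Returns a list of (prototype, label).
    """
    prototypes = []
    stack = [data]
    while stack:
        cluster = stack.pop()
        by_class = defaultdict(list)
        for point, label in cluster:
            by_class[label].append(point)
        if len(by_class) == 1:
            # Homogeneous cluster: keep only its mean as a prototype.
            (label, pts), = by_class.items()
            prototypes.append((mean(pts), label))
            continue
        # One assignment step seeded with the class means
        # (a full k-Means would iterate until convergence).
        seeds = [mean(pts) for pts in by_class.values()]
        parts = [[] for _ in seeds]
        for point, label in cluster:
            nearest = min(range(len(seeds)), key=lambda j: sq_dist(point, seeds[j]))
            parts[nearest].append((point, label))
        nonempty = [part for part in parts if part]
        if len(nonempty) == 1:
            # The split made no progress: fall back to one prototype per class.
            prototypes.extend((mean(pts), label) for label, pts in by_class.items())
        else:
            stack.extend(nonempty)
    return prototypes


def parallel_reduce(data, n_subsets, max_workers=None):
    """Split data into n_subsets, reduce each concurrently, merge the results."""
    # Stride-based split is an assumption; the abstract does not say how
    # the training set is divided.
    subsets = [data[i::n_subsets] for i in range(n_subsets)]
    subsets = [s for s in subsets if s]
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        reduced = list(pool.map(reduce_homogeneous, subsets))
    # Merge the reduced subsets into the final reduced set.
    return [proto for part in reduced for proto in part]


if __name__ == "__main__":
    toy = [((0.0, 0.0), "a"), ((0.0, 1.0), "a"), ((1.0, 0.0), "a"), ((1.0, 1.0), "a"),
           ((10.0, 10.0), "b"), ((10.0, 11.0), "b"), ((11.0, 10.0), "b"), ((11.0, 11.0), "b")]
    print(len(reduce_homogeneous(toy)), len(parallel_reduce(toy, 2)))
```

On this toy input the unsplit reduction yields one prototype per class, while splitting into two subsets yields one prototype per class per subset. This mirrors the trade-off noted in the abstract, where more subsets come with a small loss in reduction rate.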