Title: Efficient dataset size reduction by finding homogeneous clusters
Authors: Ougiaroglou, Stefanos; Evangelidis, Georgios
Type: Conference Paper
Subjects: FRASCATI::Natural sciences::Computer and information sciences
Issue Date: 2012
First Page: 168
Volume Title: Proceedings of the Fifth Balkan Conference in Informatics on - BCI '12
Abstract: Although the k-Nearest Neighbor classifier is one of the most widely used classification methods, it suffers from high computational cost and storage requirements. These major drawbacks have been an active research topic over the last decades. This paper proposes an effective data reduction algorithm that has low preprocessing cost and reduces storage requirements while maintaining classification accuracy at an acceptably high level. The proposed algorithm is based on a fast preprocessing clustering procedure that creates homogeneous clusters. The centroids of these clusters constitute the reduced training set. Experimental results on real-life datasets show that the proposed algorithm is faster and achieves higher reduction rates than three known existing methods, without significantly reducing classification accuracy.
ISBN: 9781450312400
Other Identifiers: 10.1145/2371316.2371349
Appears in Collections: Department of Applied Informatics
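
The abstract describes reducing a training set to the centroids of homogeneous clusters (clusters whose items all share one class label) but does not spell out the clustering procedure. The following is a minimal sketch under stated assumptions, not the authors' implementation: it assumes that a non-homogeneous cluster is split by k-means seeded with one mean per class present in it, and that splitting repeats until every cluster is homogeneous. All function names (`reduce_by_homogeneous_clusters`, `kmeans_groups`, `predict_1nn`) are hypothetical.

```python
from collections import defaultdict


def mean(points):
    """Component-wise mean of a non-empty list of equal-length tuples."""
    n = len(points)
    return tuple(sum(p[d] for p in points) / n for d in range(len(points[0])))


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans_groups(points, seeds, max_iter=100):
    """Plain Lloyd's k-means; returns the non-empty clusters as lists of indices."""
    centers = list(seeds)
    assign = None
    for _ in range(max_iter):
        new_assign = [
            min(range(len(centers)), key=lambda c: sq_dist(p, centers[c]))
            for p in points
        ]
        if new_assign == assign:  # converged: assignments are stable
            break
        assign = new_assign
        for c in range(len(centers)):
            members = [points[i] for i, a in enumerate(assign) if a == c]
            if members:  # keep the old center if a cluster becomes empty
                centers[c] = mean(members)
    groups = defaultdict(list)
    for i, a in enumerate(assign):
        groups[a].append(i)
    return list(groups.values())


def reduce_by_homogeneous_clusters(X, y):
    """Return (centroid, label) prototypes, one per homogeneous cluster.

    Assumption (not stated in the abstract): mixed clusters are split by
    k-means seeded with the mean of each class present in the cluster.
    """
    prototypes = []
    stack = [list(range(len(X)))]  # clusters still to be examined
    while stack:
        idxs = stack.pop()
        by_class = defaultdict(list)
        for i in idxs:
            by_class[y[i]].append(X[i])
        if len(by_class) == 1:  # homogeneous: keep only its centroid
            label = next(iter(by_class))
            prototypes.append((mean([X[i] for i in idxs]), label))
            continue
        seeds = [mean(pts) for pts in by_class.values()]
        groups = kmeans_groups([X[i] for i in idxs], seeds)
        if len(groups) == 1:
            # Could not split (e.g. identical points with different labels):
            # fall back to one prototype per class so the loop terminates.
            for label, pts in by_class.items():
                prototypes.append((mean(pts), label))
            continue
        for g in groups:
            stack.append([idxs[j] for j in g])
    return prototypes


def predict_1nn(prototypes, query):
    """Classify a query point by its nearest prototype (1-NN)."""
    return min(prototypes, key=lambda pl: sq_dist(pl[0], query))[1]
```

For two well-separated classes, the sketch keeps a single centroid per class, and `predict_1nn` then searches only those prototypes instead of the full training set. A library k-means could replace `kmeans_groups`; the pure-Python version is used here only to keep the example self-contained.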

Files in This Item:
File: 2012_BCI.pdf
Size: 86,78 kB
Format: Adobe PDF

This item is licensed under a Creative Commons License.