Title: RHC: a non-parametric cluster-based data reduction for efficient k-NN classification
Authors: Ougiaroglou, Stefanos; Evangelidis, Georgios
Type: Article
Subjects: FRASCATI::Engineering and technology
Issue Date: Feb-2016
Source: Pattern Analysis and Applications
Volume: 19
Issue: 1
First Page: 93
Last Page: 109
Abstract: Although the k-NN classifier is a popular classification method, it suffers from the high computational cost and storage requirements it involves. This paper proposes two effective cluster-based data reduction algorithms for efficient k-NN classification. Both have low preprocessing cost and can achieve high data reduction rates while maintaining k-NN classification accuracy at high levels. The first proposed algorithm is called reduction through homogeneous clusters (RHC) and is based on a fast preprocessing clustering procedure that creates homogeneous clusters. The centroids of these clusters constitute the reduced training set. The second proposed algorithm is a dynamic version of RHC that retains all its properties and, in addition, it can manage datasets that cannot fit in main memory and is appropriate for dynamic environments where new training data are gradually available. Experimental results, based on fourteen datasets, illustrate that both algorithms are faster and achieve higher reduction rates than four known methods, while maintaining high classification accuracy.
ISSN: 1433-7541
Other Identifiers: 10.1007/s10044-014-0393-7
Appears in Collections: Department of Applied Informatics
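
The abstract describes RHC as a clustering step that produces homogeneous clusters (clusters whose items all share one class label), with the cluster centroids forming the reduced training set. The following is a minimal sketch of that idea, not the authors' implementation. The abstract does not specify the clustering procedure, so the sketch assumes a recursive k-means split seeded with one mean per class. The function name `rhc_sketch`, the `max_iter` parameter, and the fallback for clusters that cannot be split are hypothetical choices made for illustration.

```python
import numpy as np

def rhc_sketch(X, y, max_iter=20):
    """Return one labelled centroid per homogeneous cluster (illustrative only)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    prototypes, labels = [], []
    stack = [np.arange(len(X))]  # clusters still to examine, stored as index arrays

    while stack:
        idx = stack.pop()
        pts, lab = X[idx], y[idx]
        classes = np.unique(lab)

        # Homogeneous cluster: keep its centroid as a prototype.
        if len(classes) == 1:
            prototypes.append(pts.mean(axis=0))
            labels.append(classes[0])
            continue

        # Assumption: seed k-means with the mean of each class present.
        means = np.array([pts[lab == c].mean(axis=0) for c in classes])
        for _ in range(max_iter):
            dist = ((pts[:, None, :] - means[None, :, :]) ** 2).sum(axis=2)
            assign = dist.argmin(axis=1)
            new_means = np.array([
                pts[assign == j].mean(axis=0) if np.any(assign == j) else means[j]
                for j in range(len(means))
            ])
            if np.allclose(new_means, means):
                break
            means = new_means

        parts = [idx[assign == j] for j in range(len(means)) if np.any(assign == j)]
        if len(parts) == 1:
            # Hypothetical fallback: the cluster cannot be split (for example,
            # identical class means), so emit one centroid per class and stop.
            for c in classes:
                prototypes.append(pts[lab == c].mean(axis=0))
                labels.append(c)
            continue
        stack.extend(parts)  # each part is strictly smaller, so the loop terminates

    return np.array(prototypes), np.array(labels)
```

A k-NN classifier would then search the returned centroids instead of the full training set. For example, `rhc_sketch([[0], [1], [2], [10], [11], [12], [3.5]], [0, 0, 0, 1, 1, 1, 1])` reduces seven training items to three labelled prototypes.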

Files in This Item:
File: PAA.pdf
Size: 811,18 kB
Format: Adobe PDF

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.