|Title:||RHC: a non-parametric cluster-based data reduction for efficient k-NN classification|
|Subjects:||FRASCATI::Engineering and technology|
|Source:||Pattern Analysis and Applications|
|Abstract:||Although the k-NN classifier is a popular classification method, it incurs high computational cost and storage requirements. This paper proposes two effective cluster-based data reduction algorithms for efficient k-NN classification. Both have low preprocessing cost and achieve high data reduction rates while maintaining high k-NN classification accuracy. The first algorithm, reduction through homogeneous clusters (RHC), is based on a fast preprocessing clustering procedure that creates homogeneous clusters; the centroids of these clusters constitute the reduced training set. The second algorithm is a dynamic version of RHC that retains all its properties and can additionally manage datasets that do not fit in main memory, making it suitable for dynamic environments where new training data become available gradually. Experimental results on fourteen datasets show that both algorithms are faster and achieve higher reduction rates than four known methods, while maintaining high classification accuracy.|
|Appears in Collections:||Department of Applied Informatics |
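
The abstract states only that RHC builds homogeneous clusters and keeps their centroids as the reduced training set; it does not spell out the clustering procedure. The Python sketch below is therefore one possible reading, not the authors' implementation. It assumes that a cluster containing more than one class is split by k-means seeded with the per-class means, and that splitting repeats until every cluster holds a single class. The names `rhc`, `kmeans`, `mean`, and `sq_dist`, as well as the fallback for clusters that cannot be split, are hypothetical choices made for illustration.

```python
from collections import defaultdict


def mean(points):
    """Component-wise mean of a non-empty list of equal-length tuples."""
    n = len(points)
    return tuple(sum(col) / n for col in zip(*points))


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(items, seeds, max_iter=100):
    """Plain k-means over (point, label) items, started from the given seeds.

    Returns the non-empty groups of items found at convergence.
    """
    centers = list(seeds)
    groups = []
    for _ in range(max_iter):
        buckets = [[] for _ in centers]
        for item in items:
            nearest = min(range(len(centers)),
                          key=lambda i: sq_dist(item[0], centers[i]))
            buckets[nearest].append(item)
        groups = [b for b in buckets if b]  # drop empty clusters
        new_centers = [mean([p for p, _ in g]) for g in groups]
        if new_centers == centers:
            break
        centers = new_centers
    return groups


def rhc(training_set):
    """Reduce a list of (point, label) pairs to (centroid, label) prototypes.

    Assumption (not stated in the abstract): mixed-class clusters are split
    by k-means seeded with one mean per class present in the cluster.
    """
    prototypes = []
    pending = [list(training_set)]  # start with the whole set as one cluster
    while pending:
        cluster = pending.pop()
        by_class = defaultdict(list)
        for point, label in cluster:
            by_class[label].append(point)
        if len(by_class) == 1:
            # Homogeneous cluster: its centroid becomes a prototype.
            label = next(iter(by_class))
            prototypes.append((mean(by_class[label]), label))
            continue
        groups = kmeans(cluster, [mean(pts) for pts in by_class.values()])
        if len(groups) < 2:
            # Hypothetical fallback: the cluster cannot be split (for example,
            # identical points with different labels), so keep one prototype
            # per class instead of looping forever.
            prototypes.extend((mean(pts), lbl) for lbl, pts in by_class.items())
            continue
        pending.extend(groups)
    return prototypes
```

Under these assumptions, a k-NN classifier would search the returned prototypes instead of the full training set, which is where the savings in storage and classification cost described in the abstract would come from.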