Title: EHC: Non-parametric Editing by Finding Homogeneous Clusters
Authors: Ougiaroglou, Stefanos
Evangelidis, Georgios
Type: Conference Paper
Subjects: FRASCATI::Natural sciences::Computer and information sciences
Keywords: k-NN classification
noisy items
Issue Date: 2014
Volume: 8367
First Page: 290
Last Page: 304
Volume Title: Foundations of Information and Knowledge Systems
Part of Series: Lecture Notes in Computer Science
Abstract: Editing is a crucial data mining task in the context of k-Nearest Neighbor classification. Its purpose is to improve classification accuracy by improving the quality of training datasets. To obtain such datasets, editing algorithms try to remove noisy and mislabeled data as well as smooth the decision boundaries between the discrete classes. In this paper, a new fast and non-parametric editing algorithm is proposed. It is called Editing through Homogeneous Clusters (EHC) and is based on an iterative execution of a clustering procedure that forms clusters containing items of a specific class only. Contrary to other editing approaches, EHC is independent of input (tuning) parameters. The performance of EHC is experimentally compared to three state-of-the-art editing algorithms on ten datasets. The results show that EHC is faster than its competitors and achieves high classification accuracy.
ISBN: 978-3-319-04938-0
ISSN: 0302-9743
Other Identifiers: 10.1007/978-3-319-04939-7_14
Appears in Collections: Department of Applied Informatics
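
Illustrative sketch: the abstract describes EHC only at a high level, as a clustering procedure run repeatedly until every cluster contains items of a single class. The Python below is one possible reading of that idea and is not taken from the paper. It rests on several assumptions that the abstract does not state: the clustering procedure is plain k-means seeded with the per-class means of the cluster being split, homogeneous clusters holding a single item are treated as noise and discarded, and a mixed cluster that k-means cannot split is dropped. The names `ehc_edit`, `kmeans`, `mean` and `sqdist` are hypothetical.

```python
from collections import defaultdict


def mean(points):
    """Component-wise mean of a non-empty list of equal-length tuples."""
    n = len(points)
    return tuple(sum(c) / n for c in zip(*points))


def sqdist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(items, seeds, max_iter=100):
    """Lloyd's k-means over (vector, label) items; returns the non-empty clusters."""
    centers = list(seeds)
    assignment = None
    for _ in range(max_iter):
        new_assignment = [
            min(range(len(centers)), key=lambda j: sqdist(x, centers[j]))
            for x, _ in items
        ]
        if new_assignment == assignment:
            break  # assignments are stable, so the centers are too
        assignment = new_assignment
        groups = defaultdict(list)
        for (x, _), j in zip(items, assignment):
            groups[j].append(x)
        # An empty cluster keeps its previous center.
        centers = [mean(groups[j]) if j in groups else centers[j]
                   for j in range(len(centers))]
    clusters = defaultdict(list)
    for item, j in zip(items, assignment):
        clusters[j].append(item)
    return list(clusters.values())


def ehc_edit(training_set):
    """Return an edited copy of training_set, a list of (vector, label) pairs."""
    queue = [list(training_set)]
    kept = []
    while queue:
        cluster = queue.pop()
        by_class = defaultdict(list)
        for x, y in cluster:
            by_class[y].append(x)
        if len(by_class) == 1:
            # Homogeneous cluster. Assumption: a single-item cluster is noise.
            if len(cluster) > 1:
                kept.extend(cluster)
            continue
        # Mixed cluster: split it with one seed per class present in it.
        parts = kmeans(cluster, [mean(xs) for xs in by_class.values()])
        if len(parts) > 1:
            queue.extend(parts)
        # Assumption: a mixed cluster that cannot be split (for example,
        # identical vectors with different labels) is discarded.
    return kept


if __name__ == "__main__":
    data = [((0, 0), "a"), ((0, 1), "a"), ((1, 0), "a"), ((1, 1), "a"),
            ((10, 10), "b"), ((10, 11), "b"), ((11, 10), "b"), ((11, 11), "b"),
            ((0.2, 0.3), "b")]  # last item: a "b" label inside the "a" region
    for item in ehc_edit(data):
        print(item)
```

Under these assumptions the function takes no tuning parameter, which matches the abstract's description of EHC as non-parametric. Note that in this sketch a correctly labeled item lying very close to a noisy one can also end up alone in a cluster and be removed.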

Files in This Item:
File: 2014_FOIKS.pdf
Size: 163,82 kB
Format: Adobe PDF

This item is licensed under a Creative Commons License.