Please use this identifier to cite or link to this item:
https://ruomo.lib.uom.gr/handle/7000/1227
Title: | Efficient indexing methods in the data mining context |
Authors: | Kouiroukidis, Nikolaos Evangelidis, Georgios |
Type: | Article |
Subjects: | FRASCATI::Natural sciences::Computer and information sciences |
Keywords: | Multidimensional Indexing Data Mining |
Issue Date: | 2011 |
Source: | International Journal on Integrated Information Management |
Abstract: | The proliferation of applications manipulating enormous sizes of multidimensional vector data, that require indexing in order to support operations like kNN searching in an efficient manner, has revived the interest in multidimensional indexing. It is well established that as the dimensionality of data increases the performance of queries such as range queries and nearest neighbor (1NN or kNN) queries decreases leading to a problem described as “the curse of dimensionality”. In this paper we point out problems that arise when indexing multidimensional data in fixed dimensions, such as 8 or 16, because, usually, when dealing with data of higher dimensionality it is common to first apply dimensionality reduction techniques such as principal component analysis. Although there is a plethora of research papers proposing multidimensional indexing structures, most of them report empirical results with relatively small and low dimensionality datasets. We attempt a fair comparison of many state of the art indexing structures designed exclusively to index multi-dimensional points like the Hybrid tree, the iDistance and the P+tree. We include in our comparison the R*tree, a state of the art index designed both for multidimensional points and regions. It is an improvement of the well-known R-tree, and also has been revised and improved further recently. We compare the behavior of the indexes on kNN queries (for k=1 and k=10) with varying dataset sizes and dimensionality. Our goal is to determine the structure(s) that could be used efficiently in the area of data mining. We obtained the source code of the implementations of the related structures from the authors and the R-tree portal. |
URI: | https://doi.org/10.15556/IJIIM.01.01.002 http://ejournals.uniwa.gr/index.php/JIIM/article/view/3049 https://ruomo.lib.uom.gr/handle/7000/1227 |
ISSN: | 2241-827X |
Other Identifiers: | 10.15556/IJIIM.01.01.002 |
Appears in Collections: | Department of Applied Informatics |
Files in This Item:
File | Description | Size | Format | |
---|---|---|---|---|
2011_IJIIM_kouirukidis.pdf | 160,49 kB | Adobe PDF | View/Open |
This item is licensed under a Creative Commons License