Title: Data Node Splitting Policies for Improved Range Query Efficiency in k-dimensional Point Data Indexes
Authors: Outsios, Evangelos
Evangelidis, Georgios
Type: Conference Paper
Subjects: FRASCATI::Natural sciences::Computer and information sciences
Keywords: multi-attribute point data indexes
average storage utilization
space partitioning quality
range query performance
Issue Date: 2011
First Page: 46
Last Page: 50
Volume Title: 2011 15th Panhellenic Conference on Informatics
Abstract: High dimensional vectors (points) are very common in image and video classification, time series data mining, and many modern data mining applications. One of the most popular classification methods on such data is k-Nearest Neighbor (kNN) searching. Unfortunately, all proposed and state-of-the-art multi-attribute indexes fall short in terms of usability as dimensionality increases. This is attributed to the "dimensionality curse" problem, according to which, range searching above 10 dimensions is as efficient as a sequential scan of the entire database. Thus, kNN searching, as a special case of range searching, has to benefit a lot if we find ways to increase the performance of indexes in high dimensions. In this paper, we deal with space partitioning indexes and we propose six data node splitting techniques. We examine their performance in terms of data node storage utilization and quality of space partitioning. These two conflicting goals are both essential for good range query performance. Our experiments with uniform and skewed data demonstrate that certain splitting techniques can perform satisfactorily.
ISBN: 978-1-61284-962-1
Other Identifiers: 10.1109/PCI.2011.46
Appears in Collections: Department of Applied Informatics

Files in This Item:
File: 2011_PCI_OE.pdf
Size: 266,45 kB
Format: Adobe PDF

This item is licensed under a Creative Commons License.