Title: Compact Binary: an Efficient Non-parameterized Code for Index Compression
Authors: Nitsos, Ilias
Evangelidis, Georgios
Dervos, Dimitris A.
Type: Conference Paper
Subjects: FRASCATI::Natural sciences::Computer and information sciences
Issue Date: 2003
First Page: 255
Last Page: 266
Volume Title: Proceedings of the 1st Balkan Conference in Informatics (BCI), Thessaloniki
Abstract: Inverted file indexes are nowadays the most popular method for indexing text databases. Integer compression codes are applied to the inverted document id lists to produce compact inverted file indexes. Non-parameterized codes are a class of index compression codes that are insensitive to variations in the statistics of dynamic text collections. In the present study, we introduce compact-binary (cb): a new non-parameterized coding scheme that combines the Golomb code and the binary representation of integers. The performance of the new code is compared to that of existing popular codes. Experimental results obtained from a number of TREC document collections reveal an overall 7.7% improvement over the most efficient of the existing non-parameterized codes. The outcome is backed by analysis and constitutes a significant gain when one considers the large sizes of the target text database collections.
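Illustration: the abstract refers to integer codes applied to inverted document id lists. The sketch below is not the compact-binary (cb) code of the paper, whose definition is not given in this record. It shows the general idea with the Elias gamma code, a well-known non-parameterized code, applied to the gaps between successive document ids. The function names (gaps, elias_gamma, encode_postings, decode_postings) and the sample id list are assumptions made for this example only.

```python
def gaps(doc_ids):
    """Turn a strictly increasing list of document ids into gaps (d-gaps)."""
    prev = 0
    out = []
    for d in doc_ids:
        out.append(d - prev)
        prev = d
    return out


def elias_gamma(n):
    """Return the Elias gamma code of a positive integer as a bit string.

    The code is the binary form of n preceded by (length - 1) zeros,
    so no per-list parameter has to be stored or tuned.
    """
    if n < 1:
        raise ValueError("Elias gamma encodes positive integers only")
    binary = bin(n)[2:]
    return "0" * (len(binary) - 1) + binary


def encode_postings(doc_ids):
    """Encode a postings list as concatenated gamma codes of its gaps."""
    return "".join(elias_gamma(g) for g in gaps(doc_ids))


def decode_postings(bits):
    """Invert encode_postings: read gamma codes and re-accumulate the gaps."""
    doc_ids, pos, prev = [], 0, 0
    while pos < len(bits):
        zeros = 0
        while bits[pos] == "0":  # count the unary length prefix
            zeros += 1
            pos += 1
        gap = int(bits[pos:pos + zeros + 1], 2)
        pos += zeros + 1
        prev += gap
        doc_ids.append(prev)
    return doc_ids


if __name__ == "__main__":
    postings = [3, 7, 8, 20]  # hypothetical document ids
    bits = encode_postings(postings)
    print(bits, len(bits), "bits")
    print(decode_postings(bits))
```

Small gaps receive short codewords, which is why coding gaps rather than raw ids pays off for frequent terms. Parameterized codes such as Golomb need a parameter chosen from collection statistics, whereas a non-parameterized code like the one sketched here does not.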
Appears in Collections: Department of Applied Informatics

Files in This Item:
File: 2003_BCI_Nitsos.pdf
Size: 158.87 kB
Format: Adobe PDF

This item is licensed under a Creative Commons License.