HDO and aHDO, new data reorganization techniques to improve compression of bitmap indices
- 总结
- Lead Inventors: John R. Kender Ph.D. & Hassan H. Malik Ph.D.Problem or Unmet Need:Many commercial database systems use Run-Length Encoding (RLE) based compressed bitmap indices (such as Oracle's BBC, or WAH) to execute queries. A compressed bitmap index is a special kind of index that stores the bulk of its data as bitmaps, compressed in a way that allows performing logical operations without first requiring to decompress the bitmaps. Queries are executed in most cases by performing bitwise logical operations on these indices. The storage space and time needed to execute queries depends directly on the sizes of the indices used to execute the query. The sizes of these compressed bitmap indices are driven by the availability of long consecutive 0 or 1 bits in the original uncompressed bitmap (i.e., a large number of consecutive 0 bits can be compressed in a single "run", same for 1 bits). If a large number of consecutive 0 or 1 bits are not found, the indices consume a lot more space and the queries take more time to execute. The default ordering of transactions may not be optimal and may not result in smallest possible index bitmaps. Reorganizing bitmaps to reduce the number of bit shifts, and thereby increase compressibility (via RLE), would directly improve storage efficiency and query execution performance. As databases power many transactions over the internet, this boost in efficiency would be a welcome addition to database product lines. This technology is an algorithm to improve the compressibility of bitmap indices. It focuses on maximizing the overall similarity of neighboring rows, instead of processing on a column-by-column basis as most methods do. It makes use of Hamming distances, which can be computed efficiently. The first algorithm, HDO (Hamming Distance Order) finds the most suitable candidate rows for each position in the final re-ordered dataset, and then applies local tie-breaking criteria to select the best candidate. The second, aHDO, is a linear approximation to HDO and reorders the whole index in a user-defined number of iterations. These steps can be performed as a pre-compression step, which will improve the performance of existing commercial database systems.
- 技术优势
- Substantially decreases the storage required for bitmap indices, provides between 5% and 400% space savings over the original ordered compressed bitmap, resulting in similar query processing performance improvements Provides a fast, linear-time implementation that can be used for large databases Does not require modifying the query execution engine, or the code for compression. Process can be added as a pre-compression plug-in, which will be more palpable to IT managers
- 技术应用
- Provides a storage and performance benefit to databases using RLE compression on bitmap indices Can also be used in other contexts where RLE compression is used, such as images and sound data (not just limited to databases)
- 详细技术说明
- This technology is an algorithm to improve the compressibility of bitmap indices. It focuses on maximizing the overall similarity of neighboring rows, instead of processing on a column-by-column basis as most methods do. It makes use of Hamming...
- *Abstract
-
None
- *Inquiry
- Calvin Chu Columbia Technology Ventures Tel: (212) 854-8444 Email: TechTransfer@columbia.edu
- *IR
- M08-002
- *Principal Investigation
-
- 国家/地区
- 美国

欲了解更多信息,请点击 这里