Abstract
Interpolation search is one of the fastest search techniques for uniformly distributed sorted numerical tables. This divide-and-conquer technique accesses the most probable key, rather than the middle key as in binary search, and then continues in the same way on the appropriate part of the table. In previous work we proved that interpolation search makes lg lg n accesses on average. Burton and Lewis demonstrated the inefficiency of interpolation search on an alphabetic table and suggested a robust variation to improve its efficiency. This inefficiency is expected, since such tables are usually far from uniformly distributed. However, for a nonuniformly distributed table whose cumulative distribution function F is known, applying F to the keys yields a uniform distribution, on which interpolation search is very fast. In arithmetic coding, a string of characters is mapped into the interval [0,1) according to the probabilities of its characters. We found that this transformation, designed for data compression, is in fact the cumulative distribution function F for alphabetic tables. Experiments confirm that interpolation search on alphabetic tables, with arithmetic coding applied to the character strings in a sophisticated way, shows performance very close to lg lg n accesses. Hence, we obtain a new fast search technique for alphabetic tables.
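The idea in the abstract can be illustrated with a minimal Python sketch. It is not the paper's method: the abstract does not say how arithmetic coding is applied "in a sophisticated way", so this sketch assumes a simple order-0 character model estimated from the table itself. The function names `build_model`, `code`, and `search` are hypothetical. Each string is mapped into [0,1) by narrowing an interval character by character, and the resulting values serve only to choose the probe position; all comparisons are made on the strings themselves, so the result stays correct even when the model is poor.

```python
from collections import Counter


def build_model(words):
    """Assumed order-0 model: prob[ch] and cum[ch] = P(character < ch)."""
    counts = Counter(ch for w in words for ch in w)
    total = sum(counts.values())
    prob, cum, running = {}, {}, 0.0
    for ch in sorted(counts):
        prob[ch] = counts[ch] / total
        cum[ch] = running
        running += prob[ch]
    return prob, cum


def code(word, prob, cum):
    """Map a string into [0, 1) the way an arithmetic coder narrows its interval."""
    low, width = 0.0, 1.0
    for ch in word:
        if ch not in prob:
            break  # character unseen in the table: stop refining the estimate
        low += width * cum[ch]
        width *= prob[ch]
    return low


def search(words, codes, target, prob, cum):
    """Interpolation search over sorted `words`; `codes[i]` is code(words[i]).

    Returns the index of `target`, or -1 if it is absent.
    """
    t = code(target, prob, cum)
    lo, hi = 0, len(words) - 1
    while lo <= hi and words[lo] <= target <= words[hi]:
        a, b = codes[lo], codes[hi]
        if b <= a:
            pos = (lo + hi) // 2  # codes coincide: fall back to the middle key
        else:
            frac = min(1.0, max(0.0, (t - a) / (b - a)))
            pos = lo + int(frac * (hi - lo))  # most probable position
        if words[pos] == target:
            return pos
        if words[pos] < target:
            lo = pos + 1
        else:
            hi = pos - 1
    return -1
```

To use it, sort the table with Python's default string ordering, call `build_model` once, and precompute `codes = [code(w, prob, cum) for w in words]`. Floating-point precision limits how many characters contribute to the code, so long common prefixes reduce the quality of the probe estimate but not the correctness of the result.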
Original language | English (US) |
---|---|
Pages (from-to) | 493-499 |
Number of pages | 7 |
Journal | IEEE Transactions on Computers |
Volume | 41 |
Issue number | 4 |
State | Published - Apr 1992 |
All Science Journal Classification (ASJC) codes
- Software
- Theoretical Computer Science
- Hardware and Architecture
- Computational Theory and Mathematics
Keywords
- Alphabetic tables
- arithmetic coding
- interpolation
- modified interpolation search
- nonuniformly distributed tables
- search