Quadratic probing
Quadratic probing is an open addressing scheme in computer programming for resolving hash collisions in hash tables. Quadratic probing operates by taking the original hash index and adding successive values of an arbitrary quadratic polynomial until an open slot is found.
An example sequence using quadratic probing, starting from the original hash index H, is:
H + 1², H + 2², H + 3², H + 4², ..., H + k²
Quadratic probing can be a more efficient algorithm in an open addressing table, since it better avoids the clustering problem that can occur with linear probing, although it is not immune. It also provides good memory caching because it preserves some locality of reference; however, linear probing has greater locality and, thus, better cache performance.
Quadratic function
Let h(k) be a hash function that maps an element k to an integer in [0, m−1], where m is the size of the table. Let the ith probe position for a value k be given by the function
h(k,i) = (h(k) + c1·i + c2·i²) mod m
where c2 ≠ 0. (If c2 = 0, then h(k,i) degrades to a linear probe.) For a given hash table, the values of c1 and c2 remain constant.
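As a minimal sketch of this definition, assuming the probe position has the form (h(k) + c1·i + c2·i²) mod m with integer constants, the first few positions can be listed directly. The function name `probe_sequence` and the parameter `home` (standing in for h(k)) are illustrative, not taken from any library.

```python
def probe_sequence(home, m, c1, c2, count):
    """Return the probe positions h(k, i) for i = 0, 1, ..., count - 1.

    home   -- the original hash index h(k), assumed to lie in [0, m - 1]
    m      -- the table size
    c1, c2 -- the polynomial constants (integers in this sketch)
    """
    return [(home + c1 * i + c2 * i * i) % m for i in range(count)]
```

For example, `probe_sequence(3, 11, 1, 1, 5)` yields `[3, 5, 9, 4, 1]`, while setting `c2 = 0` gives the linear probe `[3, 4, 5, 6]` for `probe_sequence(3, 11, 1, 0, 4)`.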
Examples:
- If h(k,i) = (h(k) + i + i²) mod m, then the probe sequence will be h(k), h(k) + 2, h(k) + 6, h(k) + 12, ... (each taken modulo m).
- For m = 2ⁿ, a good choice for the constants is c1 = c2 = 1/2, as the values of h(k,i) for i in [0, m−1] are all distinct. This leads to a probe sequence of h(k), h(k) + 1, h(k) + 3, h(k) + 6, ... (the triangular numbers), where the values increase by 1, 2, 3, ...
- For prime m > 2, most choices of c1 and c2 will make h(k,i) distinct for i in [0, (m−1)/2]. Such choices include c1 = c2 = 1/2, c1 = c2 = 1, and c1 = 0, c2 = 1. However, there are only about m/2 distinct probes for a given element, requiring other techniques to guarantee that insertions will succeed when the load factor exceeds 1/2.
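The two table-size cases above can be checked numerically. The sketch below is illustrative only: `triangular_probes` and `distinct_probe_count` are hypothetical helper names, and the second helper assumes integer constants c1 and c2.

```python
def triangular_probes(home, m):
    """Probe positions for c1 = c2 = 1/2: offsets are the triangular numbers.

    i * (i + 1) is always even, so the integer division is exact.
    """
    return [(home + i * (i + 1) // 2) % m for i in range(m)]


def distinct_probe_count(home, m, c1, c2):
    """Count how many different slots the first m probes reach."""
    return len({(home + c1 * i + c2 * i * i) % m for i in range(m)})
```

With a power-of-two size such as m = 16, `sorted(triangular_probes(5, 16))` equals `list(range(16))`, so every slot is reached. With the prime size m = 11, `distinct_probe_count(0, 11, 0, 1)` and `distinct_probe_count(0, 11, 1, 1)` both return 6, which is (m + 1)/2, so roughly half the table is unreachable from a given starting index.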
Limitations
When using quadratic probing (with the exception of triangular number cases for a hash table of size 2ⁿ),[1] there is no guarantee of finding an empty cell once the table becomes more than half full, or even before this if the table size is composite,[2] because the probe sequence for a given element reaches at most about half of the table.
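The complementary guarantee, that an empty cell is always found while a prime-sized table is at most half full, can be proven as follows. Suppose a hash table has size p (a prime greater than 3), with an initial location h(k) and two alternative locations h(k) + x² and h(k) + y² (where 0 ≤ x, y ≤ p/2). If these two locations point to the same slot, but x ≠ y, then
- h(k) + x² ≡ h(k) + y² (mod p)
- x² ≡ y² (mod p)
- x² − y² ≡ 0 (mod p)
- (x − y)(x + y) ≡ 0 (mod p).
As p is a prime number, either (x − y) or (x + y) must be divisible by p. Since x and y are different and both lie in [0, p/2], 0 < |x − y| < p, so (x − y) is not divisible by p. Likewise 0 < x + y < p, so (x + y) is not divisible by p either. Thus, by contradiction, the first p/2 alternative locations after h(k) must be unique, and consequently an empty space can always be found so long as at most p/2 locations are filled (i.e., the hash table is not more than half full).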
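The limitation can be seen in a small experiment. The following is a minimal sketch, assuming the simple probe (home + i²) mod m, i.e. c1 = 0 and c2 = 1. The `insert` function and its explicit `home` argument are hypothetical, chosen so the example does not depend on any particular hash function.

```python
def insert(table, key, home):
    """Try to place key using the probe (home + i*i) % m.

    Returns the slot used, or None if all m probes hit occupied slots.
    """
    m = len(table)
    for i in range(m):
        slot = (home + i * i) % m
        if table[slot] is None:
            table[slot] = key
            return slot
    return None


# Table of prime size 7; from home 0 the probes only reach slots 0, 1, 2, 4.
table = [None] * 7
for occupied in (0, 1, 4):
    table[occupied] = "taken"

first = insert(table, "a", 0)   # 3 of 7 slots full: succeeds in slot 2
second = insert(table, "b", 0)  # 4 of 7 slots full: fails, yet 3, 5, 6 are free
```

Here `first` is 2 and `second` is `None`: once more than half of the slots are occupied, the insertion can fail even though three slots remain empty.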
Alternating signs
If the sign of the offset is alternated (e.g. +1, −4, +9, −16, etc.), and if the number of buckets is a prime number p congruent to 3 modulo 4 (e.g. 3, 7, 11, 19, 23, 31, etc.), then the first p offsets will be unique (modulo p). In other words, a permutation of 0 through p − 1 is obtained, and, consequently, a free bucket will always be found as long as at least one exists.
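This property is easy to check for small tables. The sketch below assumes the offsets are 0, +1, −4, +9, −16, ... taken modulo the number of buckets p. The name `alternating_offsets` is illustrative.

```python
def alternating_offsets(p):
    """Return the first p offsets 0, +1, -4, +9, -16, ... reduced modulo p.

    Python's % operator returns a non-negative result for a positive
    modulus, so negative offsets wrap around to valid bucket indices.
    """
    return [((-1) ** (i + 1) * i * i) % p for i in range(p)]
```

For p = 7 this gives `[0, 1, 3, 2, 5, 4, 6]`, a permutation of 0 through 6. For p = 13, which is congruent to 1 modulo 4, the offsets repeat (for instance +25 and −144 both reduce to 12), so the guarantee does not carry over.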
References
- Hopgood, F. Robert A.; Davenport, James H. (November 1972). "The Quadratic Hash Method when the table size is a power of 2". Computer Journal. 15 (4): 314–5. doi:10.1093/comjnl/15.4.314. Retrieved 2020-02-07.
- Weiss, Mark Allen (2009). "§5.4.2 Quadratic probing". Data Structures and Algorithm Analysis in C++. Pearson Education. ISBN 978-81-317-1474-4.