Linear Probing, Quadratic Probing and Double Hashing

The three terms that make up the title of this article are the three common techniques used for computing hash sequences. That is what are are going to cover today

  1. Linear Probing
  2. Quadratic Probing
  3. Double Hashing

Introduction to Hashing

1. Linear Probing

We start with a normal has function h that maps the universe of keys U into slots in the hash table T such that

h’ : U → {0, 1, 2, . . . , m-1}

h’ is a normal hash function which we would call the auxiliary hash function.  Now if we use linear probing, we would have a hash function like this:

h(k, i) = (h'(k) + i ) mod m

for m = 0, 1, 2, . . .,m-1

Given a particular key k, the first step is to examine T[h'(k)]  which is the  slot given by the auxiliary hash function. Next, we examine slot T[h'(k) +1], then we examine T[h'(k) +2] and so on up to the last slot which is T[m-1]. The wrap around back to the beginning of the table to position T[0], T[1] and so on until T[h'(k) -1].  This is presented in the table below:

  • T[h'(k)]
  • T[h'(k) + 1]
  • T[h'(k) + 2]
  • T[m-1]
  • T[0]
  • T[1]
  • T[h'(k) – 1]

It is therefore guaranteed that there will be m distinct probe sequences since the initial probe determines the whole probe sequence.

Linear Probing have the advantage of being easy to implement but has one draw back. And that is a problem known as primary clustering. This is a situation where long runs of positions build up. This in turn leads to increased average search time.

 

2. Quadratic Probing

Quadratic Probing is similar to linear probing but in quadratic probing the hash function used is of the form:

h(k, i) = (h'(k) +  c1i  + c2i2) mod m

where h’ is the auxiliary hash function and c1 and c2 are called positive auxiliary constants.

i = 0, 1, 2, . . . , m-1

Just as with linear probing, the initial probe position is T[h'(k)]; Subsequent positions probed are not linear but are offset by  an amount that depends on the quadratic nature of the probe number i.

The method of quadratic probing is found to be better than linear probing. However, to ensure that the full hash table is covered, the values of c1, and c2 are constrained. It may happen that two keys produce the same probe sequence such that:

h(k1, i) = h(k2, i)

Therefore, this leads to a kind of clustering called secondary clustering. Just as in linear probing, the initial probe position determines the entire probe sequence.

 

3. Double Hashing

Double Hashing is considered to be the best method of hashing for open addressing compared to linear and quadratic probing. In this case, two auxiliary functions h1 and h2 are used.  Hash function for double hashing take the form:

h(k, i) = (h1(k) + ih2(k)) mod m

h1 and h2 are the auxiliary functions.

Just like before, the initial probe position is T[h1(k)]. Subsequent hash positions are offset from previous position by the amount h2(k) modulo m. So we can see that the probe sequence depends on the key in two ways. This is because the initial probe position, the offset or both may vary.

However, value of h2(k) need to be relatively prime. for the whole hash table to be searched. One way to achieve this is to let m be a power of 2 and then to design h2 such that it will always produce and odd number. Another ways is to let me be a prime number and the to design h2 to always returns a positive integer that is less than m.

Take and example

h1(k) = k mod m

h2(k) = 1 + (k mod m’)

where m’ is a little less than m.

Finally, we would not take some time to do and Analysis of Open Addressing with Linear Probing in the next lesson.

Share this with friends