A Hash Table is an arraybased data structure that offers fast insertion and searching of elements. It organizes data so that you can uniquely identify a specific object from a group of similar objects. It is limited in size because it is based on array.
For example, in schools, each student is assigned a unique roll number that can be used to retrieve information about them. So, here the students are hashed to a unique number.
If you need only the basic set operations like add, search, delete, you can implement Hash Table data structure. It uses hash function to convert the key into the index of an array element, where the associated value is to be sought.
So, key values are assigned to elements in a Hash Table using a Hash Function.
A Hash Function helps calculate the best index an item should go in. The index must be small enough for the array size so that it stays within the constraints of the array size. It canâ€™t overwrite other data in the Hash Table.
A Hash Functions basic job is to store values in an array with a limited size. It does it in a way that the array doesnâ€™t need to be searched through whenever the user decides to go and try to find that information in the Hash Table. This allows you to enter values in any order and be able to find them using a calculation instead of searching through the array. That is why it is so fast.
So, Hash Function is a very important part of hash table design and provides uniform distribution of hash values.
So, the information in a hash table has two main components key and value.
For example, let us take the key as my name and value as my phone number,
Key: Abhinab
Value: Phone#
So, we are going to map the key to the value. At the heart of the Hash Table, we are going to have an array like the one stated above.
Now, we are going to write a Hash Function. What this Hash Function will do is, it will look at certain key and will evaluate the key. Then, it will spit out some sort of index number and tell us what location in the array will store the information.
Hash(Key) > index
Hash(Abhinab) > 3
Here, the Hash Function evaluated the key(my name) and stored the information in index 3. So, every time you enter the same key, it is going to spit out the same index number.
If we represent keys to their values in terms of symbols, we get something like this
Here, you can see that Ashu key has the same index number as Abhinab key. So, there is a collision here. I have created a link here to put the Ashu key in index 3. This is called chaining. So, if we get collision where many keys get the same index number, then we can represent it by using chaining.
Collisions are practically unavoidable. Due to collisions, keys are stored also stored in the table and one can distinguish between keyvalue pairs having the same hash. There are various ways to resolve collision. So, there are two different strategies:
For a good hashing mechanism, you should have a good hash function with the following basic requirements:
For example, assume that you have to store strings in the hash table by using hashing techniques {â€œabcdeâ€?, â€œbcdefaâ€?, â€œcdefabâ€?, â€œdefabcâ€?}.
Let us take a hash function that states the following:
The index for a specified string will be equal to the sum of the ASCII values of the characters modulo 599.
The ASCII values of a, b, c, d, e, and f are 97, 98, 99, 100, 101, and 102 respectively. Since all the strings contain the same characters with different permutations, the sum will 599.
As the index of all the strings is the same, we can create a list on that index and insert all the strings in that list.
Now, let us take a different hash function that states:
The index for a specific string will be equal to sum of ASCII values of characters multiplied by their respective order in the string after which it is modulo with 2069.
String Hash Function Index
abcdef (971+982+1004+1015+1026)%2069 38
bcdefa (981+992+1003+1014+1025+976)%2069 23
cdefab (991+1002+1013+1024+975+986)%2069 14
defabc (1001+1012+1023+974+985+996)%2069 11
Here, all the strings are stored at different indices.
Under certain assumptions, the average time required to search for an element in a hash table is O(1).
Let us take an example of a string S. We need to count the frequency of all the characters in the string.
String S = â€œababcdâ€?
We can achieve this by iterating over all the possible characters and counting their frequency one by one. The time complexity of this approach is O(26*N) where N is the size of the string.
for(char c = â€˜aâ€™;c <= â€˜zâ€™;++c)
{
int frequency = 0;
for(int i = 0;i < S.length();++i)
if(S[i] == c)
frequency++;
System.out.println(c + â€˜ â€™+ endl);
}
Output:
a 2
b 2
c 1
d 1
e 0
f 0
...
z 0
Let us apply hashing to this problem. Take an array frequency of size 26 and hash the 26 characters with indices of the array using the hash function. Then, iterate over the string and increase the value in the frequency at the corresponding index for each other. The complexity of the approach is O(N) where N is the size of the string.
int Frequency[26];
int hashFunc(char c)
{
return (c  â€˜aâ€™);
}
public class HashTable{
public void countFre(string S)
{
for(int i = 0;i < S.length();++i)
{
int index = hashFunc(S[i]);
Frequency[index]++;
}
for(int i = 0;i < 26;++i)
System.out.println((char)(i+â€™aâ€™) + â€˜â€™ + Frequency[i] + endl);
}
}
Output:
a 2
b 2
c 1
d 1
e 0
f 0
...
z 0
Index = hashFun(char)
If all the keys caused hash collisions, we will be at risk to walk through all of the values for a single lookup. We will have one big linked list. This is the worst case.
As the number of keys and values in a hash map exceeds the number of indices, hash collisions are bound to take place. To avoid this, we can expand the array whenever things start to get crowded.
That requires allocating a larger array and rehashing all of the existing keys to figure out their new position O(n) time.
Myself KarthiQ, I am the author of this blog, I know ways to write a good article but some how I donot have the skills to make it to reach people, would you like help me to reach more people By sharing this Article in the social media.
