written 2.1 years ago by |
Solution:
Hash tables are used to implement maps and set data structures in the most common programming languages.
Hashing is a technique that is used to uniquely identify a specific object from a group of similar objects.
The idea of hashing is to distribute entries (key/value pairs) uniformly across an array. Each element is assigned a key (converted key).
By using that key you can access the element in O (1) time. Using the key, the algorithm (hash function) computes an index that suggests where an entry can be found or inserted.
Hash table:
A hash table is a data structure that is used to store keys/value pairs. It uses a hash function to compute an index into an array in which an element will be inserted or searched.
By using a good hash function, hashing can work well. Under reasonable assumptions, the average time required to search for an element in a hash table is O (1).
Let’s start with a somewhat simplified example: a data structure that can store up to 1000 records with random integer keys.
To distribute the data evenly, we use several shortlists.
All records with keys that end with 000 belong to one list, those with keys that end with 001 belong to another one, and so on.
There is a total of 1000 such lists. This structure can be represented as an array of lists:
where LinkedList denotes a linked list of key-value pairs. Inserting a new record (key, value) is a two-step procedure:
we extract the three last digits of the key, hash = key % 1000, and then insert the key and its value into the list located in table[hash].
This is a constant time operation. A lookup is implemented by,
Since the keys are random, there will be roughly the same number of records in each list.
Since there are 1000 lists and at most 1000 records, there will likely be very few records in the list table[key%1000] and therefore the lookup operation will be fast.