[Home]Hash table

HomePage | Recent Changes | Preferences

Difference (from prior major revision) (minor diff, author diff)

Changed: 1c1
The hash table, also known as the associative array, is a convenient data structure to hold key => value association?s.
The hash table is a term in computer science referring to a data structure that implements an [associative array]?. Like any associative array a hash table is used to store many key => value associations?.

Changed: 3c3
A hash function takes arbitrary data as input and deterministic?ally returns an integer. This integer is then the index into the table?, which typically takes the form of an array. Whilst keys are most commonly strings, there is no requirement that they be so.
A hash table maintains two arrays, one for keys, one for values (or possibly one array of (key, value) pairs - it doesn't really matter). The elements of these arrays are referred to as buckets?. When required to find the associated value for a given key, the key is fed through a hash function to yield an integer (called the [hash value]?). This integer is then the index to the associated value.

Changed: 5c5
Hash tables offer O(1) (see Big O and Complexity theory) insertions, lookups, and deletions, with the caveat that collisions need to be dealt with. Techniques for dealing with collisions include probing, chaining, and rehashing. Note that probing and chaning are generally mutually exclusive.
Hash tables offer insertions, lookups, and deletions in O(1) amortized time, with the caveat that hash collisions need to be dealt with. A collision is when two different keys hash to the same integer (they have the same hash value). Techniques for dealing with collisions include probing, chaining, and rehashing. Note that probing and chaining are generally mutually exclusive.

Changed: 7c7
Probing is, upon finding a collision, moving to the next free bucket? in the table. Chaining involves using each bucket as a pointer to another structure, such as an array, a linked list, or even another hash (preferably with a different size and/or hashing function). table?, which typically takes the form of an array. Whilst keys are most commonly strings, there is no requirement that they be so.
Probing is, upon finding a collision, moving to the next free bucket in the table (or incrementing by some number other than 1). Chaining involves using each bucket as a pointer to another structure, such as an array, a linked list, or even another hash (preferably with a different size and/or hashing function).

Changed: 9c9
Both probing and chaining can be problematic because, as more and more elements are added to the hash, the O(1) property is lost! Rehashing is a technique to deal with this: by increasing the size of the table and recomputing the hash values with respect to the new table size, the O(1) property can be maintained.
Both probing and chaining can be problematic because, as more and more elements are added to the hash, the O(1) property is lost! Rehashing is a technique to minimize this effect: by increasing the size of the table and recomputing the hash values with respect to the new table size, the O(1) property can be maintained. Hash tables work faster the fewer occupied entries they have because there will be fewer collisions. If the ratio of occupied entries to total entries is kept below some fixed constant then the performance of hash tables will be reasonable.

Added: 11a12,13

/Talk

The hash table is a term in computer science referring to a data structure that implements an [associative array]?. Like any associative array a hash table is used to store many key => value associations?.

A hash table maintains two arrays, one for keys, one for values (or possibly one array of (key, value) pairs - it doesn't really matter). The elements of these arrays are referred to as buckets?. When required to find the associated value for a given key, the key is fed through a hash function to yield an integer (called the [hash value]?). This integer is then the index to the associated value.

Hash tables offer insertions, lookups, and deletions in O(1) amortized time, with the caveat that hash collisions need to be dealt with. A collision is when two different keys hash to the same integer (they have the same hash value). Techniques for dealing with collisions include probing, chaining, and rehashing. Note that probing and chaining are generally mutually exclusive.

Probing is, upon finding a collision, moving to the next free bucket in the table (or incrementing by some number other than 1). Chaining involves using each bucket as a pointer to another structure, such as an array, a linked list, or even another hash (preferably with a different size and/or hashing function).

Both probing and chaining can be problematic because, as more and more elements are added to the hash, the O(1) property is lost! Rehashing is a technique to minimize this effect: by increasing the size of the table and recomputing the hash values with respect to the new table size, the O(1) property can be maintained. Hash tables work faster the fewer occupied entries they have because there will be fewer collisions. If the ratio of occupied entries to total entries is kept below some fixed constant then the performance of hash tables will be reasonable.

Widely useful, hash tables are found in a wide variety of programs.

/Talk


HomePage | Recent Changes | Preferences
This page is read-only | View other revisions
Last edited December 12, 2001 7:34 am by Taw (diff)
Search: