1.0 Problem Description
Implement data structure: hash table.
2.0 Problem Analysis
- A hash table can be thought of as a dictionary (swift) or object (JS) that we often use, and can be matched to a key&value pair one by one, which can quickly find values based on keys.
- The hash table is implemented internally using arrays. We need to calculate the number corresponding to one and one of the keys of any type. The number size is between 0 and array size.
- The process of calculating a unique number by key is called a hash function. The same key generates the same number each time using the hash function.
- Since different keys may produce the same number, this situation is called hash conflict. Since conflict is inevitable, we have many ways to deal with it.
- Methods for handling conflicts include:
- Open addressing: If the numbers are the same, increment the numbers by 1 and see if there is a value in the array until a space is found
- Linked list: Each position in an array has a linked list that stores conflicting values
- Hash again: Call the hash function several times until there is no conflict
- Public overflow area: Puts conflicting values in another array. If a conflict occurs, the new array is searched.
- The most widely used method is open addressing.
- To improve performance, we need to choose a better hash function, and we need to keep the fill factor low. The fill factor = the number of stored/the total number of current arrays. We need to do two things.
- So there’s less conflict in the hash function
- The fill factor is greater than 0.75, which expands the array to twice its current size
3.0 Code Implementation
3.1 Swift Implementation
/* * The protocol that the hash table Key complies with * is used to generate the hash value, that is, the hash function */
public protocol HashedKey {
var hashValue: UInt32 { get}}/* * The inner element of the hash table represents */
fileprivate struct HashElement<Key.Value>{
var key: Key;
var value: Value;
}
/* * hash table */
public class Hash<Key.Value> where Key: HashedKey.Key: Equatable{
/// internal array
private var _storeArr:[HashElement<Key.Value>? ] ;/// The current number of elements
private var _storeCount = 0;
/// Minimum number of elements
private let MINSIZE = 1;
/* * constructor */
public init() {
_storeArr = Array<HashElement<Key.Value>? >.init(repeating: nil.count: MINSIZE);
}
/** * expands the size of the internal array to 2x */
private func _expand(a){
let oldArr = _storeArr;
_storeArr = Array<HashElement<Key.Value>? >.init(repeating: nil.count: _storeArr.count * 2);
_storeCount = 0;
for case let e? in oldArr {
set(k: e.key, v: e.value); }}/** * Insert Value * @param {Key} -k Insert Key * @param {Value} -v insert Value */
public func set(k: Key, v: Value){
if let i = self._index(k: k){ _storeArr[i]? .value = v;return;
}
let capacity = _storeArr.count;
let hashValue = k.hashValue;
var idx = Int(hashValue) % capacity;
while_storeArr[idx] ! =nil {
idx = (idx + 1) % capacity;
}
_storeArr[idx] = HashElement.init(key: k, value: v);
_storeCount += 1;
if _storeCount >= capacity * 3 / 4{ _expand(); }}@param {Key} -k pass Key * @return {Int? } - Returns array subscript, or nil */ if key does not exist
private func _index(k: Key) -> Int? {let capacity = _storeArr.count;
var idx = Int(k.hashValue) % capacity;
var count = 0;
while_storeArr[idx] ! =nil&& _storeArr[idx]! .key ! = k &&count < capacity {
idx = (idx + 1) % capacity;
count+ =1;
}
if_storeArr[idx] ! =nil && count < capacity {
return idx;
}else{
return nil; }}/**
* 获取数值
* @param {Key} - k 想要获取数值的key
* @return {Value?} - 返回的value
*/
public func get(k: Key) -> Value? {if let i = self._index(k: k){
return_storeArr[i]? .value; }else{
return nil; }}/** * Supports the following operations: * hash[key] = value * let v = hash[key] */
public subscript(k: Key) - >Value? {set{
if case let v? = newValue {
set(k: k, v: v)
}
}
get{
return get(k: k); }}}Copy the code
3.2 Using JS
function Hash(hashFunction) {
this._elements = new Array(1);
this._elementsCount = 0;
this._hashFunc = hashFunction;
}
Hash.prototype._expand = function(){
let capacity = this._elements.length;
let oldElements = this._elements;
this._elements = new Array(capacity * 2);
oldElements.forEach(element= > {
this.set(element.key, element.value);
});
}
Hash.prototype.set = function(k, v){
let index = this._index(k);
if(index ! =undefined && this._elements[index] ! = v){this._elements[index] = v;
return;
}
let capacity = this._elements.length;
let hashValue = this._hashFunc(k) % capacity;
while (this._elements[hashValue]) {
hashValue = (hashValue + 1) % capacity;
}
this._elements[hashValue] = {key:k, value:v};
this._elementsCount++;
if(this._elementsCount >= capacity * 3 / 4) {this._expand();
}
}
Hash.prototype._index = function(k){
let capacity = this._elements.length;
let hashValue = this._hashFunc(k) % capacity;
let count = 0;
while(this._elements[hashValue] && this._elements[hashValue].key ! = k && count < capacity){ hashValue = (hashValue +1) % capacity;
count++;
}
if(this._elements[hashValue] && count < capacity){
return hashValue;
}
}
Hash.prototype.get = function(k){
let index = this._index(k);
if(index ! =undefined) {
return this._elements[index]; }}Copy the code
4.0 Complexity Analysis
- Insert value
- We compute an array index using a hash function, just once.
- Subscripts can conflict, and the number of conflicts is constant. We do that by having a good hash function and a low fill factor.
- When the fill factor reaches the upper limit, you need to create a new array and copy all the values into the new array. This step takes order n time, but if we amortize this step over each of the previous sets, we’ll see that it’s the equivalent of each set times 2, which is constant.
- To sum up, the time complexity of hash table insertion is O(1).
- Get the value
- The process of obtaining the value is similar to that of inserting, but also evaluates the hash function once.
- If there is a conflict, proceed according to the rules. From the above analysis, we know that this step is also a constant level of time complexity.
- To sum up, the time complexity of retrieving the value of the hash table is also O(1).