This article is a translation of Swift Algorithm Club. Swift Algorithm Club is an open source project produced by Raywenderlich.com to realize algorithms and data structures by Swift. Currently, there are 18000+⭐️ on GitHub. I have made a brief statistics, and there are about 100 algorithms and data structures. It’s basically a good resource for iOSer learning algorithms and data structures. 🐙andyRon/ swift-algorithm-club-CN is my project of learning and translating for Swift Algorithm Club. Due to the limited capacity, if you find any errors or improper translation, please correct them. Welcome to Pull Request. Also welcome interested, have time partners to participate in translation and learning 🤓. Of course, we welcome ⭐️, 🤩🤩 ️ ⭐ ⭐. A translation of this article and the code can be found at 🐙swift-algorithm-club-cn/Hash Set


Hash Set

A set is a collection of elements, somewhat like an array but with two important differences: the order of the elements in the set does not matter, and each element can only appear once.

If the following were arrays, they would all be different. However, they all represent the same set:

[1 ,2.3]
[2.1.3]
[3.2.1]
[1.2.2.3.1]
Copy the code

Because each element can only appear once, it doesn’t matter how many times the element is written — only one of them is valid.

** Note: ** I often prefer to use collections on arrays when I have a set of objects but don’t care about their order. Using collections to communicate with programmers, the order of elements is not important. If you’re using arrays, then you can’t assume the same thing.

Typical collection operations:

  • Insert elements
  • Remove elements
  • Checks if the collection contains elements
  • To combine with another
  • Cross with another group
  • Calculate the difference with the other group

Union, intersection, and difference sets are ways of combining two sets into one set:

Starting with Swift 1.2, the library includes a built-in Set type, but here I’ll show you how to make your own. You won’t use it in production code, but it’s good to know how to implement collections.

It is possible to implement collections using simple arrays, but this is not the most efficient approach. Instead, we’ll use a dictionary. Since Swift’s dictionary is built using hash tables, our own collection will be a hash set.

code

Here is the beginning of a HashSet in Swift:

public struct HashSet<T: Hashable> {
    fileprivate var dictionary = Dictionary<T.Bool> ()public init() {}public mutating func insert(_ element: T) {
        dictionary[element] = true
    }

    public mutating func remove(_ element: T) {
        dictionary[element] = nil
    }

    public func contains(_ element: T) -> Bool {
        returndictionary[element] ! =nil
    }

    public func allElements(a)- > [T] {
        return Array(dictionary.keys)
    }

    public var count: Int {
        return dictionary.count
    }

    public var isEmpty: Bool {
        return dictionary.isEmpty
    }
}
Copy the code

The code is pretty simple because we rely on Swift’s built-in Dictionary to do all the hard work. The reason we use dictionaries is that dictionary keys must be unique, just like elements in a collection. In addition, dictionaries have **O(1)** time complexity in most of their operations, making the collection implementation very fast.

Since we are using a dictionary, the generic type T must conform to Hashable. You can put any type of object into our collection as long as it can be hashed. (The same is true for Swift’s own Set.)

Typically, you use dictionaries to associate keys with values, but for a collection, we only care about keys. That’s why we use Bool as the dictionary value type, even though we only set it to true, not false. (We could have chosen anything, but Booleans took up the least space.)

Copy the code to the playground and add some tests:

var set = HashSet<String> ()set.insert("one")
set.insert("two")
set.insert("three")
set.allElements()      // ["one, "three", "two"]

set.insert("two")
set.allElements()      // still ["one, "three", "two"]

set.contains("one")    // true
set.remove("one")
set.contains("one")    // false
Copy the code

The allElements() function converts the contents of the collection to an array. Note that the order of the elements in this array may be different from the order in which items were added. As I said, a set doesn’t care about the order of elements (nor is it a dictionary).

Merge collection

A lot of the use of collections is how to combine them. (If you’ve ever used a vector drawing program like Sketch or Illustrator, you’ll see the options Union, Subtract, Intersect to combine shapes. Same thing here.)

Here’s the code for the union operation:

extension HashSet {
    public func union(_ otherSet: HashSet<T>) -> HashSet<T> {
        var combined = HashSet<T> ()for obj in self.dictionary.keys {
            combined.insert(obj)
        }
        for obj in otherSet.dictionary.keys {
            combined.insert(obj)
        }
        return combined
    }
}
Copy the code

The union of the two sets creates A new set consisting of all the elements in set A plus all the elements in set B. Of course, if there are duplicate elements, they are evaluated only once.

Example:

var setA = HashSet<Int>()
setA.insert(1)
setA.insert(2)
setA.insert(3)
setA.insert(4)

var setB = HashSet<Int>()
setB.insert(3)
setB.insert(4)
setB.insert(5)
setB.insert(6)

let union = setA.union(setB)
union.allElements()           // [5, 6, 2, 3, 1, 4]
Copy the code

As you can see, the union of the two collections now contains all elements. The values 3 and 4 still occur only once, even though they are in both groups.

The intersection of two collections contains only the elements they share. Here’s the code:

extension HashSet {
    public func intersect(_ otherSet: HashSet<T>) -> HashSet<T> {
        var common = HashSet<T> ()for obj in dictionary.keys {
            if otherSet.contains(obj) {
                common.insert(obj)
            }
        }
        return common
    }
}
Copy the code

Testing:

let intersection = setA.intersect(setB)
intersection.allElements()
Copy the code

This prints [3, 4] because those are the only objects in set A that are also set B.

Finally, the difference between the two groups removes the elements they have in common. The code is as follows:

extension HashSet {
    public func difference(_ otherSet: HashSet<T>) -> HashSet<T> {
        var diff = HashSet<T> ()for obj in dictionary.keys {
            if! otherSet.contains(obj) {
                diff.insert(obj)
            }
        }
        return diff
    }
}
Copy the code

It’s actually the opposite of intersect(). Give it a try:

let difference1 = setA.difference(setB)
difference1.allElements()                / / (2, 1)

let difference2 = setB.difference(setA)
difference2.allElements()                / / [5, 6]
Copy the code

Where to go from here?

If you look at the documentation for Swift’s own Set, you’ll see that it has more functionality. One obvious extension is to make HashSet conform to SequenceType, so you can use for… The in loop iterates over it.

Another thing you can do is replace the Dictionary with an actual hash table, but store only the keys and don’t associate them with anything. So you don’t need a Bool anymore.

If you often need to find out whether elements belong to a collection and perform a union, then a concurrent lookup of a collection data structure might be more appropriate. It uses a tree structure rather than a dictionary to make lookup and union operations very efficient.

** NOTE: ** I want HashSet to conform to ArrayLiteralConvertible, so you can write let setA: HashSet

= [1, 2, 3, 4] but for now this will crash the compiler.

Translated by Andy Ron proofread by Andy Ron