Removing duplicate elements from an array in Swift

遥遥无期 2020-11-22 00:07

I might have an array that looks like the following:

[1, 4, 2, 2, 6, 24, 15, 2, 60, 15, 6]

Or, reall

30 answers
  • 2020-11-22 00:39

    The easiest way would be to use NSOrderedSet, which stores unique elements and preserves their order. For example:

    import Foundation

    func removeDuplicates(from items: [Int]) -> [Int] {
        // NSOrderedSet keeps only the first occurrence of each element, in insertion order.
        let uniqueItems = NSOrderedSet(array: items)
        return (uniqueItems.array as? [Int]) ?? []
    }
    
    let arr = [1, 4, 2, 2, 6, 24, 15, 2, 60, 15, 6]
    removeDuplicates(from: arr) // [1, 4, 2, 6, 24, 15, 60]
    
  • 2020-11-22 00:41

    You can always use a Dictionary, because a Dictionary can only hold unique keys. For example:

    import Foundation

    let arrayOfDates: NSArray = ["15/04/01","15/04/01","15/04/02","15/04/02","15/04/03","15/04/03","15/04/03"]
    
    let datesOnlyDict = NSMutableDictionary()
    
    for case let date as String in arrayOfDates {
        // Setting an existing key overwrites its value, so each date ends up stored once.
        datesOnlyDict.setValue("foo", forKey: date)
    }
    
    let uniqueDatesArray = datesOnlyDict.allKeys as NSArray // e.g. ["15/04/01", "15/04/03", "15/04/02"]
    
    print(uniqueDatesArray.count)  // = 3
    

    As you can see, the resulting array will not always be in order. If you wish to sort the array, add this:

    let sortedArray = uniqueDatesArray
        .compactMap { $0 as? String }  // NSArray elements are Any, so cast back to String
        .sorted { $0 < $1 }
    
    print(sortedArray) // = ["15/04/01", "15/04/02", "15/04/03"]
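
    The same idea can be sketched with native Swift types instead of NSArray and NSMutableDictionary. This is only an illustration: the names dates, seen and uniqueDates are made up here, and it relies on the fact that assigning to an existing dictionary key overwrites the entry rather than adding a second one.

```swift
let dates = ["15/04/01", "15/04/01", "15/04/02", "15/04/02", "15/04/03", "15/04/03", "15/04/03"]

// Dictionary keys are unique, so inserting the same date twice leaves a single entry.
var seen: [String: Bool] = [:]
for date in dates {
    seen[date] = true
}

// Key order is unspecified, so sort to get a stable result.
let uniqueDates = seen.keys.sorted()
print(uniqueDates) // ["15/04/01", "15/04/02", "15/04/03"]
```

    Sorting only restores a meaningful order here because these date strings happen to sort chronologically; it does not recover the original order of the input.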
    

  • 2020-11-22 00:42

    If you put both extensions in your code, the faster Hashable version will be used when possible, and the Equatable version will be used as a fallback.

    public extension Sequence where Element: Hashable {
      /// The elements of the sequence, with duplicates removed.
      /// - Note: Has equivalent elements to `Set(self)`.
      @available(
      swift, deprecated: 5.4,
      message: "Doesn't compile without the constant in Swift 5.3."
      )
      var firstUniqueElements: [Element] {
        let getSelf: (Element) -> Element = \.self
        return firstUniqueElements(getSelf)
      }
    }
    
    public extension Sequence where Element: Equatable {
      /// The elements of the sequence, with duplicates removed.
      /// - Note: Has equivalent elements to `Set(self)`.
      @available(
      swift, deprecated: 5.4,
      message: "Doesn't compile without the constant in Swift 5.3."
      )
      var firstUniqueElements: [Element] {
        let getSelf: (Element) -> Element = \.self
        return firstUniqueElements(getSelf)
      }
    }
    
    public extension Sequence {
      /// The elements of the sequence, with "duplicates" removed,
      /// based on a closure.
      func firstUniqueElements<Hashable: Swift.Hashable>(
        _ getHashable: (Element) -> Hashable
      ) -> [Element] {
        var set: Set<Hashable> = []
        return filter { set.insert(getHashable($0)).inserted }
      }
    
      /// The elements of the sequence, with "duplicates" removed,
      /// based on a closure.
      func firstUniqueElements<Equatable: Swift.Equatable>(
        _ getEquatable: (Element) -> Equatable
      ) -> [Element] {
        reduce(into: []) { uniqueElements, element in
          if zip(
            uniqueElements.lazy.map(getEquatable),
            AnyIterator { [equatable = getEquatable(element)] in equatable }
          ).allSatisfy(!=) {
            uniqueElements.append(element)
          }
        }
      }
    }
    

    If order isn't important, then you can always just use the Set initializer, e.g. Set(sequence), when the elements are Hashable.
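
    To illustrate what deduplicating "based on a closure" means, here is a minimal standalone sketch. The User type, the sample data and the helper name firstUnique(by:) are hypothetical and not part of the answer's code; the helper follows the same Set-and-filter idea as the Hashable overload above.

```swift
// Hypothetical model type used only for illustration.
struct User {
    let id: Int
    let name: String
}

extension Sequence {
    /// Keeps the first element seen for each distinct key (hypothetical helper).
    func firstUnique<Key: Hashable>(by key: (Element) -> Key) -> [Element] {
        var seen = Set<Key>()
        // insert(_:) reports whether the key was new; only new keys pass the filter.
        return filter { seen.insert(key($0)).inserted }
    }
}

let users = [
    User(id: 1, name: "Ann"),
    User(id: 2, name: "Bob"),
    User(id: 1, name: "Ann (same id again)"),
]

let uniqueUsers = users.firstUnique { $0.id }
print(uniqueUsers.map { $0.name }) // ["Ann", "Bob"]
```

    Note that User itself is neither Hashable nor Equatable; only the key returned by the closure has to be.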

  • 2020-11-22 00:43

    One more Swift 3.0 solution to remove duplicates from an array. This solution improves on many other solutions already proposed by:

    • Preserving the order of the elements in the input array
    • Linear complexity O(n): single pass filter O(n) + set insertion O(1)

    Given the integer array:

    let numberArray = [10, 1, 2, 3, 2, 1, 15, 4, 5, 6, 7, 3, 2, 12, 2, 5, 5, 6, 10, 7, 8, 3, 3, 45, 5, 15, 6, 7, 8, 7]
    

    Functional code:

    func orderedSet<T: Hashable>(array: Array<T>) -> Array<T> {
        var unique = Set<T>()
        return array.filter { element in
            return unique.insert(element).inserted
        }
    }
    
    orderedSet(array: numberArray)  // [10, 1, 2, 3, 15, 4, 5, 6, 7, 12, 8, 45]
    

    Array extension code:

    extension Array where Element:Hashable {
        var orderedSet: Array {
            var unique = Set<Element>()
            return filter { element in
                return unique.insert(element).inserted
            }
        }
    }
    
    numberArray.orderedSet // [10, 1, 2, 3, 15, 4, 5, 6, 7, 12, 8, 45]
    

    This code takes advantage of the result returned by the insert operation on Set, which runs in O(1) on average and returns a tuple indicating whether the item was inserted or already existed in the set.

    If the item was in the set, filter will exclude it from the final result.
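
    As a small sketch of that return value (the variable names here are only for illustration), inserting the same number twice looks like this:

```swift
var seen = Set<Int>()

// insert(_:) returns a tuple (inserted: Bool, memberAfterInsert: Element).
let first = seen.insert(2)   // 2 was not in the set yet
let second = seen.insert(2)  // 2 is already present, so nothing is added

print(first.inserted, second.inserted) // true false
print(seen.count)                      // 1
```

    The filter closure simply returns the inserted flag, so only the first occurrence of each element passes.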

  • 2020-11-22 00:43

    Think like a functional programmer :)

    To filter the list based on whether the element has already occurred, you need the index. You can use enumerated to get the index, and then map to get back to a list of values.

    let unique = myArray
        .enumerated()
        .filter{ myArray.firstIndex(of: $0.1) == $0.0 }
        .map{ $0.1 }
    

    This guarantees the order. If you don't care about the order, then the existing answer of Array(Set(myArray)) is simpler and probably more efficient.
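
    As a concrete sketch of the enumerated/filter/map pipeline, here it is applied to the array from the question (used as assumed sample input), with the tuple labels offset and element spelled out instead of $0.0 and $0.1:

```swift
let myArray = [1, 4, 2, 2, 6, 24, 15, 2, 60, 15, 6]

let unique = myArray
    .enumerated()
    // Keep an element only if this is the first position at which it appears.
    .filter { myArray.firstIndex(of: $0.element) == $0.offset }
    .map { $0.element }

print(unique) // [1, 4, 2, 6, 24, 15, 60]
```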


    UPDATE: Some notes on efficiency and correctness

    A few people have commented on the efficiency. I'm definitely in the school of writing correct and simple code first and then figuring out bottlenecks later, though I appreciate it's debatable whether this is clearer than Array(Set(array)).

    This method is a lot slower than Array(Set(array)), because firstIndex(of:) rescans the array for every element, which makes it O(n²) overall. As noted in the comments, it does preserve order and works on elements that aren't Hashable.

    However, @Alain T's method also preserves order and is a lot faster. So unless your element type is not Hashable, or you just need a quick one-liner, I'd suggest going with their solution.

    Here are a few tests on a MacBook Pro (2014) on Xcode 11.3.1 (Swift 5.1) in Release mode.

    The profiler function and two methods to compare:

    import Foundation  // provides CFAbsoluteTimeGetCurrent

    func printTimeElapsed(title: String, operation: () -> ()) {
        var totalTime = 0.0
        for _ in (0..<1000) {
            let startTime = CFAbsoluteTimeGetCurrent()
            operation()
            let timeElapsed = CFAbsoluteTimeGetCurrent() - startTime
            totalTime += timeElapsed
        }
        let meanTime = totalTime / 1000
        print("Mean time for \(title): \(meanTime) s")
    }
    
    func method1<T: Hashable>(_ array: Array<T>) -> Array<T> {
        return Array(Set(array))
    }
    
    func method2<T: Equatable>(_ array: Array<T>) -> Array<T>{
        return array
        .enumerated()
        .filter{ array.firstIndex(of: $0.1) == $0.0 }
        .map{ $0.1 }
    }
    
    // Alain T.'s answer (adapted)
    func method3<T: Hashable>(_ array: Array<T>) -> Array<T> {
        var uniqueKeys = Set<T>()
        return array.filter{uniqueKeys.insert($0).inserted}
    }
    

    And a small variety of test inputs:

    func randomString(_ length: Int) -> String {
      let letters = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789"
      return String((0..<length).map{ _ in letters.randomElement()! })
    }
    
    let shortIntList = (0..<100).map{_ in Int.random(in: 0..<100) }
    let longIntList = (0..<10000).map{_ in Int.random(in: 0..<10000) }
    let longIntListManyRepetitions = (0..<10000).map{_ in Int.random(in: 0..<100) }
    let longStringList = (0..<10000).map{_ in randomString(1000)}
    let longMegaStringList = (0..<10000).map{_ in randomString(10000)}
    

    Gives as output:

    Mean time for method1 on shortIntList: 2.7358531951904296e-06 s
    Mean time for method2 on shortIntList: 4.910230636596679e-06 s
    Mean time for method3 on shortIntList: 6.417632102966309e-06 s
    Mean time for method1 on longIntList: 0.0002518167495727539 s
    Mean time for method2 on longIntList: 0.021718120217323302 s
    Mean time for method3 on longIntList: 0.0005312927961349487 s
    Mean time for method1 on longIntListManyRepetitions: 0.00014377200603485108 s
    Mean time for method2 on longIntListManyRepetitions: 0.0007293639183044434 s
    Mean time for method3 on longIntListManyRepetitions: 0.0001843773126602173 s
    Mean time for method1 on longStringList: 0.007168249964714051 s
    Mean time for method2 on longStringList: 0.9114790915250778 s
    Mean time for method3 on longStringList: 0.015888616919517515 s
    Mean time for method1 on longMegaStringList: 0.0525397013425827 s
    Mean time for method2 on longMegaStringList: 1.111266262292862 s
    Mean time for method3 on longMegaStringList: 0.11214958941936493 s
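
    The answer does not show the calls that produced this output. A hypothetical, self-contained driver might look like the sketch below; the meanTime helper, the run count and the sample input are assumptions for illustration, and it uses Date rather than CFAbsoluteTimeGetCurrent for timing.

```swift
import Foundation

// Hypothetical helper: averages the wall-clock time of `operation` over `runs` runs.
func meanTime(runs: Int, _ operation: () -> Void) -> Double {
    var total = 0.0
    for _ in 0..<runs {
        let start = Date()
        operation()
        total += Date().timeIntervalSince(start)
    }
    return total / Double(runs)
}

let sample = (0..<100).map { _ in Int.random(in: 0..<100) }

// Discard the result with `_ =` so only the call itself is measured.
let setTime = meanTime(runs: 10) { _ = Array(Set(sample)) }
print("Mean time for Array(Set(...)) on sample: \(setTime) s")
```

    Timings from a sketch like this vary by machine and build configuration, so they are not comparable to the figures above.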
    
  • 2020-11-22 00:47

    Swift 4

    Guaranteed to keep ordering.

    extension Array where Element: Equatable {
        func removingDuplicates() -> Array {
            return reduce(into: []) { result, element in
                if !result.contains(element) {
                    result.append(element)
                }
            }
        }
    }
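
    Because this approach only needs Equatable, it also works for types that are not Hashable. The sketch below uses a hypothetical Point type and restates the same reduce(into:) logic as a free function so that it compiles on its own. Since contains scans the result for every element, expect O(n²) time on large inputs.

```swift
// Hypothetical type: Equatable, but deliberately not Hashable.
struct Point: Equatable {
    let x: Int
    let y: Int
}

// Standalone variant of the reduce(into:) approach, for illustration only.
func removingDuplicates<T: Equatable>(_ items: [T]) -> [T] {
    return items.reduce(into: [T]()) { result, element in
        // Linear scan: append only if an equal element is not already kept.
        if !result.contains(element) {
            result.append(element)
        }
    }
}

let points = [Point(x: 0, y: 0), Point(x: 1, y: 2), Point(x: 0, y: 0)]
let uniquePoints = removingDuplicates(points)
print(uniquePoints.count) // 2
```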
    