Can anybody optimize the following statement in Scala?
// maybe large
val someArray = Array(9, 1, 6, 2, 1, 9, 4, 5, 1, 6, 5, 0, 6)
// output a sorted list whic
How about adding everything to a sorted set?
val a = scala.collection.immutable.SortedSet(someArray filter (0 !=): _*)
Of course, you should benchmark the code to check what is faster, and, more importantly, that this is truly a hot spot.
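As a rough illustration of what such a check could look like, here is a minimal timing sketch. The helper `bestOfNanos` and the object name `TimingSketch` are made up for this example, and a naive loop like this ignores JIT warm-up and GC noise, so treat the numbers as a hint only; a proper benchmark harness is more trustworthy.

```scala
import scala.collection.immutable.SortedSet

object TimingSketch {
  // Hypothetical helper: evaluates `body` `runs` times and returns the
  // best wall-clock time in nanoseconds.
  def bestOfNanos[A](runs: Int)(body: => A): Long = {
    var best = Long.MaxValue
    var i = 0
    while (i < runs) {
      val start = System.nanoTime()
      body
      val elapsed = System.nanoTime() - start
      if (elapsed < best) best = elapsed
      i += 1
    }
    best
  }

  // Candidate 1: let a sorted set do the de-duplication and the ordering.
  def viaSortedSet(xs: Array[Int]): List[Int] =
    (SortedSet.empty[Int] ++ xs.filter(_ != 0)).toList

  // Candidate 2: plain list operations, ascending order.
  def viaSortWith(xs: Array[Int]): List[Int] =
    xs.toList.filter(_ != 0).distinct.sortWith(_ < _)
}
```

Calling `TimingSketch.bestOfNanos(20)(TimingSketch.viaSortedSet(someArray))` and the same for `viaSortWith` on a realistically sized array gives a first impression of which variant is worth a closer look.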
For efficiency, depending on your value of large:
val a = someArray.toSet.filter(_>0).toArray
java.util.Arrays.sort(a) // quicksort, mutable data structures bad :-)
res15: Array[Int] = Array(1, 2, 4, 5, 6, 9)
Note that this does the sort using qsort on an unboxed array.
I'm not in a position to measure, but some more suggestions...
Sorting the array in place before converting to a list might well be more efficient, and you might look at removing duplicates from the sorted list manually, as they will be grouped together. The cost of removing zeros before or after the sort will also depend on their ratio to the other entries.
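One way this idea could look, as an unmeasured sketch: sort a primitive array, then compact it in a single pass, skipping zeros and any value equal to the last one kept. The names `SortedDedup` and `distinctNonZeroSorted` are invented for the example, and the defensive copy is an assumption; sorting the input directly would avoid it.

```scala
object SortedDedup {
  // Returns the distinct non-zero values of `xs` in ascending order.
  // Works on a copy so the caller's array is left untouched
  // (an assumption of this sketch, not a requirement of the technique).
  def distinctNonZeroSorted(xs: Array[Int]): Array[Int] = {
    val a = xs.clone()
    java.util.Arrays.sort(a) // primitive sort, no boxing
    var write = 0            // next free slot for a kept value
    var i = 0
    while (i < a.length) {
      val v = a(i)
      // Keep v if it is non-zero and differs from the last kept value.
      // Duplicates are adjacent after sorting, so one comparison suffices.
      if (v != 0 && (write == 0 || a(write - 1) != v)) {
        a(write) = v
        write += 1
      }
      i += 1
    }
    java.util.Arrays.copyOf(a, write)
  }
}
```

Because `write` never overtakes `i`, the compaction can reuse the sorted array without overwriting values that have not been read yet.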
I haven't measured, but I'm with Duncan: sort in place, then use something like:
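// sort the primitive array in place, then drop adjacent duplicates
util.Sorting.quickSort(array)
array.foldRight(List.empty[Int]) {
  case (a, b) =>
    // b.head is the value added last; equal values are neighbours after sorting
    if (b.nonEmpty && b.head == a) b
    else a :: b
}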
In theory this should be pretty efficient.
Without benchmarking I can't be sure, but I imagine the following is pretty efficient:
val list = collection.SortedSet(someArray.filter(_>0) :_*).toList
Also try adding .par after someArray in your version. It's not guaranteed to be quicker, but it might be. You should run a benchmark and experiment.
sort is deprecated. Use .sortWith(_ > _) instead.
This simple line is one of the fastest solutions so far:
someArray.toList.filter (_ > 0).sortWith (_ > _).distinct
but the clear winner so far, according to my measurements, is Jed Wesley-Smith's code. That may change once Rex's code is fixed.
Typical disclaimer 1 + 2:
Here is the underlying benchcoat-code and the concrete code to produce the graph (gnuplot). Y-axis: time in seconds. X-axis: 100 000 to 1 000 000 elements in Array.
After finding the problem with Rex's code, his code is as fast as Jed's code, but the last operation is a transformation of his Array to a List (to fulfill my benchmark interface). Using a var result = List [Int], and result = someArray (i) :: result, speeds his code up, so that it is about twice as fast as Jed's code.
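Rex's code is not reproduced here, so the following is only a guess at the general shape of that change: walk a sorted primitive array and prepend to a `var` list instead of converting the array to a list at the end. The names `PrependSketch` and `distinctPositiveDescending` are hypothetical.

```scala
object PrependSketch {
  // Returns the distinct positive values of `xs` in descending order.
  def distinctPositiveDescending(xs: Array[Int]): List[Int] = {
    val a = xs.clone()        // copy so the input stays unchanged (sketch assumption)
    java.util.Arrays.sort(a)  // ascending, primitive sort
    var result = List.empty[Int]
    var i = 0
    while (i < a.length) {
      val v = a(i)
      // Prepending is O(1). Walking the ascending array front to back
      // therefore builds a descending list, and result.head is always
      // the previous kept value, which makes the duplicate check cheap.
      if (v > 0 && (result.isEmpty || result.head != v)) result = v :: result
      i += 1
    }
    result
  }
}
```

The descending result matches what `.sortWith(_ > _)` produces in the one-liner above; iterating from the last index down to 0 would give ascending order instead.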
Another, maybe interesting, finding: if I rearrange my code from filter/sort/distinct (fsd) into the other orders (dsf, dfs, ...), the 6 possible orderings don't differ significantly in run time.
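For anyone who wants to repeat that experiment, here is a small sketch that generates all six orderings and confirms they return the same list. It only checks results, not timings, and the names `OrderingSketch`, `Step`, and `allOrders` are made up for the example.

```scala
object OrderingSketch {
  type Step = List[Int] => List[Int]

  val f: Step = _.filter(_ > 0)    // drop zeros (and negatives)
  val s: Step = _.sortWith(_ > _)  // descending sort
  val d: Step = _.distinct         // drop duplicates

  // Applies every permutation of the three steps to `xs`
  // and returns one result per ordering.
  def allOrders(xs: List[Int]): List[List[Int]] =
    List(f, s, d).permutations
      .map(steps => steps.foldLeft(xs)((acc, step) => step(acc)))
      .toList
}
```

Each of the six results can then be timed separately to see whether the order matters for a given input size.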