Partition a collection into “k” close-to-equal pieces (Scala, but language agnostic)

南方客 2021-02-12 15:44

Defined before this block of code:

  • dataset can be a Vector or List
  • numberOfSlices is an Int
6 Answers
  •  名媛妹妹
    2021-02-12 16:04

    Here's a one-liner that does the job for me, using the familiar Scala trick of a recursive function that returns a Stream. Note the use of (x+k/2)/k, where x is the number of remaining elements, to round each chunk size to the nearest integer: this interleaves the smaller and larger chunks in the final list, and the chunk sizes differ by at most one element. If you round up instead, with (x+k-1)/k, the smaller chunks move to the end; rounding down with x/k moves them to the beginning.

    def k_folds(k: Int, vv: Seq[Int]): Stream[Seq[Int]] =
        if (k > 1)
            vv.take((vv.size+k/2)/k) +: k_folds(k-1, vv.drop((vv.size+k/2)/k))
        else
            Stream(vv)
    

    Demo:

    scala> val indices = scala.util.Random.shuffle(1 to 39)
    
    scala> for (ff <- k_folds(7, indices)) println(ff)
    Vector(29, 8, 24, 14, 22, 2)
    Vector(28, 36, 27, 7, 25, 4)
    Vector(6, 26, 17, 13, 23)
    Vector(3, 35, 34, 9, 37, 32)
    Vector(33, 20, 31, 11, 16)
    Vector(19, 30, 21, 39, 5, 15)
    Vector(1, 38, 18, 10, 12)
    
    scala> for (ff <- k_folds(7, indices)) println(ff.size)
    6
    6
    5
    6
    5
    6
    5
    
    scala> for (ff <- indices.grouped((indices.size+7-1)/7)) println(ff)
    Vector(29, 8, 24, 14, 22, 2)
    Vector(28, 36, 27, 7, 25, 4)
    Vector(6, 26, 17, 13, 23, 3)
    Vector(35, 34, 9, 37, 32, 33)
    Vector(20, 31, 11, 16, 19, 30)
    Vector(21, 39, 5, 15, 1, 38)
    Vector(18, 10, 12)
    
    scala> for (ff <- indices.grouped((indices.size+7-1)/7)) println(ff.size)
    6
    6
    6
    6
    6
    6
    3
    

    Notice how grouped does not try to even out the sizes of the sub-lists: it fills six chunks of 6 elements and leaves only the remaining 3 in the last one.
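
    To compare the three rounding choices side by side, here is a minimal sketch that uses the same recursion as k_folds but takes the rounding rule as a parameter. The name splitWith and its chunkSize parameter are hypothetical, introduced only for this illustration, and it returns a List rather than a Stream to keep the example short.

    ```scala
    // Hypothetical helper (not part of the answer above): same recursion as
    // k_folds, with the rounding rule passed in as (remaining, k) => chunk size.
    def splitWith[A](k: Int, xs: Seq[A])(chunkSize: (Int, Int) => Int): List[Seq[A]] =
      if (k > 1) {
        val n = chunkSize(xs.size, k) // size of the next chunk
        xs.take(n) :: splitWith(k - 1, xs.drop(n))(chunkSize)
      } else List(xs) // the last chunk takes whatever is left

    val data = 1 to 39

    // Round to nearest: smaller and larger chunks are interleaved.
    val nearest = splitWith(7, data)((n, k) => (n + k / 2) / k).map(_.size)
    // Round up: larger chunks come first, smaller ones at the end.
    val up = splitWith(7, data)((n, k) => (n + k - 1) / k).map(_.size)
    // Round down: smaller chunks come first, larger ones at the end.
    val down = splitWith(7, data)((n, k) => n / k).map(_.size)

    println(nearest) // List(6, 6, 5, 6, 5, 6, 5)
    println(up)      // List(6, 6, 6, 6, 5, 5, 5)
    println(down)    // List(5, 5, 5, 6, 6, 6, 6)
    ```

    In all three cases the sizes add up to 39 and differ by at most one; only the position of the smaller chunks changes.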
