No stream fusion with unsafeUpdate_ in unboxed vector

好久不见. 提交于 2019-12-23 11:46:16

问题


Is it possible to maintain stream fusion when processing a vector if unsafeUpdate_ function is used to update some elements of a vector? The answer seems to be no in the test I did. For the code below, temporary vector is generated in upd function, as confirmed in the core:

module Main where
import Data.Vector.Unboxed as U

upd :: Vector Int -> Vector Int
upd v = U.unsafeUpdate_ v (U.fromList [0]) (U.fromList [2])

sum :: Vector Int -> Int
sum = U.sum . upd

main = print $ Main.sum $ U.fromList [1..3]

In the core, $wupd function is used in sum - as seen below, it generates new bytearray:

$wupd :: Vector Int -> Vector Int
$wupd =
  \ (w :: Vector Int) ->
    case w `cast` ... of _ { Vector ipv ipv1 ipv2 ->
    case main11 `cast` ... of _ { Vector ipv3 ipv4 ipv5 ->
    case main7 `cast` ... of _ { Vector ipv6 ipv7 ipv8 ->
    runSTRep
      (\ (@ s) (s :: State# s) ->
         case >=# ipv1 0 of _ {
           False -> case main6 ipv1 of wild { };
           True ->
             case newByteArray# (*# ipv1 8) (s `cast` ...)
             of _ { (# ipv9, ipv10 #) ->
             case (copyByteArray# ipv2 (*# ipv 8) ipv10 0 (*# ipv1 8) ipv9)
                  `cast` ...

There is a nice, tight loop in the core for sum function but just before that loop, there is a call to $wupd function, and so, a temporary generation.

Is there a way to avoid temporary generation in the example here? The way I think about it, updating a vector in index i is the case of parsing a stream but only acting on the stream in index i (skipping the rest), and replacing the element there with another element. So, updating a vector in an arbitrary location shouldn't break stream fusion, right?


回答1:


I can't be 100% sure, because with vector it's turtles all the way down (you never really reach the actual implementation, there's always another indirection), but as far as I understand it, the update variants force a new temporary through cloning:

unsafeUpdate_ :: (Vector v a, Vector v Int) => v a -> v Int -> v a -> v a
{-# INLINE unsafeUpdate_ #-}
unsafeUpdate_ v is w
  = unsafeUpdate_stream v (Stream.zipWith (,) (stream is) (stream w))

unsafeUpdate_stream :: Vector v a => v a -> Stream (Int,a) -> v a
{-# INLINE unsafeUpdate_stream #-}
unsafeUpdate_stream = modifyWithStream M.unsafeUpdate

and modifyWithStream calls clone (and new),

modifyWithStream :: Vector v a
                 => (forall s. Mutable v s a -> Stream b -> ST s ())
                 -> v a -> Stream b -> v a
{-# INLINE modifyWithStream #-}
modifyWithStream p v s = new (New.modifyWithStream p (clone v) s)

new :: Vector v a => New v a -> v a
{-# INLINE_STREAM new #-}
new m = m `seq` runST (unsafeFreeze =<< New.run m)

-- | Convert a vector to an initialiser which, when run, produces a copy of
-- the vector.
clone :: Vector v a => v a -> New v a
{-# INLINE_STREAM clone #-}
clone v = v `seq` New.create (
  do
    mv <- M.new (length v)
    unsafeCopy mv v
    return mv)

and I see no way that vector would get rid of that unsafeCopy again.




回答2:


If you need to change one or very few elements, there are nice solutions in repa and yarr libraries. They preserve fusion (I'm not sure about repa) and Haskell-idiomatic.

Repa, using fromFunction:

upd arr = fromFunction (extent arr) ix
  where ix (Z .: 0) = 2
        ix i = index arr i

Yarr, using Delayed:

upd arr = Delayed (extent arr) (touchArray arr) (force arr) ix
  where ix 0 = return 2
        ix i = index arr i


来源:https://stackoverflow.com/questions/16995850/no-stream-fusion-with-unsafeupdate-in-unboxed-vector

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!