问题
Is it possible to maintain stream fusion when processing a vector
if unsafeUpdate_
function is used to update some elements of a vector
? The answer seems to be no in the test I did. For the code below, temporary vector is generated in upd
function, as confirmed in the core:
module Main where
import Data.Vector.Unboxed as U
upd :: Vector Int -> Vector Int
upd v = U.unsafeUpdate_ v (U.fromList [0]) (U.fromList [2])
sum :: Vector Int -> Int
sum = U.sum . upd
main = print $ Main.sum $ U.fromList [1..3]
In the core, $wupd
function is used in sum
- as seen below, it generates new bytearray
:
$wupd :: Vector Int -> Vector Int
$wupd =
\ (w :: Vector Int) ->
case w `cast` ... of _ { Vector ipv ipv1 ipv2 ->
case main11 `cast` ... of _ { Vector ipv3 ipv4 ipv5 ->
case main7 `cast` ... of _ { Vector ipv6 ipv7 ipv8 ->
runSTRep
(\ (@ s) (s :: State# s) ->
case >=# ipv1 0 of _ {
False -> case main6 ipv1 of wild { };
True ->
case newByteArray# (*# ipv1 8) (s `cast` ...)
of _ { (# ipv9, ipv10 #) ->
case (copyByteArray# ipv2 (*# ipv 8) ipv10 0 (*# ipv1 8) ipv9)
`cast` ...
There is a nice, tight loop in the core for sum
function but just before that loop, there is a call to $wupd
function, and so, a temporary generation.
Is there a way to avoid temporary generation in the example here? The way I think about it, updating a vector in index i is the case of parsing a stream but only acting on the stream in index i (skipping the rest), and replacing the element there with another element. So, updating a vector in an arbitrary location shouldn't break stream fusion, right?
回答1:
I can't be 100% sure, because with vector it's turtles all the way down (you never really reach the actual implementation, there's always another indirection), but as far as I understand it, the update
variants force a new temporary through cloning:
unsafeUpdate_ :: (Vector v a, Vector v Int) => v a -> v Int -> v a -> v a
{-# INLINE unsafeUpdate_ #-}
unsafeUpdate_ v is w
= unsafeUpdate_stream v (Stream.zipWith (,) (stream is) (stream w))
unsafeUpdate_stream :: Vector v a => v a -> Stream (Int,a) -> v a
{-# INLINE unsafeUpdate_stream #-}
unsafeUpdate_stream = modifyWithStream M.unsafeUpdate
and modifyWithStream
calls clone
(and new
),
modifyWithStream :: Vector v a
=> (forall s. Mutable v s a -> Stream b -> ST s ())
-> v a -> Stream b -> v a
{-# INLINE modifyWithStream #-}
modifyWithStream p v s = new (New.modifyWithStream p (clone v) s)
new :: Vector v a => New v a -> v a
{-# INLINE_STREAM new #-}
new m = m `seq` runST (unsafeFreeze =<< New.run m)
-- | Convert a vector to an initialiser which, when run, produces a copy of
-- the vector.
clone :: Vector v a => v a -> New v a
{-# INLINE_STREAM clone #-}
clone v = v `seq` New.create (
do
mv <- M.new (length v)
unsafeCopy mv v
return mv)
and I see no way that vector
would get rid of that unsafeCopy
again.
回答2:
If you need to change one or very few elements, there are nice solutions in repa
and yarr
libraries. They preserve fusion (I'm not sure about repa
) and Haskell-idiomatic.
Repa, using fromFunction:
upd arr = fromFunction (extent arr) ix
where ix (Z .: 0) = 2
ix i = index arr i
Yarr, using Delayed:
upd arr = Delayed (extent arr) (touchArray arr) (force arr) ix
where ix 0 = return 2
ix i = index arr i
来源:https://stackoverflow.com/questions/16995850/no-stream-fusion-with-unsafeupdate-in-unboxed-vector