How to derive FRP from Directed Acyclic Graphs?

后端 未结 1 1542
一个人的身影
一个人的身影 2021-02-05 15:06

I am currently researching for my next project. This is in a pre-planning phase so this question is just to get an overview on existing technology.

Setup

I hav

相关标签:
1条回答
  • 2021-02-05 16:07

    Problems like this are usually encoded against either an Applicative or Arrow abstraction. I'll only discuss Applicative. The Applicative type class, found in Control.Applicative, allows values and functions to be provided via pure and functions to be applied to values with <*>.

    class Functor f => Applicative f where
        -- | Lift a value.
        pure :: a -> f a
    
        -- | Sequential application.
        (<*>) :: f (a -> b) -> f a -> f b
    

    Your example graph could be very simply encoded for an Applicative (replacing each node with addition) as

    example1 :: (Applicative f, Num a) => f a -> f a -> f a -> f (a, a, a)
    example1 five seven three = 
        let
            eleven = pure (+) <*> five   <*> seven
            eight  = pure (+) <*> seven  <*> three
            two    = pure id  <*> eleven
            nine   = pure (+) <*> eleven <*> eight
            ten    = pure (+) <*> eleven <*> three
        in
            pure (,,) <*> two <*> nine <*> ten
    

    The same encoding can be created programmatically from the representation of a graph by traversing the graph such that each node will be visited after all of its dependencies.

    There are three optimizations you may desire that can't be implemented for a network encoded using only Applicative. The general strategy is to encode the problem in terms of Applicative and a few additional classes as needed for optimization or extra features. You then provide one or more interpreters that implement the necessary classes. This lets you separate the problem from the implementation, allowing you to write your own interpreter or use an existing library like reactive, reactive-banana, or mvc-updates. I am not going to discuss how to write these interpreters or adapt the representation given here to a specific interpreter. I'm only going to discuss the common representation of the program that's needed in order for the underlying interpreter to be able to exploit these optimizations. All three of the libraries I linked can avoid recomputing values and can accommodate the following optimizations.

    Observable Sharing

    In the previous example, the intermediate node eleven is only defined once, but is used in three different places. An implementation of Applicative won't be able to see through the referential transparency to see that these three elevens are all the same. You can either assume that the implementation uses some IO magic to peek through referential transparency or define the network so that an implementation can see that a computation is being reused.

    The following is the class of Applicative Functors where the result of a computation can be divided and reused in multiple computations. This class isn't defined in a library anywhere that I know of.

    class Applicative f => Divisible f where
        (</>) :: f a -> (f a -> f b) -> f b
    
    infixl 2 </>
    

    Your example can then be very simply encoded for a Divisible Functor as

    example2 :: (Divisible f, Num a) => f a -> f a -> f a -> f (a, a, a)
    example2 five seven three = 
        pure (+) <*> five   <*> seven </> \eleven ->
        pure (+) <*> seven  <*> three </> \eight  ->
        pure id  <*> eleven           </> \two    ->
        pure (+) <*> eleven <*> eight </> \nine   ->
        pure (+) <*> eleven <*> three </> \ten    ->
        pure (,,) <*> two <*> nine <*> ten
    

    Sums and Abelian Groups

    A typical neuron computes a weighted sum of its inputs and applies a response function to that sum. For a neuron with a large in degree, summing all of its inputs is time consuming. An easy optimization for updating a sum is to subtract the old value and add the new value. This exploits three properties of addition:

    Inverse - a * b * b⁻¹ = a Subtraction is the inverse of adding a number, this inverse allows us to remove the previously added number from the total

    Commutativity - a * b = b * a Addition and subtraction reach the same result regardless of the order they are performed in. This lets us reach the same result when we subtract the old value and add the new value to the total even if the old value wasn't the most recently added value.

    Associativity - a * (b * c) = (a * b) * c Addition and subtraction can be performed in arbitrary groupings and still reach the same result. This lets us reach the same result when we subtract the old value and add the new value to the total even it the old value was added somewhere in the middle of the additions.

    Any structure with these properties as well as closure and an identity is an Abelian group. The following dictionary holds enough information for an underlying library to perform the same optimization

    data Abelian a = Abelian {
        id  :: a,
        inv :: a -> a,
        op  :: a -> a -> a
    }
    

    This is the class of structures that can total Abelian groups

    class Total f where 
        total :: Abelian a -> [f a] -> f a
    

    Similar optimization is possible for the construction of maps.

    Thresholding and Equality

    Another typical operation in neural networks is to compare a value to a threshold and determine the response entirely based on whether or not the value passed the threshold. If an update to an input doesn't change which side of the threshold the value falls on, the response doesn't change. If the response doesn't change, there's no reason to recalculate all of the downstream nodes. The ability to either detect that there was no change to the threshold Bool or the response is equality. The following is the class of structures that can exploit equality. stabilize provides the Eq instance to the underlying structure.

    class Stabilizes f where
        stabilize :: Eq a => f a -> f a
    
    0 讨论(0)
提交回复
热议问题