Mixture of Experts on TensorFlow

后悔当初 2021-02-10 17:21

I want to implement a generic module on TensorFlow which receives a list of TensorFlow models (here denoted as experts) and builds from them a Mixture of Experts, as it is de…

1 Answer
  • 2021-02-10 17:36

    Yes, you can do this in an all-in-one architecture by using a gating placeholder.

    Let's start with a simple TensorFlow concept snippet like this, then add to it:

    m = tf.Variable( tf.random_normal([width, height]), dtype=tf.float32 )
    b = tf.Variable( tf.zeros([height]), dtype=tf.float32 )
    h = tf.sigmoid( tf.matmul( x, m ) + b )
    

    Imagine this is your single "expert" model architecture. I know it is fairly basic, but it will do for our purposes of illustration.

    What we are going to do is store all of the expert systems in the matrices m and b, and define a gating matrix.

    Let's call the gating matrix g. It is going to block specific neural connections. The neural connections are defined in m. This would be your new configuration:

    g = tf.placeholder( tf.float32, shape=[width, height] )
    m = tf.Variable( tf.random_normal([width, height]), dtype=tf.float32 )
    b = tf.Variable( tf.zeros([height]), dtype=tf.float32 )
    h = tf.sigmoid( tf.matmul( x, tf.multiply(m, g) ) + b )
    
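    As a sanity check that element-wise masking really does block connections, here is a small NumPy sketch of the same computation (the dimensions and the mask pattern are hypothetical, chosen just for illustration). A zeroed column in g means the corresponding output unit receives no input from x, only its bias:

    ```python
    import numpy as np

    rng = np.random.default_rng(0)
    width, height = 3, 2
    x = rng.normal(size=(1, width)).astype(np.float32)
    m = rng.normal(size=(width, height)).astype(np.float32)
    b = np.zeros(height, dtype=np.float32)

    # Hypothetical gate: keep only the connections feeding output column 0.
    g = np.array([[1, 0], [1, 0], [1, 0]], dtype=np.float32)

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    # Same computation as the TensorFlow graph: mask m element-wise, then matmul.
    h = sigmoid(x @ (m * g) + b)

    # The blocked unit sees only its (zero) bias, so it sits at sigmoid(0) = 0.5.
    assert np.isclose(h[0, 1], 0.5)
    ```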

    g is a matrix of 1's and 0's. Insert a 1 for every neural connection you want to keep and a 0 for every one you want to block. If you have 4 expert systems, then one quarter of the connections will be 1's and three quarters will be 0's.

    If you want them all to vote equally, then you'll want to set all values of g to 1/4th.
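    The gate construction above can be sketched in NumPy. The partitioning scheme here (each expert owns a contiguous block of output columns) and the dimensions are assumptions for illustration; any disjoint partition of the connections works the same way:

    ```python
    import numpy as np

    width, height = 8, 4      # hypothetical layer dimensions
    num_experts = 4

    def expert_gate(expert_idx, width, height, num_experts):
        """Build a 0/1 mask that keeps only this expert's connections."""
        g = np.zeros((width, height), dtype=np.float32)
        cols = np.array_split(np.arange(height), num_experts)[expert_idx]
        g[:, cols] = 1.0      # unblock this expert's block of columns
        return g

    gates = [expert_gate(i, width, height, num_experts)
             for i in range(num_experts)]

    # Each connection belongs to exactly one expert, so the masks sum to all-ones.
    assert np.allclose(sum(gates), np.ones((width, height)))

    # For an equal vote across all experts, feed a uniform 1/num_experts gate.
    g_equal = np.full((width, height), 1.0 / num_experts, dtype=np.float32)
    ```

    At run time you would feed one of these arrays into the g placeholder via feed_dict, selecting a single expert or the equal-vote mixture.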
