Second derivative in Keras

伪装坚强ぢ 2020-12-09 23:15

For a custom loss for a NN I use the function . u, given a pair (t,x), both points in an interval, is the output of my NN. Problem is I'm stuck at how

2 Answers
  • 2020-12-09 23:41

    In order for K.gradients() to work like that, you have to wrap it in a Lambda() layer, because otherwise no full Keras layer is created and you can't chain it or train through it. So this code will work (tested):

    import keras
    from keras.models import *
    from keras.layers import *
    from keras import backend as K
    import tensorflow as tf
    
    def grad( y, x ):
        return Lambda( lambda z: K.gradients( z[ 0 ], z[ 1 ] ), output_shape = [1] )( [ y, x ] )
    
    def network( i, d ):
        m = Add()( [ i, d ] )
        a = Lambda(lambda x: K.log( x ) )( m )
        return a
    
    fixed_input = Input(tensor=tf.constant( [ 1.0 ] ) )
    double = Input(tensor=tf.constant( [ 2.0 ] ) )
    
    a = network( fixed_input, double )
    
    b = grad( a, fixed_input )
    c = grad( b, fixed_input )
    d = grad( c, fixed_input )
    e = grad( d, fixed_input )
    
    model = Model( inputs = [ fixed_input, double ], outputs = [ a, b, c, d, e ] )
    
    print( model.predict( x=None, steps = 1 ) )
    

    The network function models f( x ) = log( x + 2 ) at x = 1, and grad is where the gradient calculation is done. This code outputs:

    [array([1.0986123], dtype=float32), array([0.33333334], dtype=float32), array([-0.11111112], dtype=float32), array([0.07407408], dtype=float32), array([-0.07407409], dtype=float32)]

    which are the correct values for log( 3 ), 1 / 3, -1 / 3², 2 / 3³, -6 / 3⁴.
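
    As a quick sanity check (an addition, not part of the original answer), the analytic derivatives of f( x ) = log( x + 2 ) at x = 1 can be confirmed with SymPy:

    import sympy as sp
    
    x = sp.Symbol('x')
    f = sp.log(x + 2)
    
    # value and first to fourth derivatives of log(x + 2), evaluated at x = 1
    for n in range(5):
        print(n, sp.diff(f, x, n).subs(x, 1))
    # 0 log(3), 1 1/3, 2 -1/9, 3 2/27, 4 -2/27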


    Reference TensorFlow code

    For reference, the same code in plain TensorFlow (used for testing):

    import tensorflow as tf
    
    a = tf.constant( 1.0 )
    a2 = tf.constant( 2.0 )
    
    b = tf.log( a + a2 )
    c = tf.gradients( b, a )
    d = tf.gradients( c, a )
    e = tf.gradients( d, a )
    f = tf.gradients( e, a )
    
    with tf.Session() as sess:
        print( sess.run( [ b, c, d, e, f ] ) )
    

    outputs the same values:

    [1.0986123, [0.33333334], [-0.11111112], [0.07407408], [-0.07407409]]

    Hessians

    tf.hessians() does return the second derivative; it is essentially shorthand for chaining two tf.gradients() calls. The Keras backend doesn't have a hessians function though, so you do have to chain two K.gradients() calls yourself.
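
    A minimal graph-mode sketch (an addition, not from the original answer) comparing tf.hessians() with two chained tf.gradients() calls on the same f( x ) = log( x + 2 ):

    import tensorflow as tf
    
    a = tf.constant( [ 1.0 ] )   # shape (1,) so the Hessian is well defined
    b = tf.log( a + 2.0 )
    
    hess = tf.hessians( b, a )                        # [[-1/9]] as a 1x1 matrix
    chained = tf.gradients( tf.gradients( b, a ), a ) # [-1/9] as a vector
    
    with tf.Session() as sess:
        print( sess.run( [ hess, chained ] ) )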

    Numerical approximation

    If for some reason none of the above works, then you might want to consider numerically approximating the second derivative by taking differences over a small ε distance. This basically triples the network for each input, so this solution has serious efficiency drawbacks on top of its limited accuracy. Anyway, the code (tested):

    import keras
    from keras.models import *
    from keras.layers import *
    from keras import backend as K
    import tensorflow as tf
    
    def network( i, d ):
        m = Add()( [ i, d ] )
        a = Lambda(lambda x: K.log( x ) )( m )
        return a
    
    fixed_input = Input(tensor=tf.constant( [ 1.0 ], dtype = tf.float64 ) )
    double = Input(tensor=tf.constant( [ 2.0 ], dtype = tf.float64 ) )
    
    epsilon = Input( tensor = tf.constant( [ 1e-7 ], dtype = tf.float64 ) )
    eps_reciproc = Input( tensor = tf.constant( [ 1e+7 ], dtype = tf.float64 ) )
    
    a0 = network( Subtract()( [ fixed_input, epsilon ] ), double )
    a1 = network(               fixed_input,              double )
    a2 = network(      Add()( [ fixed_input, epsilon ] ), double )
    
    d0 = Subtract()( [ a1, a0 ] )
    d1 = Subtract()( [ a2, a1 ] )
    
    dv0 = Multiply()( [ d0, eps_reciproc ] )
    dv1 = Multiply()( [ d1, eps_reciproc ] )
    
    dd0 = Multiply()( [ Subtract()( [ dv1, dv0 ] ), eps_reciproc ] )
    
    model = Model( inputs = [ fixed_input, double, epsilon, eps_reciproc ], outputs = [ a0, dv0, dd0 ] )
    
    print( model.predict( x=None, steps = 1 ) )
    

    Outputs:

    [array([1.09861226]), array([0.33333334]), array([-0.1110223])]

    (This only gets to the second derivative.)
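
    For reference, the Lambda graph above implements the central second difference ( a2 - 2*a1 + a0 ) / ε²; a plain NumPy version of the same check (an addition, assuming f( x ) = log( x + 2 )):

    import numpy as np
    
    def f( x ):
        return np.log( x + 2.0 )
    
    x, eps = 1.0, 1e-7
    first  = ( f( x + eps ) - f( x ) ) / eps                          # ≈ 1/3
    second = ( f( x + eps ) - 2.0 * f( x ) + f( x - eps ) ) / eps**2  # ≈ -1/9, limited by float64 precision
    print( first, second )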

  • 2020-12-09 23:54

    The solution posted by Peter Szoldan is an excellent one. But it seems the way keras.layers.Input() takes in arguments has changed in the latest version with the tf2 backend. The following simple fix will work though:

    import tensorflow as tf
    from tensorflow import keras
    from tensorflow.keras.layers import Lambda
    from tensorflow.keras import backend as K
    import numpy as np
    
    class CustomModel(tf.keras.Model):
    
        def __init__(self):
            super(CustomModel, self).__init__()
            self.input_layer = Lambda(lambda x: K.log( x + 2 ) )
    
        def findGrad(self,func,argm):
            return keras.layers.Lambda(lambda x: K.gradients(x[0],x[1])) ([func,argm])
        
        def call(self, inputs):
            log_layer = self.input_layer(inputs)
            gradient_layer = self.findGrad(log_layer,inputs)
            hessian_layer = self.findGrad(gradient_layer, inputs)
            return hessian_layer
    
    
    custom_model = CustomModel()
    x = np.array([[0.],
                [1],
                [2]])
    custom_model.predict(x) 
    
    
    • Going through the layers: input layer -> lambda layer applying log(x+2) -> lambda layer applying the gradient -> one more lambda layer applying the gradient -> output.
    • Note that this solution is written for a general custom model; if you are using the functional API, the approach should be similar.
    • If you are using the tf backend, then using tf.hessians instead of applying K.gradients twice will work as well; see the sketch below.
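
    For comparison, a minimal TF2 eager sketch (an addition, not from the original answer) that computes the same second derivative with nested tf.GradientTape, the eager-mode counterpart to chaining K.gradients():

    import tensorflow as tf
    
    x = tf.Variable([[0.], [1.], [2.]])
    with tf.GradientTape() as outer_tape:
        with tf.GradientTape() as inner_tape:
            y = tf.math.log(x + 2.)
        dy_dx = inner_tape.gradient(y, x)    # 1 / (x + 2)
    d2y_dx2 = outer_tape.gradient(dy_dx, x)  # -1 / (x + 2)**2
    print(d2y_dx2.numpy())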