Derivative of neural network with respect to input

Asked by 忘掉有多难 on 2021-01-20 18:04 · 3 answers · 464 views

I trained a neural network to do a regression on the sine function and would like to compute the first and second derivatives with respect to the input. I tried using the tf.

3 Answers
  • 2021-01-20 18:12

    I don't think you can calculate second order derivatives using tf.gradients. Take a look at tf.hessians (what you really want is the diagonal of the Hessian matrix), e.g. [1].

    An alternative is to use tf.GradientTape: [2].

    [1] https://github.com/gknilsen/pyhessian

    [2] https://www.tensorflow.org/api_docs/python/tf/GradientTape
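    A minimal sketch of the nested-tape approach from [2], shown on tf.sin directly; for a trained network, the call to tf.sin would be replaced by model(x):

```python
import math
import tensorflow as tf

x = tf.Variable(2.0)

# The outer tape records the computation of the first derivative,
# so it can be differentiated a second time.
with tf.GradientTape() as outer:
    with tf.GradientTape() as inner:
        y = tf.sin(x)                   # replace with model(x) for a trained net
    dy_dx = inner.gradient(y, x)        # first derivative: cos(x)
d2y_dx2 = outer.gradient(dy_dx, x)      # second derivative: -sin(x)
```

    Note that inner.gradient must be called inside the outer tape's context, otherwise the outer tape never sees the first-derivative computation.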

  • 2021-01-20 18:26

    One possible explanation for what you observed could be that your learned function is not twice differentiable. It looks as if there are jumps in the 1st derivative around the extrema. If so, the 2nd derivative doesn't really exist at those points, and the plot you get depends heavily on how the library handles such places.

    Consider, for example, a non-smooth sawtooth function that jumps from 0.5 to -0.5 at every x in {1, 2, ...}. Its slope is 1 everywhere except where x is an integer. If you tried to plot its derivative, you would probably see a straight line at y=1, which is easy to misinterpret: someone looking only at that plot could think the function is completely linear, running from -infinity to +infinity.

    If your results are produced by a neural net that uses ReLU, you can try the same with the sigmoid activation function. I suppose you won't see that many spikes with a smooth activation.
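    To make the jump effect concrete, here is a small numerical sketch in plain Python (the names are illustrative): a sawtooth with slope 1 everywhere looks perfectly linear to a finite-difference derivative, except at the integers, where the jump produces a huge spurious spike:

```python
import math

def f(x):
    # Sawtooth: slope 1 everywhere, but jumps from 0.5 down to -0.5
    # at every integer x.
    return x - math.floor(x) - 0.5

h = 1e-4

def fd(x):
    # Central finite-difference estimate of the derivative.
    return (f(x + h) - f(x - h)) / (2 * h)

print(fd(0.5))   # close to 1, the true slope
print(fd(1.0))   # enormous negative spike caused by the jump
```

    A plot of fd over a fine grid would show exactly the kind of spikes described above, even though the function has slope 1 almost everywhere.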

  • 2021-01-20 18:27

    What you learned was the sine function, not its derivative: during the training process, you are controlling the error with a cost function that takes only the values into account; it does not control the slope at all. You could therefore have learned a very noisy function that nevertheless matches the data points exactly.

    If you are only using the data points in your cost function, you have no guarantee about the derivative you've learned. However, with some more advanced training techniques you can also learn such a derivative: https://arxiv.org/abs/1706.04859

    So, in summary, it is not a code issue but a theoretical one.
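    A hedged sketch of that idea (Sobolev-style training; the architecture and names here are illustrative, not from the paper): penalize the mismatch of the derivative as well as of the values, by differentiating the model output with respect to its input when computing the loss:

```python
import tensorflow as tf

# Toy regression net (illustrative architecture).
model = tf.keras.Sequential([
    tf.keras.layers.Dense(32, activation="tanh"),
    tf.keras.layers.Dense(1),
])

x = tf.random.uniform((64, 1), minval=-3.0, maxval=3.0)
y_true = tf.sin(x)
dy_true = tf.cos(x)  # known target derivative of sine

with tf.GradientTape() as tape:
    tape.watch(x)        # x is a plain tensor, so it must be watched explicitly
    y_pred = model(x)
dy_pred = tape.gradient(y_pred, x)  # d(model)/dx, one value per sample

# Value loss plus derivative loss. During actual training, this whole
# computation would sit inside another GradientTape taken over
# model.trainable_variables.
loss = (tf.reduce_mean(tf.square(y_pred - y_true))
        + tf.reduce_mean(tf.square(dy_pred - dy_true)))
```

    Minimizing this combined loss pushes the network to match not only the sine values but also their slopes, which is exactly the guarantee the plain value-only cost function lacks.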
