Taking the gradient in TensorFlow, tf.gradients

人盡茶涼 submitted on 2019-12-24 18:23:32

Question


I am using this TensorFlow function to get the Jacobian of my function. I came across a few problems:

  1. The TensorFlow documentation contradicts itself in the following two paragraphs, if I am not mistaken:

> gradients() adds ops to the graph to output the partial derivatives of ys with respect to xs. It returns a list of Tensor of length len(xs) where each tensor is the sum(dy/dx) for y in ys.

> Returns: A list of sum(dy/dx) for each x in xs.

According to my test, it in fact returns a vector of length len(ys), which is the sum(dy/dx) for each x in xs.

  2. I do not understand why they designed it in a way that the return is the sum of the columns (or rows, depending on how you define your Jacobian).

  3. How can I really get the Jacobian?

  4. In the loss, I need the partial derivative of my function with respect to the input (x). But when I optimize with respect to the network weights, I define x as a placeholder whose value is fed later, while the weights are variables. In this case, can I still define the symbolic derivative of the function with respect to the input (x) and put it in the loss (which, when we later optimize with respect to the weights, will bring in a second-order derivative of the function)?


Answer 1:


  1. I think you are right and there is a typo there, it was probably meant to be "of length len(ys)".

  2. For efficiency. I can't explain exactly the reasoning, but this seems to be a pretty fundamental characteristic of how TensorFlow handles automatic differentiation. See issue #675.

  3. There is no straightforward way to get the Jacobian matrix in TensorFlow. Take a look at this answer and again issue #675. Basically, you need one call to tf.gradients per column/row.

  4. Yes, of course. You can compute whatever gradients you want; there is no real difference between a placeholder and any other operation. There are a few operations that do not have a gradient because it is not well defined or not implemented (in which case it will generally return 0), but that's all.
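To illustrate point 3, here is a minimal sketch of the per-row approach from issue #675: one tf.gradients call per output component, stacked into the full Jacobian. It assumes TF 1.x graph mode (written against the compat.v1 shim so it also runs on TF 2.x installs); the toy function y is just an example.

```python
# Sketch: building a Jacobian with one tf.gradients call per output row.
# Assumes TF 1.x graph semantics (compat.v1 shim for TF 2.x installs).
import tensorflow.compat.v1 as tf
tf.disable_eager_execution()

x = tf.placeholder(tf.float32, shape=[3])
y = tf.stack([x[0] * x[1], x[1] * x[2]])   # toy function, R^3 -> R^2

# tf.gradients(y, x) alone would return the *sum* over ys of dy/dx
# (one tensor of shape [3]). To get the full 2x3 Jacobian, differentiate
# each output component separately and stack the rows:
jacobian_rows = [tf.gradients(y[i], x)[0] for i in range(int(y.shape[0]))]
jacobian = tf.stack(jacobian_rows)         # shape [2, 3]

with tf.Session() as sess:
    J = sess.run(jacobian, feed_dict={x: [1.0, 2.0, 3.0]})
    # J == [[2., 1., 0.],
    #       [0., 3., 2.]]
```

Note that this costs one backward pass per row, which is exactly the efficiency trade-off mentioned in point 2: a single tf.gradients call is one backward pass, but it can only give you the summed rows.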
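And for point 4, a sketch of a gradient-with-respect-to-a-placeholder term used inside a loss that is then differentiated with respect to a weight, which does produce the second-order derivative the question anticipates. All names (w, x, f) and the target value are illustrative; this assumes TF 1.x graph mode via the compat.v1 shim.

```python
# Sketch: df/dx (x is a placeholder) inside a loss, then d(loss)/dw.
# Assumes TF 1.x graph semantics (compat.v1 shim for TF 2.x installs).
import tensorflow.compat.v1 as tf
tf.disable_eager_execution()

x = tf.placeholder(tf.float32, shape=[])   # input, fed at run time
w = tf.Variable(2.0)                       # trainable weight
f = w * x * x                              # f(x; w) = w * x^2

df_dx = tf.gradients(f, x)[0]              # symbolic: 2*w*x
loss = tf.square(df_dx - 1.0)              # penalize df/dx != 1

# Differentiating the loss w.r.t. w involves d^2 f / (dw dx) = 2*x;
# an optimizer minimizing `loss` over w would use this same gradient.
dloss_dw = tf.gradients(loss, w)[0]

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    g = sess.run(dloss_dw, feed_dict={x: 3.0})
    # df_dx = 2*2*3 = 12, loss = (12-1)^2, dloss/dw = 2*11*(2*3) = 132
```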



Source: https://stackoverflow.com/questions/46425630/taking-the-gradient-in-tensorflow-tf-gradient
