Why do we use log probability in deep learning?

Submitted by 陌路散爱 on 2021-01-29 06:51:36

Question


I got curious while reading the paper 'Sequence to Sequence Learning with Neural Networks'. In fact, not only this paper but also many other papers use log probabilities; is there a reason for that? Please check the attached photo.


Answer 1:


For any given problem we need to optimise the likelihood of the parameters. The likelihood is a product over the data points, so optimising it directly requires handling all the data at once and is computationally expensive.

A sum is much easier to optimise, because the derivative of a sum is the sum of the derivatives of its terms. Taking the log converts the product into a sum, which makes the computation simpler and faster.
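A minimal sketch of this point, assuming NumPy and made-up Bernoulli data (the parameter value 0.7 and the sample size of 1000 are arbitrary): the likelihood is a single, rapidly shrinking product, while the log-likelihood is a plain sum of per-example terms.

```python
import numpy as np

rng = np.random.default_rng(0)
x = (rng.random(1000) < 0.7).astype(int)  # 1000 hypothetical Bernoulli(0.7) samples
p = 0.7                                   # candidate parameter value

# Per-example probability under the model: p if x == 1, else 1 - p.
per_example = np.where(x == 1, p, 1 - p)

# Likelihood: a product over all data points -- one rapidly shrinking scalar.
likelihood = np.prod(per_example)

# Log-likelihood: a sum of independent per-example terms; the derivative of a
# sum is the sum of the derivatives, so each term can be handled separately.
log_likelihood = np.sum(np.log(per_example))

print(likelihood)      # already around 1e-265 for just 1000 points
print(log_likelihood)  # a well-behaved number near -611
```

With a few times more data the product would underflow to exactly 0.0, while the log-sum just grows linearly and stays representable.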





Answer 2:


Two reasons:

  1. Theoretical - The probability of two independent events A and B occurring together is given by P(A).P(B). Taking logs maps this product to a sum, i.e. log(P(A)) + log(P(B)). It is thus easier to treat the neuron firing 'events' as a linear (additive) function.

  2. Practical - Probability values lie in [0, 1], so multiplying two or more such small numbers can easily cause underflow in floating-point arithmetic (e.g. consider multiplying 0.0001 * 0.00001). Working with logs instead avoids the underflow, as the sketch after this list demonstrates.
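A minimal sketch of the underflow, assuming NumPy; the 200 events and the value 0.01 are made-up numbers chosen so that the product falls below the smallest positive float64:

```python
import numpy as np

probs = np.full(200, 0.01)   # 200 hypothetical events, each with probability 0.01

# 0.01 ** 200 == 1e-400, which is below the smallest positive float64
# (~1e-308), so the running product collapses to exactly 0.0.
product = np.prod(probs)

# Summing logs stays comfortably in range: 200 * log(0.01) == -921.03...
log_sum = np.sum(np.log(probs))

print(product)  # 0.0
print(log_sum)  # -921.034...
```

This is also why deep-learning frameworks typically provide log-domain primitives (e.g. a log-softmax) rather than having users take the log of an already-computed probability.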



Source: https://stackoverflow.com/questions/63334122/why-do-we-use-log-probability-in-deep-learning
