How many FLOPs does tanh need?

星月不相逢 2021-02-14 18:15

I would like to compute how many FLOPs each layer of LeNet-5 (paper) needs. Some papers give FLOPs for other architectures in total (1, 2, 3). However, those papers don't give d

2 answers
  •  闹比i (original poster)
    2021-02-14 18:48

    Note: This answer is not Python-specific, but I don't think that something like tanh is fundamentally different across languages.

    Tanh is usually implemented by defining an upper and a lower bound, beyond which 1 and -1 are returned, respectively. The intermediate part is approximated with different functions as follows:

     Interval                 tanh(x)
     0 to x_small             x
     x_small to x_medium      polynomial approx.
     x_medium to x_large      1 - 2/(1 + exp(2x))
     above x_large            1
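
A minimal Python sketch of the interval table, assuming double precision. The three thresholds are illustrative guesses rather than the values from the Cody-Waite report, and the middle branch uses a truncated continued fraction for tanh as a stand-in for their rational approximation, whose coefficients are not given here. Negative inputs rely on the symmetry tanh(-x) = -tanh(x). The name `tanh_piecewise` is hypothetical.

```python
import math

# Illustrative thresholds for double precision (assumptions, not the report's values).
X_SMALL = 2.0 ** -27           # below this, tanh(x) rounds to x
X_MEDIUM = math.log(3.0) / 2   # assumed switch point to the exp-based formula
X_LARGE = 20.0                 # above this, tanh(x) rounds to 1.0


def tanh_piecewise(x: float) -> float:
    """Piecewise tanh following the interval table; a sketch, not a libm routine."""
    ax = abs(x)                          # work on |x|, restore the sign at the end
    if ax < X_SMALL:
        y = ax                           # tanh(x) ~ x
    elif ax < X_MEDIUM:
        # Stand-in rational approximation (NOT the Cody-Waite coefficients):
        # tanh(x) = x / (1 + x^2 / (3 + x^2 / (5 + x^2 / (7 + ...))))
        x2 = ax * ax
        y = ax / (1.0 + x2 / (3.0 + x2 / (5.0 + x2 / (7.0 + x2 / 9.0))))
    elif ax < X_LARGE:
        y = 1.0 - 2.0 / (1.0 + math.exp(2.0 * ax))
    else:
        y = 1.0                          # saturated
    return -y if x < 0.0 else y          # flip the sign for negative input


if __name__ == "__main__":
    for value in (-3.0, -0.4, 1e-9, 0.3, 2.0, 25.0):
        print(value, tanh_piecewise(value), math.tanh(value))
```

Running the file prints the sketch next to `math.tanh` for a few inputs, which is a quick way to see how far the stand-in approximation drifts in each interval.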
    

    There exist polynomials that are accurate to single-precision floating point, and others that are accurate to double precision. This algorithm is called the Cody-Waite algorithm.

    Citing this description (you can find more information about the mathematics there as well, e.g. how to determine x_medium), Cody and Waite’s rational form requires four multiplications, three additions, and one division in single precision, and seven multiplications, six additions, and one division in double precision.

    For negative x, you can compute |x| and flip the sign of the result. So you need comparisons to find which interval x is in, and then you evaluate the corresponding approximation. That's a total of:

    1. Taking the absolute value of x
    2. 3 comparisons for the interval
    3. Depending on the interval and the float precision, 0 to a few FLOPs for the approximation, plus the cost of the exponential where it is needed; check this question on how to compute the exponential.
    4. One comparison to decide whether to flip the sign.
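
The four steps can be tallied per interval with a small helper. This is only a sketch under stated assumptions: every addition, multiplication, and division counts as one FLOP, the rational-form counts are the ones quoted above, the interval names are made up for illustration, and `exp_flops` is a placeholder because the answer does not give a number for the exponential.

```python
# Operation counts quoted in the answer for Cody and Waite's rational form.
RATIONAL_FORM_FLOPS = {
    "single": 4 + 3 + 1,  # 4 mul + 3 add + 1 div
    "double": 7 + 6 + 1,  # 7 mul + 6 add + 1 div
}


def tanh_cost(interval: str, precision: str = "single", exp_flops: int = 0) -> dict:
    """Tally the work for one tanh call in a given interval (hypothetical helper)."""
    if interval in ("small", "large"):
        flops = 0                                # result is x or 1: no arithmetic
    elif interval == "polynomial":
        flops = RATIONAL_FORM_FLOPS[precision]
    elif interval == "medium":
        # 1 - 2/(1 + exp(2x)): one mul, one add, one div, one sub, plus exp itself
        flops = 4 + exp_flops
    else:
        raise ValueError(f"unknown interval: {interval}")
    return {
        "abs": 1,               # step 1
        "comparisons": 3 + 1,   # steps 2 and 4
        "flops": flops,         # step 3
    }


if __name__ == "__main__":
    for name in ("small", "polynomial", "medium", "large"):
        print(name, tanh_cost(name, precision="double"))
```

Whether comparisons and the absolute value should count as FLOPs is a convention choice, so the helper reports them separately instead of folding them into one total.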

    Now, this is a report from 1993, but I don't think much has changed here.
