Inverse Error Function in C

两盒软妹~` 提交于 2019-12-04 02:57:13

Quick & dirty, tolerance under +-6e-3. Work based on "A handy approximation for the error function and its inverse" by Sergei Winitzki.

C/C++ CODE:

float myErfInv2(float x){
   float tt1, tt2, lnx, sgn;
   sgn = (x < 0) ? -1.0f : 1.0f;

   x = (1 - x)*(1 + x);        // x = 1 - x*x;
   lnx = logf(x);

   tt1 = 2/(PI*0.147) + 0.5f * lnx;
   tt2 = 1/(0.147) * lnx;

   return(sgn*sqrtf(-tt1 + sqrtf(tt1*tt1 - tt2)));
}

MATLAB sanity check:

clear all,  close all, clc 
x = linspace(-1, 1,10000); 

a = 0.147;
u = log(1-x.^2);  
u1 = 2/(pi*a) + u/2;    u2 = u/a;
y = sign(x).*sqrt(-u1+sqrt(u1.^2 - u2)); 

f = erfinv(x); axis equal

figure(1);
plot(x, [y;f]);

figure(2);
e = f-y;
plot(x, e);

MATLAB Plots:

At this time, the ISO C standard math library does not include erfinv(), or its single-precision variant erfinvf(). However, it is not too difficult to create one's own version, which I demonstrate below with an implementation of erfinvf() of reasonable accuracy and performance.

Looking at the graph of the inverse error function we observe that it is highly non-linear and is therefore difficult to approximate with a polynomial. One strategy do deal with this scenario is to "linearize" such a function by compositing it from simpler elementary functions (which can themselves be computed at high performance and excellent accuracy) and a fairly linear function which is more easily amenable to polynomial approximations or rational approximations of low degree.

Here are some approaches to erfinv linearization known from the literature, all of which are based on logarithms. Typically, authors differentiate between a main, fairly linear portion of the inverse error function from zero to a switchover point very roughly around 0.9 and a tail portion from the switchover point to unity. In the following, log() denotes the natural logarithm, R() denotes a rational approximation, and P() denotes a polynomial approximation.

A. J. Strecok, "On the Calculation of the Inverse of the Error Function." Mathematics of Computation, Vol. 22, No. 101 (Jan. 1968), pp. 144-158 (online)

β(x) = (-log(1-x2]))½; erfinv(x) = x · R(x2) [main]; R(x) · β(x) [tail]

J. M. Blair, C. A. Edwards, J. H. Johnson, "Rational Chebyshev Approximations for the Inverse of the Error Function." Mathematics of Computation, Vol. 30, No. 136 (Oct. 1976), pp. 827-830 (online)

ξ = (-log(1-x)); erfinv(x) = x · R(x2) [main]; ξ-1 · R(ξ) [tail]

M. Giles, "Approximating the erfinv function." In GPU Computing Gems Jade Edition, pp. 109-116. 2011. (online)

w = -log(1-x2); s = √w; erfinv(x) = x · P(w) [main]; x · P(s) [tail]

The solution below generally follows the approach by Giles, but simplifies it in not requiring the square root for the tail portion, i.e. it uses two approximations of the type x · P(w). The code takes maximum advantage of the fused multiply-add operation FMA, which is exposed via the standard math functions fma() and fmaf() in C. Many common compute platforms, such as IBM Power, Arm64, x86-64, and GPUs offer this operation in hardware. Where no hardware support exists, the use of fma{f}() will likely make the code below unacceptably slow as the operation needs to be emulated by the standard math library. Also, functionally incorrect emulations of FMA are known to exist.

The accuracy of the standard math library's logarithm function logf() will have some impact on the accuracy of my_erfinvf() below. As long as the library provides a faithfully-rounded implementation with error < 1 ulp, the stated error bound should hold and it did for the few libraries I tried. For improved reproducability, I have included my own portable faithfully-rounded implementation, my_logf().

#include <math.h>
float my_logf (float);

/* compute inverse error functions with maximum error of 2.35793 ulp */
float my_erfinvf (float a)
{
    float p, r, t;
    t = fmaf (a, 0.0f - a, 1.0f);
    t = my_logf (t);
    if (fabsf(t) > 6.125f) { // maximum ulp error = 2.35793
        p =              3.03697567e-10f; //  0x1.4deb44p-32 
        p = fmaf (p, t,  2.93243101e-8f); //  0x1.f7c9aep-26 
        p = fmaf (p, t,  1.22150334e-6f); //  0x1.47e512p-20 
        p = fmaf (p, t,  2.84108955e-5f); //  0x1.dca7dep-16 
        p = fmaf (p, t,  3.93552968e-4f); //  0x1.9cab92p-12 
        p = fmaf (p, t,  3.02698812e-3f); //  0x1.8cc0dep-9 
        p = fmaf (p, t,  4.83185798e-3f); //  0x1.3ca920p-8 
        p = fmaf (p, t, -2.64646143e-1f); // -0x1.0eff66p-2 
        p = fmaf (p, t,  8.40016484e-1f); //  0x1.ae16a4p-1 
    } else { // maximum ulp error = 2.35456
        p =              5.43877832e-9f;  //  0x1.75c000p-28 
        p = fmaf (p, t,  1.43286059e-7f); //  0x1.33b458p-23 
        p = fmaf (p, t,  1.22775396e-6f); //  0x1.49929cp-20 
        p = fmaf (p, t,  1.12962631e-7f); //  0x1.e52bbap-24 
        p = fmaf (p, t, -5.61531961e-5f); // -0x1.d70c12p-15 
        p = fmaf (p, t, -1.47697705e-4f); // -0x1.35be9ap-13 
        p = fmaf (p, t,  2.31468701e-3f); //  0x1.2f6402p-9 
        p = fmaf (p, t,  1.15392562e-2f); //  0x1.7a1e4cp-7 
        p = fmaf (p, t, -2.32015476e-1f); // -0x1.db2aeep-3 
        p = fmaf (p, t,  8.86226892e-1f); //  0x1.c5bf88p-1 
    }
    r = a * p;
    return r;
}

/* compute natural logarithm with a maximum error of 0.85089 ulp */
float my_logf (float a)
{
    float i, m, r, s, t;
    int e;

    m = frexpf (a, &e);
    if (m < 0.666666667f) { // 0x1.555556p-1
        m = m + m;
        e = e - 1;
    }
    i = (float)e;
    /* m in [2/3, 4/3] */
    m = m - 1.0f;
    s = m * m;
    /* Compute log1p(m) for m in [-1/3, 1/3] */
    r =             -0.130310059f;  // -0x1.0ae000p-3
    t =              0.140869141f;  //  0x1.208000p-3
    r = fmaf (r, s, -0.121484190f); // -0x1.f19968p-4
    t = fmaf (t, s,  0.139814854f); //  0x1.1e5740p-3
    r = fmaf (r, s, -0.166846052f); // -0x1.55b362p-3
    t = fmaf (t, s,  0.200120345f); //  0x1.99d8b2p-3
    r = fmaf (r, s, -0.249996200f); // -0x1.fffe02p-3
    r = fmaf (t, m, r);
    r = fmaf (r, m,  0.333331972f); //  0x1.5554fap-2
    r = fmaf (r, m, -0.500000000f); // -0x1.000000p-1
    r = fmaf (r, s, m);
    r = fmaf (i,  0.693147182f, r); //  0x1.62e430p-1 // log(2)
    if (!((a > 0.0f) && (a <= 3.40282346e+38f))) { // 0x1.fffffep+127
        r = a + a;  // silence NaNs if necessary
        if (a  < 0.0f) r = ( 0.0f / 0.0f); //  NaN
        if (a == 0.0f) r = (-1.0f / 0.0f); // -Inf
    }
    return r;
}

I don't think it's a standard implementation in <math.h>, but there are other C math libraries that have implement the inverse error function erfinv(x), that you can use.

标签
易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!