Converting floating point to fixed point

耗尽温柔 posted on 2019-11-29 20:28:56
Kevin

Here you go:

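#include <cmath>   // std::floor
#include <limits>  // std::numeric_limits

// A signed fixed-point 16:16 class
class FixedPoint_16_16
{
    short          intPart;
    unsigned short fracPart;

public:
    FixedPoint_16_16(double d)
    {
        *this = d; // calls operator=
    }

    FixedPoint_16_16& operator=(double d)
    {
        // Floor (rather than truncate) so the fraction is always in [0, 1);
        // this also keeps negative values well defined.
        const double whole = std::floor(d);
        intPart = static_cast<short>(whole);
        fracPart = static_cast<unsigned short>(
            (std::numeric_limits<unsigned short>::max() + 1.0) * (d - whole));
        return *this;
    }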

    // Other operators can be defined here
};

EDIT: Here's a more general class based on another common way to deal with fixed-point numbers (which KPexEA pointed out):

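#include <cstddef> // std::size_t

template <class BaseType, std::size_t FracDigits>
class fixed_point
{
    // Shift a BaseType-wide 1 so wide base types are not limited by int
    static const BaseType factor = static_cast<BaseType>(1) << FracDigits;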

    BaseType data;

public:
    fixed_point(double d)
    {
        *this = d; // calls operator=
    }

    fixed_point& operator=(double d)
    {
        data = static_cast<BaseType>(d*factor);
        return *this;
    }

    BaseType raw_data() const
    {
        return data;
    }

    // Other operators can be defined here
};


fixed_point<int, 8> fp1(1.5);           // Will be signed 24:8 (if int is 32 bits)
fixed_point<unsigned int, 16> fp2(1.5); // Will be unsigned 16:16 (if int is 32 bits)
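
As for the "other operators" the class comments mention, here is a rough sketch of what the arithmetic looks like on the raw integers, assuming signed 16:16 stored in a 32-bit integer. The names `q16_16`, `q_add`, `q_mul` and `q_div` are made up for this example, and none of the functions check for overflow or division by zero.

```cpp
#include <cstdint>

// Raw signed 16:16 value: real value = raw / 65536.0
typedef std::int32_t q16_16;

// Both operands share the same scale, so addition is plain integer addition.
q16_16 q_add(q16_16 a, q16_16 b)
{
    return a + b;
}

// The product of two 16:16 values has 32 fractional bits, so compute it
// in 64 bits and divide by 2^16 to get back to 16 fractional bits.
q16_16 q_mul(q16_16 a, q16_16 b)
{
    const std::int64_t wide = static_cast<std::int64_t>(a) * b;
    return static_cast<q16_16>(wide / 65536);
}

// Pre-scale the numerator by 2^16 so the quotient keeps 16 fractional bits.
q16_16 q_div(q16_16 a, q16_16 b)
{
    const std::int64_t wide = static_cast<std::int64_t>(a) * 65536;
    return static_cast<q16_16>(wide / b);
}
```

For instance, 1.5 is stored as 98304 and 2.0 as 131072, and `q_mul` of the two gives 196608, which is 3.0.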

A cast from float to integer throws away the fractional portion, so if you want to keep that fraction as fixed point, multiply the float by the scale factor (2 to the power of the number of fractional bits) before casting it. Note that the code below does not check for overflow.

If you want 16:16

double f = 1.2345;
int n;

n=(int)(f*65536);

If you want 24:8

double f = 1.2345;
int n;

n=(int)(f*256);
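
If you do want the overflow check, something along these lines should work for the 16:16 case. This is only a sketch: `to_fixed_16_16` and `from_fixed_16_16` are names invented here, and it assumes the result is stored in a 32-bit signed integer.

```cpp
#include <cstdint>

// Hypothetical helper: converts f to signed 16:16 and stores it in out.
// Returns false (leaving out untouched) if the scaled value does not fit
// in 32 bits or if f is NaN.
bool to_fixed_16_16(double f, std::int32_t& out)
{
    const double scaled = f * 65536.0;
    // Written so that NaN fails the test as well.
    if (!(scaled >= -2147483648.0 && scaled < 2147483648.0))
        return false;
    out = static_cast<std::int32_t>(scaled); // truncates toward zero
    return true;
}

// Converts a raw 16:16 value back to double, e.g. for printing.
double from_fixed_16_16(std::int32_t n)
{
    return n / 65536.0;
}
```

The range test is done on the double before the cast because casting an out-of-range double to an integer is undefined behaviour in C++.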

Edit: My first comment applies to the version before Kevin's edit, but I'll leave it here for posterity. Answers change so quickly here sometimes!

The problem with Kevin's approach is that with fixed point you are normally packing into a guaranteed word size (typically 32 bits). Declaring the two parts separately leaves you at the whim of your compiler's structure packing. Yes, you could force it, but it does not work for anything other than a 16:16 representation.

KPexEA is closer to the mark by packing everything into int, although I would use "signed long" to try to be explicit about 32 bits. Then you can use his approach for generating the fixed-point value, and bit slicing to extract the component parts again. His suggestion also covers the 24:8 case.
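
To show what I mean by bit slicing, here is a minimal sketch for signed 16:16. The function names are made up, and I've used `std::int32_t` from `<cstdint>` rather than "signed long" since it is exactly 32 bits wherever it exists. It assumes two's complement and that `>>` on a negative value is an arithmetic shift, which is guaranteed from C++20 and true on every mainstream compiler I know of before that.

```cpp
#include <cstdint>

// Integer part: the upper 16 bits. For negative values this rounds
// toward negative infinity, so -1.5 splits into -2 and +0.5.
std::int16_t int_part(std::int32_t raw)
{
    return static_cast<std::int16_t>(raw >> 16);
}

// Fractional part: the lower 16 bits, in units of 1/65536.
std::uint16_t frac_part(std::int32_t raw)
{
    return static_cast<std::uint16_t>(raw & 0xFFFF);
}

// Reassembles the two halves into one 32-bit word.
std::int32_t pack(std::int16_t whole, std::uint16_t frac)
{
    return static_cast<std::int32_t>(whole) * 65536 + frac;
}
```

For 24:8 the same idea applies with a shift of 8 and a mask of 0xFF.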

( And everyone else who suggested just static_cast.....what were you thinking? ;) )

CVertex

I accepted the answer from the guy who wrote the best one, but what I actually used was the code from a related question that points here.

It used templates, and it was easy to remove its dependencies on the Boost library.

This is fine for converting from floating point to integer, but the O.P. also wanted fixed point.

Now how you'd do that in C++, I don't know (C++ not being something I can think in readily). Perhaps try a scaled-integer approach, i.e. use a 32- or 64-bit integer and programmatically allocate the last, say, 6 digits to what's on the right-hand side of the decimal point.
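
For what it's worth, a rough C++ sketch of that scaled-integer idea might look like the following. It assumes a 64-bit integer with six decimal digits after the point; `kScale`, `to_scaled`, `from_scaled` and `add_scaled` are names invented for the example, and there is no overflow checking.

```cpp
#include <cmath>
#include <cstdint>

// Six decimal digits to the right of the decimal point (an assumption
// for this example; pick whatever your data needs).
const std::int64_t kScale = 1000000;

// Rounds to the nearest representable value, e.g. 1.2345 -> 1234500.
std::int64_t to_scaled(double d)
{
    return std::llround(d * kScale);
}

// Converts back to double for display.
double from_scaled(std::int64_t s)
{
    return static_cast<double>(s) / kScale;
}

// Values with the same scale add as plain integers.
std::int64_t add_scaled(std::int64_t a, std::int64_t b)
{
    return a + b;
}
```

Note this is decimal scaling (a factor of 10^6) rather than the binary scaling (a factor of 2^16 or 2^8) used in the other answers, so it represents values like 0.1 exactly but cannot use shifts for rescaling.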

There isn't any built-in support in C++ for fixed-point numbers. Your best bet would be to write a wrapper 'FixedInt' class that takes doubles and converts them.

As for a generic method to convert... the int part is easy enough, just grab the integer part of the value and store it in the upper bits... decimal part would be something along the lines of:

for (int i = 1; i <= precision; i++)
{
   // The i-th fractional bit has weight 2^-i (1/2, 1/4, 1/8, ...)
   float bit_weight = 1.f/(float)(1 << i);
   if (decimal_part >= bit_weight)
   {
      decimal_part -= bit_weight;
      fixint_value |= (1 << (precision - i));
   }
}

although this is likely to contain bugs still
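
One way to sanity-check a loop like that is to compare it against simply multiplying the fraction by 2^precision and truncating, which should produce the same bits. A small sketch, with made-up function names, assuming the fraction is in [0, 1) and precision is between 1 and 31:

```cpp
#include <cstdint>

// Builds the fractional bits one at a time, most significant first.
// Assumes 0 <= frac < 1 and 1 <= precision <= 31.
std::uint32_t frac_bits_loop(double frac, int precision)
{
    std::uint32_t bits = 0;
    double weight = 0.5; // weight of bit i is 2^-i
    for (int i = 1; i <= precision; ++i, weight /= 2)
    {
        if (frac >= weight)
        {
            frac -= weight;
            bits |= 1u << (precision - i);
        }
    }
    return bits;
}

// The same result in one step: scale by 2^precision and truncate.
std::uint32_t frac_bits_scaled(double frac, int precision)
{
    return static_cast<std::uint32_t>(frac * static_cast<double>(1u << precision));
}
```

For example, 0.625 with 8 fractional bits gives 160 (binary 10100000) either way, so in practice the multiply-and-cast form is the one to use.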
