Multiplying two polynomials mod n,x^r-1 using long integers: what is the correct window size?

Question

Using a window multiplication algorithm to multiply two polynomials (coefficients in Z/nZ, and the whole polynomial mod x^r-1 for some r) by means of long-integer multiplications, what size should I give to the window?

By "window" I mean the bit length allotted to each coefficient in the initial long integers, chosen so that the result of the long-integer multiplication contains the correct coefficients of the product (the sums of the coefficients "don't overlap").
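To make the idea concrete, here is a minimal sketch of the packing trick, written for Python 3. The helper names `pack` and `unpack` are hypothetical and are not part of the code in the question; the 8-bit window is simply chosen by hand for this tiny example.

```python
def pack(coefs, window):
    """Pack a coefficient list (lowest degree first) into one integer."""
    value = 0
    for i, c in enumerate(coefs):
        value += c << (window * i)
    return value


def unpack(value, window):
    """Split an integer back into window-sized coefficients."""
    mask = (1 << window) - 1
    coefs = []
    while value:
        coefs.append(value & mask)
        value >>= window
    return coefs


# (1 + 2x) * (3 + 4x) = 3 + 10x + 8x^2
product = unpack(pack([1, 2], 8) * pack([3, 4], 8), 8)
print(product)  # [3, 10, 8]
```

With a window of only 3 bits, the middle coefficient 10 no longer fits in its slot and carries into the next one, which is the kind of overlap the window size has to rule out.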

At the beginning I thought that ceil(log(n**2,2)) + 1 would be enough, because each coefficient is at most n-1 so a product of these coefficients is at most (n-1)**2. Then I realized that when doing a long-integer multiplication there can be some sums of these coefficients, thus the window should be ceil(log(number-of-additions * n**2,2)) + 1.

I thought that the number of additions could be at most the sum of the degrees of the two polynomials. Using ceil(log((deg_a + deg_b + 1) * n**2,2)) + 1 works for a while, but eventually the coefficients overlap and I get incorrect results.

So how big should this "window" be?

By the way, here's the current (Python) code:

def __mul__(self, other):
    new = ModPolynomial(self._mod_r, self._mod_n)
    #new = x mod self._mod_n,x^(self._mod_r -1)
    try:
        new_deg = self._degree + other.degree
        new_coefs = []
        # i've also tried with (new_deg + 1) ** 2 * ...
        window = int(math.ceil(math.log((new_deg + 1) * self._mod_n**2,2))) + 1
        A = 0
        for i,k in enumerate(self._coefs):
            A += k << (window * i)
        B = 0
        for i,k in enumerate(other.coefficients):
            B += k << (window * i)

        res = A * B
        mask = 2**window - 1
        while res:
            new_coefs.append((res & mask) % self._mod_n)
            res >>= window
        new._coefs = new_coefs
        new._degree = self._finddegree(new_coefs)
    except AttributeError:
        new._coefs = [(c * other) % self._mod_n for c in self._coefs]
        new._degree = self._finddegree(new._coefs)
    new._mod()
    return new

edit 1: I'm starting to think that the window size may not be the problem. I've tried increasing it up to ceil(log2((new_deg+1)**3 * self._mod_n ** 5))+1, and this yields the same results as ceil(log2((new_deg+1) * self._mod_n**2))+1. Since the difference between these two sizes is really big (about 55-60 bits in my tests), the smaller of the two is probably already sufficient, and the problem lies somewhere else.

edit 2: An example of wrong result is:

#ModPolynomial(r,n) = x mod n, x^r-1
>>> pol = polys.ModPolynomial(20,100)      # uses integer-multiplication
>>> pol += 2
>>> pol2 = polynomials.ModPolynomial(20,100)   #uses the naive algorithm
>>> pol2 += 2
>>> (pol2**52).coefficients      #should be the correct result[first is coef of x^0]
(76, 76, 44, 0, 0, 16, 16, 4, 0, 0, 24, 24, 81, 0, 0, 80, 80, 20)
>>> (pol**52).coefficients       #the wrong result
(12L, 52L, 8L, 20L, 60L, 12L, 92L, 28L, 60L, 80L, 68L, 48L, 22L, 0L, 0L, 20L, 20L, 80L)

I'll try to find some smaller example, so that I can verify it by hand.

edit 3: Okay, I found out that there is some problem with the degree. I found an example in which the degree becomes negative, which obviously shouldn't happen. So I'll dig further to check when and why the degree changes in this unexpected way.

edit 4: I've found the bug. When creating the integer I was iterating over the whole _coefs sequence, but my implementation does not guarantee that the coefficients at positions above the polynomial's degree are 0. Accounting for this fixes the issue.
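One way to deal with this can be sketched as follows, assuming the coefficient list may carry stale entries above the stored degree. `pack_up_to_degree` is a hypothetical helper for illustration, not a method of the class in the question.

```python
def pack_up_to_degree(coefs, degree, window):
    """Pack only coefs[0..degree]; entries past the degree are ignored."""
    value = 0
    for i, c in enumerate(coefs[:degree + 1]):
        value += c << (window * i)
    return value


# The list holds a stale 7 above the stored degree of 1.
coefs = [1, 2, 7]
print(pack_up_to_degree(coefs, 1, 8))  # 513, i.e. 1 + 2 * 2**8
```

Slicing by the degree keeps the packed integer consistent with the degree used to size the window.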

edit 5: Just some performance results I've obtained testing this implementation.

1) Using long-integer multiplication is faster than using numpy.convolve iff the coefficients are bigger than ints. Otherwise numpy.convolve is faster.

2) About 80% of the time is spent converting lists of coefficients to integers and integers to lists of coefficients. There is not much you can do about this, since these operations are O(n).

Now I'm wondering whether there is an efficient way to implement the "mod x^r-1" operation using only long integers. This could probably give a big speed-up, since at that point you wouldn't have to do the conversions anymore.
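One possible direction, offered only as a sketch: if both inputs have at most r coefficients and are packed with a window of w bits, then x^r = 1 corresponds to 2**(w*r) = 1 on the packed integers, so reducing the integer product modulo 2**(w*r) - 1 folds the high coefficients back onto the low ones. The function name `cyclic_mul` and its signature are hypothetical, and the window here is sized for r summed products per slot, not for the plain product.

```python
def cyclic_mul(a, b, r, n):
    """Multiply a and b modulo n and x^r - 1.

    a and b are coefficient lists (lowest degree first) with entries
    in [0, n) and length at most r.
    """
    # Each folded coefficient sums at most r products, each <= (n - 1)**2.
    bound = r * (n - 1) ** 2
    # One extra bit keeps every slot strictly below 2**window - 1, so the
    # packed result can never equal the modulus 2**(window * r) - 1.
    window = bound.bit_length() + 1
    A = sum(c << (window * i) for i, c in enumerate(a))
    B = sum(c << (window * i) for i, c in enumerate(b))
    # Reducing mod 2**(window * r) - 1 performs the x^r - 1 folding.
    res = (A * B) % ((1 << (window * r)) - 1)
    mask = (1 << window) - 1
    return [((res >> (window * i)) & mask) % n for i in range(r)]


# (1 + 2x) * (2 + x) = 2 + 5x + 2x^2, which folds to 4 + 5x, i.e. [1, 2] mod 3
print(cyclic_mul([1, 2], [2, 1], 2, 3))  # [1, 2]
```

The coefficients still have to be reduced mod n slot by slot, so this removes only the x^r - 1 folding step and not the unpacking itself.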

Answer 1:

The correct calculation is

window = int(math.ceil(math.log((max(self._degree, other.degree) + 1) *
                                (self._mod_n - 1)**2, 2)))

However this will definitely be less than the window you calculated, so there must be some other reason you're getting incorrect results. Are you sure the degree is being calculated correctly? Can you give an example of a polynomial that is calculated incorrectly?
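As a cross-check on the window size, the bound can also be computed with exact integer arithmetic instead of a floating-point logarithm. This is a sketch and not the code from the answer: `window_bits` is a hypothetical helper. It counts at most min(deg_a, deg_b) + 1 products per coefficient, each at most (n - 1)**2, and uses `int.bit_length()`, which also covers the case where the bound is an exact power of two (there, ceil(log2(bound)) comes out one bit short).

```python
def window_bits(deg_a, deg_b, n):
    """Smallest window (in bits) that holds any coefficient of a * b."""
    # Coefficient k of the product sums a_i * b_j over i + j = k, which is
    # at most min(deg_a, deg_b) + 1 terms, each no larger than (n - 1)**2.
    bound = (min(deg_a, deg_b) + 1) * (n - 1) ** 2
    # bit_length() is the number of bits needed to represent bound exactly;
    # the max() avoids a zero-width window when bound is 0.
    return max(bound.bit_length(), 1)


print(window_bits(0, 0, 3))      # 3, since 2 * 2 = 4 needs three bits
print(window_bits(19, 19, 100))  # 18, since 20 * 99**2 = 196020 < 2**18
```

Working on integers directly also avoids any rounding in `math.log` when n is very large.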

Unless there's a particularly good reason to use long integer multiplication, I'd recommend using NumPy:

new.coeffs = np.convolve(self.coeffs, other.coeffs) % self.mod

This will usually be at least as efficient as long integer multiplication (which is a form of convolution anyway), and you've got a lot less Python code to worry about. In fact NumPy has a polynomial library; although it's designed for floating-point coefficients, you can look at it to get an idea how to implement your code efficiently.

Source: https://stackoverflow.com/questions/12060766/multiplying-two-polynomials-mod-n-xr-1-using-long-integers-what-is-the-correct

Tags

python

long-integer

multiplication

polynomial-math