Question
I was playing around with generating Hamming numbers in Haskell, trying to improve on the obvious (pardon the naming of the functions):

    mergeUniq :: Ord a => [a] -> [a] -> [a]
    mergeUniq (x:xs) (y:ys) = case x `compare` y of
        EQ -> x : mergeUniq xs ys
        LT -> x : mergeUniq xs (y:ys)
        GT -> y : mergeUniq (x:xs) ys

    powers :: [Integer]
    powers = 1 : expand 2 `mergeUniq` expand 3 `mergeUniq` expand 5
      where
        expand factor = (factor *) <$> powers

I noticed that I can avoid the (slower) arbitrary-precision Integer if I represent the numbers as the triple of the 2-, 3- and 5-exponents, like data Power = Power { k2 :: !Int, k3 :: !Int, k5 :: !Int }, where the number is understood to be 2^k2 * 3^k3 * 5^k5. The comparison of two Powers then becomes:

    instance Ord Power where
      p1 `compare` p2 = toComp (p1 `divP` gcdP) `compare` toComp (p2 `divP` gcdP)
        where
          divP p1 p2 = Power { k2 = k2 p1 - k2 p2, k3 = k3 p1 - k3 p2, k5 = k5 p1 - k5 p2 }
          gcdP = Power { k2 = min (k2 p1) (k2 p2), k3 = min (k3 p1) (k3 p2), k5 = min (k5 p1) (k5 p2) }
          toComp Power { .. } = fromIntegral k2 * log 2 + fromIntegral k3 * log 3 + fromIntegral k5 * log 5

So, very roughly speaking, to compare p₁ = 2^i₁ * 3^j₁ * 5^k₁ and p₂ = 2^i₂ * 3^j₂ * 5^k₂ we compare the logarithms of p₁ and p₂, which presumably fit in a Double. But actually we do even better: we first compute their GCD (by finding the mins of the corresponding exponent pairs; only Int arithmetic so far!), divide p₁ and p₂ by the GCD (by subtracting the mins from the corresponding exponents; also only Int arithmetic), and compare the logarithms of the results.
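
To make the comparison above concrete, here is a minimal self-contained sketch of it, together with a brute-force check against exact Integer values. It is only an illustration: the derived Eq instance, the explicit toComp (Power a b c) pattern (used here in place of the record wildcard so that no language extension is needed), and the helper names toValue and agreesUpTo are assumptions that do not appear in the original post.

```haskell
-- Exponent triple standing for 2^k2 * 3^k3 * 5^k5.
-- Deriving Eq is an assumption; the post does not show an Eq instance.
data Power = Power { k2 :: !Int, k3 :: !Int, k5 :: !Int }
  deriving (Eq, Show)

instance Ord Power where
  p1 `compare` p2 = toComp (p1 `divP` gcdP) `compare` toComp (p2 `divP` gcdP)
    where
      -- Division is subtraction of exponents (Int arithmetic only).
      divP a b = Power { k2 = k2 a - k2 b, k3 = k3 a - k3 b, k5 = k5 a - k5 b }
      -- The GCD takes the smaller exponent for each prime.
      gcdP = Power { k2 = min (k2 p1) (k2 p2)
                   , k3 = min (k3 p1) (k3 p2)
                   , k5 = min (k5 p1) (k5 p2) }
      -- Natural logarithm of the represented number, as a Double.
      toComp :: Power -> Double
      toComp (Power a b c) =
        fromIntegral a * log 2 + fromIntegral b * log 3 + fromIntegral c * log 5

-- Hypothetical helper: the exact value, for cross-checking only.
toValue :: Power -> Integer
toValue (Power a b c) = 2 ^ a * 3 ^ b * 5 ^ c

-- Hypothetical helper: does the log-based order match the exact order
-- for every pair of triples with all exponents in [0 .. n]?
agreesUpTo :: Int -> Bool
agreesUpTo n =
  and [ compare p q == compare (toValue p) (toValue q) | p <- ps, q <- ps ]
  where
    ps = [ Power a b c | a <- [0 .. n], b <- [0 .. n], c <- [0 .. n] ]
```

For example, compare (Power 10 5 0) (Power 8 7 0) first strips the common part 2^8 * 3^5 and then compares 2^2 with 3^2, giving LT. A check such as agreesUpTo 6 only covers small exponents, so it says nothing about where the Double comparison eventually fails.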
But, given that we go through Doubles, there will be loss of precision eventually. And this is the ground for my questions:
- When will the finite precision of Doubles bite me? That is, how to estimate the order of i, j, k for which the results of comparisons of 2^i * 3^j * 5^k with numbers with "similar" exponents will become unreliable?
- How does the fact that we go through dividing by the GCD (which presumably lowers the exponents considerably for this task) modify the answer to the previous question?
I did an experiment, comparing the numbers produced this way with the numbers produced via going through arbitrary precision arithmetic, and all Hamming numbers up to the 1'000'000'000th match exactly (which took me about 15 minutes and 600 megs of RAM to verify). But that's obviously not a proof.
Answer 1:
Empirically, the comparisons become unreliable somewhere above about the 10-trillionth Hamming number, or higher.
Using your nice GCD trick won't help us here, because some neighboring Hamming numbers are bound to have no common factors between them.
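
A small sketch can make this visible for the first few Hamming numbers: it lists adjacent pairs whose GCD is 1, which are exactly the pairs where dividing by the GCD removes nothing. The names hamming and coprimeNeighbours are hypothetical, and the snippet only inspects the start of the sequence; it does not by itself show how such pairs behave at very large indices.

```haskell
-- Merge two ascending infinite lists, dropping duplicates
-- (same definition as in the question; finite lists are not handled).
mergeUniq :: Ord a => [a] -> [a] -> [a]
mergeUniq (x:xs) (y:ys) = case x `compare` y of
  EQ -> x : mergeUniq xs ys
  LT -> x : mergeUniq xs (y:ys)
  GT -> y : mergeUniq (x:xs) ys

-- Hamming numbers in ascending order, using exact Integer arithmetic.
hamming :: [Integer]
hamming = 1 : expand 2 `mergeUniq` expand 3 `mergeUniq` expand 5
  where
    expand factor = map (factor *) hamming

-- Adjacent Hamming numbers that share no prime factor.
coprimeNeighbours :: [(Integer, Integer)]
coprimeNeighbours =
  [ (a, b) | (a, b) <- zip hamming (tail hamming), gcd a b == 1 ]
```

Evaluating take 10 coprimeNeighbours gives [(1,2),(2,3),(3,4),(4,5),(5,6),(8,9),(9,10),(15,16),(24,25),(25,27)].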
Update: trying it online on ideone and elsewhere, we get:
4T  5.81s 22.2MB  -- 16 digits used.... still good
                  -- (as evidenced by the `True` below), but really pushing it.
((True,44531.6794,7.275957614183426e-11),(16348,16503,873),"2.3509E+13405")
-- isTruly  max        min logval           nth-Hamming       approx.
--  Sorted   logval     difference           as i,j,k          value
--            in band    in band                                in decimal
10T 11.13s 26.4MB
((True,60439.6639,7.275957614183426e-11),(18187,23771,1971),"1.4182E+18194")
13T 14.44s 30.4MB ...still good
((True,65963.6432,5.820766091346741e-11),(28648,21308,1526),"1.0845E+19857")
---- same code on tio:
10T 16.77s
35T 38.84s
((True,91766.4800,5.820766091346741e-11),(13824,2133,32112),"2.9045E+27624")
70T 59.57s
((True,115619.1575,5.820766091346741e-11),(13125,13687,34799),"6.8310E+34804")
---- on home machine:
100T: 368.13s
((True,130216.1408,5.820766091346741e-11),(88324,876,17444),"9.2111E+39198")
140T: 466.69s
((True,145671.6480,5.820766091346741e-11),(9918,24002,42082),"3.4322E+43851")
170T: 383.26s ---FAULTY---
((False,155411.2501,0.0),(77201,27980,14584),"2.80508E+46783")
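
One way to read these figures is to compare the reported minimum differences with the spacing between adjacent Doubles at the magnitude of the reported log values. The sketch below computes that spacing (one "unit in the last place"); the name ulp is hypothetical, and the reading assumes the log values in the runs above are held in ordinary Doubles.

```haskell
-- Gap between a normalised, non-zero Double and the next larger
-- representable one. decodeFloat x returns (m, e) with x == m * 2^e and a
-- 53-bit mantissa m, so one step in the last place is 2^e.
-- Not meaningful for zero, denormals, infinities or NaN.
ulp :: Double -> Double
ulp x = encodeFloat 1 e
  where
    (_, e) = decodeFloat x
```

For instance, ulp 130216.1408 is 2^-36 (about 1.46e-11) and ulp 145671.6480 is 2^-35 (about 2.91e-11), so the reported minimum gaps of about 5.8e-11 are only a few such steps wide. That is consistent with the difference collapsing to 0.0 in the 170T run.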
Answer 2:
I guess that you could use adaptive arbitrary precision to compute the log.
If you choose log base 2, then log2(2^i) is trivial. That eliminates one factor, and log2 has the advantage of being easier to compute than the natural logarithm (https://en.wikipedia.org/wiki/Binary_logarithm gives an algorithm, for example; there is also Shanks...).
For log2(3) and log2(5), you would expand just enough terms to distinguish the two operands. I don't know whether that would take more operations than directly exponentiating 3^j and 5^k in large-integer arithmetic and counting the high bit... But those could be pre-tabulated up to the required number of digits.
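
As a rough sketch of the adaptive idea, the code below uses a simpler stand-in: it does not compute arbitrary-precision logarithms, but compares with Doubles first and falls back to exact Integer arithmetic only when the two logarithms are too close to trust. The names diffs, exactCompare and hybridCompare are hypothetical, and the 2^-48 safety margin is a deliberately generous assumption, not a proven error bound.

```haskell
-- Exponent triple standing for 2^k2 * 3^k3 * 5^k5.
data Power = Power { k2 :: !Int, k3 :: !Int, k5 :: !Int }
  deriving (Eq, Show)

-- Exponent differences of p / q for the primes 2, 3 and 5.
diffs :: Power -> Power -> [Int]
diffs p q = [k2 p - k2 q, k3 p - k3 q, k5 p - k5 q]

-- Exact comparison: positive differences stay on p's side, negative ones
-- move to q's side, which is the same as dividing both by their GCD.
exactCompare :: Power -> Power -> Ordering
exactCompare p q = compare (prod pos) (prod neg)
  where
    ds      = diffs p q
    pos     = map (max 0) ds
    neg     = map (max 0 . negate) ds
    prod es = product (zipWith (^) [2, 3, 5 :: Integer] es)

-- Fast path in Double; exact fallback when the result is within the margin.
hybridCompare :: Power -> Power -> Ordering
hybridCompare p q
  | abs d > slack = compare d 0
  | otherwise     = exactCompare p q
  where
    ds    = map fromIntegral (diffs p q) :: [Double]
    terms = zipWith (*) ds [log 2, log 3, log 5]
    d     = sum terms                                   -- approximates ln (p / q)
    slack = 2 ^^ (-48 :: Int) * sum (map abs terms)     -- assumed error margin
```

For example, hybridCompare (Power 3 0 0) (Power 0 2 0) takes the fast path and returns LT (8 against 9), while two equal triples give d = 0 and fall through to the exact path, which returns EQ.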
Source: https://stackoverflow.com/questions/60803224/hamming-numbers-and-double-precision