This is a program I wrote to calculate Pythagorean triplets. When I run the program it prints each set of triplets twice because of the if statement. Is there any way I can
I just extended Kyle Gullion's answer so that triples are sorted by hypotenuse, then longest side.
It doesn't use numpy, but requires a SortedCollection (or SortedList) such as this one
def primitive_triples():
    """ generates primitive Pythagorean triplets x<y<z
    sorted by hypotenuse z, then longest side y
    through Berggren's matrices and breadth first traversal of ternary tree
    :see: https://en.wikipedia.org/wiki/Tree_of_primitive_Pythagorean_triples
    """
    key=lambda x:(x[2],x[1])
    triples=SortedCollection(key=key)
    triples.insert([3,4,5])
    A = [[ 1,-2, 2], [ 2,-1, 2], [ 2,-2, 3]]
    B = [[ 1, 2, 2], [ 2, 1, 2], [ 2, 2, 3]]
    C = [[-1, 2, 2], [-2, 1, 2], [-2, 2, 3]]
    while triples:
        (a,b,c) = triples.pop(0)
        yield (a,b,c)
        # expand this triple to 3 new triples using Berggren's matrices
        for X in [A,B,C]:
            triple=[sum(x*y for (x,y) in zip([a,b,c],X[i])) for i in range(3)]
            if triple[0]>triple[1]: # ensure x<y<z
                triple[0],triple[1]=triple[1],triple[0]
            triples.insert(triple)
def triples():
    """ generates all Pythagorean triplets x<y<z
    sorted by hypotenuse z, then longest side y
    """
    prim=[] #list of primitive triples up to now
    key=lambda x:(x[2],x[1])
    samez=SortedCollection(key=key) # temp triplets with same z
    buffer=SortedCollection(key=key) # temp for triplets with smaller z
    for pt in primitive_triples():
        z=pt[2]
        if samez and z!=samez[0][2]: #flush samez
            while samez:
                yield samez.pop(0)
        samez.insert(pt)
        #build buffer of smaller multiples of the primitives already found
        for i,pm in enumerate(prim):
            p,m=pm[0:2]
            while True:
                mz=m*p[2]
                if mz < z:
                    buffer.insert(tuple(m*x for x in p))
                elif mz == z:
                    # we need another buffer because next pt might have
                    # the same z as the previous one, but a smaller y than
                    # a multiple of a previous pt ...
                    samez.insert(tuple(m*x for x in p))
                else:
                    break
                m+=1
            prim[i][1]=m #update multiplier for next loops
        while buffer: #flush buffer
            yield buffer.pop(0)
        prim.append([pt,2]) #add primitive to the list
The code is available in the math2 module of my Python library. It is tested against some series of the OEIS (code here at the bottom), which just enabled me to find a mistake in A121727 :-)
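If you don't have a SortedCollection at hand, the same ordering of primitives can probably be had from the standard library's heapq. This is only a sketch under that assumption: the name primitive_triples_heap is mine, and entries are stored as (z, y, x) so that the heap orders by hypotenuse first, then by the longer leg. It works because every child in Berggren's tree has a larger hypotenuse than its parent.

```python
import heapq
from itertools import islice

def primitive_triples_heap():
    """Yield primitive triples (x, y, z) with x < y < z, ordered by (z, y)."""
    heap = [(5, 4, 3)]  # stored as (z, y, x): heap order is z first, then y
    while heap:
        z, y, x = heapq.heappop(heap)
        yield (x, y, z)
        # the three Berggren children of (x, y, z)
        children = (
            (x - 2*y + 2*z, 2*x - y + 2*z, 2*x - 2*y + 3*z),
            (x + 2*y + 2*z, 2*x + y + 2*z, 2*x + 2*y + 3*z),
            (-x + 2*y + 2*z, -2*x + y + 2*z, -2*x + 2*y + 3*z),
        )
        for a, b, c in children:
            if a > b:  # keep the shorter leg first
                a, b = b, a
            heapq.heappush(heap, (c, b, a))

print(list(islice(primitive_triples_heap(), 5)))
# [(3, 4, 5), (5, 12, 13), (8, 15, 17), (7, 24, 25), (20, 21, 29)]
```

The generator is infinite, so take what you need with islice or break out of the loop yourself.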
A non-numpy version of the Hall/Roberts approach is
def pythag3(limit=None, all=False):
    """generate Pythagorean triples which are primitive (default)
    or without restriction (when ``all`` is True). The elements
    returned in the tuples are sorted with the smallest first.

    Examples
    ========
    >>> list(pythag3(20))
    [(3, 4, 5), (8, 15, 17), (5, 12, 13)]
    >>> list(pythag3(20, True))
    [(3, 4, 5), (6, 8, 10), (9, 12, 15), (12, 16, 20), (8, 15, 17), (5, 12, 13)]
    """
    if limit and limit < 5:
        return
    m = [(3,4,5)] # primitives stored here
    while m:
        x, y, z = m.pop()
        if x > y:
            x, y = y, x
        yield (x, y, z)
        if all:
            a, b, c = x, y, z
            while 1:
                c += z
                if c > limit:
                    break
                a += x
                b += y
                yield a, b, c
        # new primitives
        a = x - 2*y + 2*z, 2*x - y + 2*z, 2*x - 2*y + 3*z
        b = x + 2*y + 2*z, 2*x + y + 2*z, 2*x + 2*y + 3*z
        c = -x + 2*y + 2*z, -2*x + y + 2*z, -2*x + 2*y + 3*z
        for d in (a, b, c):
            if d[2] <= limit:
                m.append(d)
It's slower than the numpy-coded version, but the primitives with largest element less than or equal to 10^6 are generated on my slow machine in about 1.4 seconds. (And the list m never grew beyond 18 elements.)
You can try this
triplets=[]
for a in range(1,100):
    for b in range(1,100):
        for c in range(1,100):
            if a**2 + b**2==c**2:
                i=sorted([a,b,c])
                # the mirrored pair (b, a, c) sorts to the same list, so keep it only once
                if i not in triplets:
                    triplets.append(i)
print(triplets)
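Another way to avoid the duplicates, if you are free to change the loop bounds, is to only ever look at a < b < c, so the mirrored pair is never generated in the first place. A rough sketch (the function name and the limit parameter are mine, not from the question):

```python
def brute_force_triplets(limit=100):
    """Return all triplets (a, b, c) with a < b < c < limit, each exactly once."""
    found = []
    for a in range(1, limit):
        for b in range(a + 1, limit):      # b starts above a: no mirrored pairs
            for c in range(b + 1, limit):  # the hypotenuse is the longest side
                if a*a + b*b == c*c:
                    found.append((a, b, c))
    return found

print(brute_force_triplets(21))
# [(3, 4, 5), (5, 12, 13), (6, 8, 10), (8, 15, 17), (9, 12, 15), (12, 16, 20)]
```

It is still three nested loops, so it only makes sense for small limits.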
Substantially faster than any of the solutions so far. Finds triplets via a ternary tree.
Wolfram says:
Hall (1970) and Roberts (1977) prove that (a,b,c) is a primitive Pythagorean triple if and only if
(a,b,c)=(3,4,5)M
where M is a finite product of the matrices U,A,D.
And there we have a formula to generate every primitive triple.
In the above formula, the hypotenuse is ever growing so it's pretty easy to check for a max length.
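To make a single step of that formula concrete, here is a small pure-Python sketch that multiplies the row vector (3, 4, 5) by each of the three matrices. The helper name row_times_matrix is mine; the matrix entries are the same ones used in the numpy code in this answer.

```python
U = [[1, 2, 2], [-2, -1, -2], [2, 2, 3]]
A = [[1, 2, 2], [2, 1, 2], [2, 2, 3]]
D = [[-1, -2, -2], [2, 1, 2], [2, 2, 3]]

def row_times_matrix(row, matrix):
    """Multiply a 3-element row vector by a 3x3 matrix."""
    return tuple(sum(row[i] * matrix[i][j] for i in range(3)) for j in range(3))

children = [row_times_matrix((3, 4, 5), M) for M in (U, A, D)]
print(children)
# [(5, 12, 13), (21, 20, 29), (15, 8, 17)]
```

Each child has a larger hypotenuse than (3, 4, 5), which is why a branch can be cut off as soon as it passes the limit.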
In Python:
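import numpy as np

def gen_prim_pyth_trips(limit=None):
    # plain ndarrays: np.dot only needs the stacked 3x3x3 array (no np.mat required)
    u = np.array([[ 1,  2,  2], [-2, -1, -2], [ 2,  2,  3]])
    a = np.array([[ 1,  2,  2], [ 2,  1,  2], [ 2,  2,  3]])
    d = np.array([[-1, -2, -2], [ 2,  1,  2], [ 2,  2,  3]])
    uad = np.array([u, a, d])
    m = np.array([3, 4, 5])
    while m.size:
        m = m.reshape(-1, 3)
        if limit:
            m = m[m[:, 2] <= limit]  # drop triples whose hypotenuse exceeds the limit
        yield from m
        m = np.dot(m, uad)  # one new generation: every triple times U, A and D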
If you'd like all triples and not just the primitives:
def gen_all_pyth_trips(limit):
    for prim in gen_prim_pyth_trips(limit):
        i = prim
        for _ in range(limit//prim[2]):
            yield i
            i = i + prim
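The scaling step can be looked at in isolation with plain tuples. This is just a sketch; scale_up is a hypothetical helper and not part of the code above.

```python
def scale_up(prim, limit):
    """Yield k * prim for k = 1, 2, ... while the hypotenuse stays <= limit."""
    x, y, z = prim
    for k in range(1, limit // z + 1):
        yield (k * x, k * y, k * z)

print(list(scale_up((3, 4, 5), 20)))
# [(3, 4, 5), (6, 8, 10), (9, 12, 15), (12, 16, 20)]
```

Every non-primitive triple is an integer multiple of exactly one primitive, so scaling each primitive this way produces each triple once.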
list(gen_prim_pyth_trips(10**4)) took 2.81 milliseconds to come back with 1593 elements, while list(gen_all_pyth_trips(10**4)) took 19.8 milliseconds to come back with 12471 elements.
For reference, the accepted answer (in Python) took 38 seconds for 12471 elements.
Just for fun, setting the upper limit to one million, list(gen_all_pyth_trips(10**6)) returns in 2.66 seconds with 1980642 elements (almost 2 million triples in 3 seconds). list(gen_all_pyth_trips(10**7)) brings my computer to its knees as the list gets so large it consumes every last bit of RAM. Doing something like sum(1 for _ in gen_all_pyth_trips(10**7)) gets around that limitation and returns in 30 seconds with 23471475 elements.
For more information on the algorithm used, check out the articles on Wolfram and Wikipedia.
def pyth_triplets(n=1000):
    "Version 1"
    for x in range(1, n):
        x2= x*x # time saver
        for y in range(x+1, n): # y > x
            z2= x2 + y*y
            zs= int(z2**.5)
            if zs*zs == z2:
                yield x, y, zs
>>> print(list(pyth_triplets(20)))
[(3, 4, 5), (5, 12, 13), (6, 8, 10), (8, 15, 17), (9, 12, 15), (12, 16, 20)]
V.1 algorithm has monotonically increasing x values.
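One thing to watch in V.1 is that int(z2**.5) goes through a float, which can be off by one for very large z2. If that worries you, a variant using math.isqrt (Python 3.8+) keeps everything in integers. A sketch only; the name pyth_triplets_isqrt is mine:

```python
from math import isqrt

def pyth_triplets_isqrt(n=1000):
    """Same search as Version 1, but with an exact integer square root."""
    for x in range(1, n):
        x2 = x * x
        for y in range(x + 1, n):  # y > x
            z2 = x2 + y * y
            z = isqrt(z2)          # floor of the exact square root, no floats
            if z * z == z2:
                yield x, y, z

print(list(pyth_triplets_isqrt(20)))
# [(3, 4, 5), (5, 12, 13), (6, 8, 10), (8, 15, 17), (9, 12, 15), (12, 16, 20)]
```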
It seems this question is still alive :)
Since I came back and revisited the code, I tried a second approach which is almost 4 times as fast (about 26% of CPU time for N=10000) as my previous suggestion since it avoids lots of unnecessary calculations:
def pyth_triplets(n=1000):
    "Version 2"
    for z in range(5, n+1):
        z2= z*z # time saver
        x= x2= 1
        y= z - 1; y2= y*y
        while x < y:
            x2_y2= x2 + y2
            if x2_y2 == z2:
                yield x, y, z
                x+= 1; x2= x*x
                y-= 1; y2= y*y
            elif x2_y2 < z2:
                x+= 1; x2= x*x
            else:
                y-= 1; y2= y*y
>>> print(list(pyth_triplets(20)))
[(3, 4, 5), (6, 8, 10), (5, 12, 13), (9, 12, 15), (8, 15, 17), (12, 16, 20)]
Note that this algorithm has increasing z values.
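The inner loop of V.2 is a two-pointer search: x climbs from 1 while y descends from z - 1, and the sum of squares tells you which pointer to move. Pulled out for a single hypotenuse it might look like the sketch below (legs_for_hypotenuse is a name I made up for illustration):

```python
def legs_for_hypotenuse(z):
    """Return every (x, y) with x < y and x*x + y*y == z*z."""
    z2 = z * z
    x, y = 1, z - 1
    pairs = []
    while x < y:
        total = x * x + y * y
        if total == z2:
            pairs.append((x, y))
            x += 1
            y -= 1
        elif total < z2:
            x += 1  # sum too small: grow the short leg
        else:
            y -= 1  # sum too large: shrink the long leg
    return pairs

print(legs_for_hypotenuse(25))
# [(7, 24), (15, 20)]
```

Each value of z costs at most about z steps, instead of testing every (x, y) pair.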
If the algorithm was converted to C (where, being closer to the metal, multiplications take more time than additions), one could minimise the necessary multiplications, given the fact that the step between consecutive squares is:
(x+1)² - x² = (x+1)(x+1) - x² = x² + 2x + 1 - x² = 2x + 1
so all of the inner x2= x*x and y2= y*y would be converted to additions and subtractions like this:
def pyth_triplets(n=1000):
    "Version 3"
    for z in range(5, n+1):
        z2= z*z # time saver
        x= x2= 1; xstep= 3 # xstep = (x+1)**2 - x**2 = 2*x + 1
        y= z - 1; y2= y*y; ystep= 2*y - 1 # ystep = y**2 - (y-1)**2
        while x < y:
            x2_y2= x2 + y2
            if x2_y2 == z2:
                yield x, y, z
                x+= 1; x2+= xstep; xstep+= 2
                y-= 1; y2-= ystep; ystep-= 2
            elif x2_y2 < z2:
                x+= 1; x2+= xstep; xstep+= 2
            else:
                y-= 1; y2-= ystep; ystep-= 2
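The step trick is easier to see on its own. The following sketch builds the first few squares with additions only, using the 2x + 1 difference from above (squares_by_addition is an illustrative name, not something V.3 uses):

```python
def squares_by_addition(count):
    """Return [0, 1, 4, 9, ...] (count items) without any multiplication."""
    squares = []
    value, step = 0, 1  # step is always 2*x + 1 for the current x
    for _ in range(count):
        squares.append(value)
        value += step   # (x+1)^2 = x^2 + (2x + 1)
        step += 2       # the next difference is 2 larger
    return squares

print(squares_by_addition(6))
# [0, 1, 4, 9, 16, 25]
```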
Of course, in Python the extra bytecode produced actually slows down the algorithm compared to version 2, but I would bet (without checking :) that V.3 is faster in C.
Cheers everyone :)
Algorithms can be tuned for speed, memory usage, simplicity, and other things.
Here is a pythagore_triplets algorithm tuned for speed, at the cost of memory usage and simplicity. If all you want is speed, this could be the way to go.
Calculation of list(pythagore_triplets(10000)) takes 40 seconds on my computer, versus 63 seconds for ΤΖΩΤΖΙΟΥ's algorithm, and possibly days of calculation for Tafkas's algorithm (and all other algorithms which use 3 embedded loops instead of just 2).
def pythagore_triplets(n=1000):
    maxn=int(n*(2**0.5))+1 # max int whose square may be the sum of two squares
    squares=[x*x for x in range(maxn+1)] # calculate all the squares once
    reverse_squares=dict([(squares[i],i) for i in range(maxn+1)]) # x*x=>x
    for x in range(1,n):
        x2 = squares[x]
        for y in range(x,n+1): # y may equal n, so z can exceed n
            y2 = squares[y]
            z = reverse_squares.get(x2+y2)
            if z is not None:
                yield x,y,z
>>> print(list(pythagore_triplets(20)))
[(3, 4, 5), (5, 12, 13), (6, 8, 10), (8, 15, 17), (9, 12, 15), (12, 16, 20), (15, 20, 25)]
Note that if you are going to calculate the first billion triplets, then this algorithm will crash before it even starts, because of an out of memory error. So ΤΖΩΤΖΙΟΥ's algorithm is probably a safer choice for high values of n.
BTW, here is Tafkas's algorithm, translated into Python for the purpose of my performance tests. Its flaw is to require 3 loops instead of 2.
def gcd(a, b):
    while b != 0:
        t = b
        b = a%b
        a = t
    return a

def find_triple(upper_boundary=1000):
    for c in range(5,upper_boundary+1):
        for b in range(4,c):
            for a in range(3,b):
                if (a*a + b*b == c*c and gcd(a,b) == 1):
                    yield a,b,c
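Note that because of the gcd(a,b) == 1 test, this one only returns primitive triplets, so its output is not directly comparable with the other functions here. A quick sketch to see that, using math.gcd from the standard library instead of the hand-written gcd (find_primitive_triples is my own name for it):

```python
from math import gcd

def find_primitive_triples(upper_boundary=1000):
    """Three nested loops, keeping only triples whose legs are coprime."""
    for c in range(5, upper_boundary + 1):
        for b in range(4, c):
            for a in range(3, b):
                if a*a + b*b == c*c and gcd(a, b) == 1:
                    yield a, b, c

print(list(find_primitive_triples(20)))
# [(3, 4, 5), (5, 12, 13), (8, 15, 17)]
```

The multiples (6, 8, 10), (9, 12, 15) and (12, 16, 20) are filtered out.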