A taxicab number is an integer that can be expressed as the sum of two cubes of integers in two different ways: a^3+b^3 = c^3+d^3
. Design an algorithm to find all t
A easy way of understanding Time complexity O(N^2 logN), space complexity O(N) is to think it as a merge of N
sorted arrays plus a bookkeeping of the previously merged element.
But this takes N^2 space. Can we do better?
There exists an O(N) space solution based on a priority queue. Time complexity is O(N^2 logN). To sketch out the idea of the algorithm, here is the matrix M such that M[i][j] = i^3 + j^3 (of course, the matrix is never created in memory):
0 1 8 27 64 125
1 2 9 28 65 126
8 9 16 35 72 133
27 28 35 54 91 152
64 65 72 91 128 189
125 126 133 152 189 250
Observe that every line and every row is sorted in ascending order. Let PQ be the priority queue. First we put the biggest element in the priority queue. Then perform the following, as long as the PQ is not empty:
Note that
Everytime the PQ issues the same value twice then we have found a taxicab number.
As an illustration, here is an implementation in C++. The time complexity is O(N^2 logN) and space complexity O(N).
#include <iostream>
#include <cassert>
#include <queue>
using namespace std;
typedef unsigned int value_type;
struct Square
{
value_type i;
value_type j;
value_type sum_of_cubes;
Square(value_type i, value_type j) : i(i), j(j), sum_of_cubes(i*i*i+j*j*j) {}
friend class SquareCompare;
bool taxicab(const Square& sq) const
{
return sum_of_cubes == sq.sum_of_cubes && i != sq.i && i != sq.j;
}
friend ostream& operator<<(ostream& os, const Square& sq);
};
class SquareCompare
{
public:
bool operator()(const Square& a, const Square& b)
{
return a.sum_of_cubes < b.sum_of_cubes;
}
};
ostream& operator<<(ostream& os, const Square& sq)
{
return os << sq.i << "^3 + " << sq.j << "^3 = " << sq.sum_of_cubes;
}
int main()
{
const value_type N=2001;
value_type count = 0;
bool in_i [N];
bool in_j [N];
for (value_type i=0; i<N; i++) {
in_i[i] = false;
in_j[i] = false;
}
priority_queue<Square, vector<Square>, SquareCompare> p_queue;
p_queue.push(Square(N-1, N-1));
in_i[N-1] = true;
in_j[N-1] = true;
while(!p_queue.empty()) {
Square sq = p_queue.top();
p_queue.pop();
in_i[sq.i] = false;
in_j[sq.j] = false;
// cout << "pop " << sq.i << " " << sq.j << endl;
if (sq.i > 0 && !in_i[sq.i - 1] && sq.i-1 >= sq.j) {
p_queue.push(Square(sq.i-1, sq.j));
in_i[sq.i-1] = true;
in_j[sq.j] = true;
// cout << "push " << sq.i-1 << " " << sq.j << endl;
}
if (sq.j > 0 && !in_j[sq.j-1] && sq.i >= sq.j - 1) {
p_queue.push(Square(sq.i, sq.j-1));
in_i[sq.i] = true;
in_j[sq.j - 1] = true;
// cout << "push " << sq.i << " " << sq.j-1 << endl;
}
if (sq.taxicab(p_queue.top())) {
/* taxicab number */
cout << sq << " " << p_queue.top() << endl;
count++;
}
}
cout << endl;
cout << "there are " << count << " taxicab numbers with a, b, c, d < " << N << endl;
return 0;
}
The answers given by Novneet Nov and user3017842 are both correct ideas for finding the taxicab numbers with storage O(N) using minHeap. Just a little bit more explanation why the minHeap of size N works. First, if you had all the sums (O(N^2)) and could sort them (O(N^2lgN)) you would just pick the duplicates as you traverse the sorted array. Well, in our case using a minHeap we can traverse in-order all the sums: we just need to ensure that the minHeap always contains the minimum unprocessed sum.
Now, we have a huge number of sums (O(N^2)). But, notice that this number can be split into N groups each of which has an easily defined minimum!
(fix a
, change b
from 0 to N-1
=> here are your N
groups. The sum in one group with a smaller b
is smaller than one with a bigger b
in the same group - because a
is the same).
The minimum of union of these groups is in the union of mins of these groups. Therefore, if you keep all minimums of these groups in the minHeap you are guaranteed to have the total minimum in the minHeap.
Now, when you extract Min from the heap, you just add next smallest element from the group of this extracted min (so if you extracted (a, b)
you add (a, b+1)
) and you are guaranteed that your minHeap still contains the next unprocessed min of all the sums.
version1 uses List and sorting
O(n^2*logn) time and O(n^2) space
public static void Taxicab1(int n)
{
// O(n^2) time and O(n^2) space
var list = new List<int>();
for (int i = 1; i <= n; i++)
{
for (int j = i; j <= n; j++)
{
list.Add(i * i * i + j * j * j);
}
}
// O(n^2*log(n^2)) time
list.Sort();
// O(n^2) time
int prev = -1;
foreach (var next in list)
{
if (prev == next)
{
Console.WriteLine(prev);
}
prev = next;
}
}
version2 uses HashSet
O(n^2) time and O(n^2) space
public static void Taxicab2(int n)
{
// O(n^2) time and O(n^2) space
var set = new HashSet<int>();
for (int i = 1; i <= n; i++)
{
for (int j = i; j <= n; j++)
{
int x = i * i * i + j * j * j;
if (!set.Add(x))
{
Console.WriteLine(x);
}
}
}
}
version3 uses min oriented Priority Queue
O(n^2*logn) time and O(n) space
public static void Taxicab3(int n)
{
// O(n) time and O(n) space
var pq = new MinPQ<SumOfCubes>();
for (int i = 1; i <= n; i++)
{
pq.Push(new SumOfCubes(i, i));
}
// O(n^2*logn) time
var sentinel = new SumOfCubes(0, 0);
while (pq.Count > 0)
{
var current = pq.Pop();
if (current.Result == sentinel.Result)
Console.WriteLine($"{sentinel.A}^3+{sentinel.B}^3 = {current.A}^3+{current.B}^3 = {current.Result}");
if (current.B <= n)
pq.Push(new SumOfCubes(current.A, current.B + 1));
sentinel = current;
}
}
where SummOfCubes
public class SumOfCubes : IComparable<SumOfCubes>
{
public int A { get; private set; }
public int B { get; private set; }
public int Result { get; private set; }
public SumOfCubes(int a, int b)
{
A = a;
B = b;
Result = a * a * a + b * b * b;
}
public int CompareTo(SumOfCubes other)
{
return Result.CompareTo(other.Result);
}
}
github
It seems like a simple brute-force algorithm with proper bounds solves it in time proportional to n^1.33 and space proportional to n. Or could anyone point me to the place where I'm mistaken?
Consider 4 nested loops, each running from 1 to cubic root of n. Using these loops we can go over all possible combinations of 4 values and find the pairs forming taxicab numbers. It means each loop takes time proportional to cubic root of n, or n^(1/3). Multiply this value 4 times and get:
(n^(1/3)^4 = n^(4/3) = n^1.33
I wrote a solution in JavaScript and benchmarked it, and it seems to be working. One caveat is that the result is only partially sorted.
Here is my JavaScript code (it's not optimal yet, could be optimized even more):
function taxicab(n) {
let a = 1, b = 1, c = 1, d = 1,
cubeA = a**3 + b**3,
cubeB = c**3 + d**3,
results = [];
while (cubeA < n) { // loop over a
while (cubeA < n) { // loop over b
// avoid running nested loops if this number is already in results
if (results.indexOf(cubeA) === -1) {
while (cubeB <= cubeA) { // loop over c
while (cubeB <= cubeA) { // loop over d
if (cubeB === cubeA && a!=c && a!=d) { // found a taxicab number!
results.push(cubeA);
}
d++;
cubeB = c**3 + d**3;
} // end loop over d
c++;
d = c;
cubeB = c**3 + d**3;
} // end loop over c
}
b++;
cubeA = a**3 + b**3;
c = d = 1;
cubeB = c**3 + d**3;
} // end loop over d
a++;
b = a;
cubeA = a**3 + b**3;
} // end loop over a
return results;
}
Running taxicab(1E8)
takes around 30 seconds in a browser console and yields 485 numbers as a result. Ten times smaller value taxicab(1E7)
(10 millions) takes almost 1.4 seconds and yields 150 numbers. 10^1.33 * 1.4 = 29.9
, i.e. multiplying n by 10 leads to the running time increased by 10^1.33 times. The result array is unsorted, but after quickly sorting it we get correct result, as it seems:
[1729, 4104, 13832, 20683, 32832, 39312, 40033, 46683, 64232, 65728,
110656, 110808, 134379, 149389, 165464, 171288, 195841, 216027, 216125,
262656, 314496, 320264, 327763, 373464, 402597, 439101, 443889, 513000,
513856, 515375, 525824, 558441, 593047, 684019, 704977, 805688, 842751,
885248, 886464, 920673, 955016, 984067, 994688, 1009736, 1016496, 1061424,
1073375, 1075032, 1080891, 1092728, 1195112, 1260441, 1323712, 1331064,
1370304, 1407672, 1533357, 1566728, 1609272, 1728216, 1729000, 1734264,
1774656, 1845649, 2048391, 2101248, 2301299, 2418271, 2515968, 2562112,
2585375, 2622104, 2691451, 2864288, 2987712, 2991816, 3220776, 3242197,
3375001, 3375008, 3511872, 3512808, 3551112, 3587409, 3628233, 3798613,
3813992, 4033503, 4104000, 4110848, 4123000, 4174281, 4206592, 4342914,
4467528, 4505949, 4511808, 4607064, 4624776, 4673088, …]
Here is a code for benchmarking:
// run taxicab(n) for k trials and return the average running time
function benchmark(n, k) {
let t = 0;
k = k || 1; // how many times to repeat the trial to get an averaged result
for(let i = 0; i < k; i++) {
let t1 = new Date();
taxicab(n);
let t2 = new Date();
t += t2 - t1;
}
return Math.round(t/k);
}
Finally, I tested it:
let T = benchmark(1E7, 3); // 1376 - running time for n = 10 million
let T2 = benchmark(2E7, 3);// 4821 - running time for n = 20 million
let powerLaw = Math.log2(T2/T); // 1.3206693816701993
So it means time is proportional to n^1.32 in this test. Repeating this many times with different values always yields around the same result: from 1.3 to 1.4.
I found the solution/code here : Time complexity O(N^2 logN), space complexity O(N) The solution is implemented by help of priority queues.
Reverse thinking can be easily done by looking at the code. It can be done in an array of size N because the min sums are deleted from the array after comparing to the next minimum and then the array is made to size N by adding a new sum - (i^3 + (j+1)^3).
A intuitive proof is here :
Initially, we have added (1,1),(2,2),(3,3),...,(N,N) in the min-priority queue.
Suppose a^+b^3=c^3+d^3, and (a,b) is the minimum that will be taken out of the priority queue next. To be able to detect this taxicab number, (c,d) must also be in the priority queue which would be taken out after (a,b).
Note: We would be adding (a,b+1) after extracting (a,b) so there is no way that extraction of (a,b) would result in addition of (c,d) to the priority queue, so it must already exist in the priority queue.
Now lets assume that (c,d) is not in the priority queue, because we haven't gotten to it yet. Instead, there is some (c,d−k) in the priority queue where k>0.
Since (a,b) is being taken out, a^3+b^3≤c^3+(d−k)^3
However, a^3+b^3=c^3+d^3
Therefore,
c^3+d^3≤c^3+(d−k)^3 d≤d−k k≤0
Since k>0, this is impossible. Thus our assumption can never come to pass. Thus for every (a,b) which is being removed from the min-PQ, (c,d) is already in the min-PQ (or was just removed) if a^3+b^3=c^3+d^3