I was trying to solve the 3 sum problem in cpp.
Given an array S of n integers, are there elements a, b, c in S such that a + b + c = 0? Find all unique triplets in the
You can do it in O(n²) with something like:
#include <algorithm>
#include <iostream>
#include <vector>

std::vector<std::vector<int>> threeSum(std::vector<int>& nums) {
    std::sort(nums.begin(), nums.end());
    std::vector<std::vector<int>> res;
    for (auto it = nums.begin(); it != nums.end(); ++it) {
        auto left = it + 1;
        auto right = nums.rbegin();
        // right.base() - 1 is the element *right refers to, so stop before
        // left reaches it; otherwise one element would be counted twice.
        while (left < right.base() - 1) {
            auto sum = *it + *left + *right;
            if (sum < 0) {
                ++left;
            } else if (sum > 0) {
                ++right;  // reverse iterator: this moves toward the front
            } else {
                res.push_back({*it, *left, *right});
                std::cout << *it << " " << *left << " " << *right << std::endl;
                ++left;
                ++right;
            }
        }
    }
    return res;
}
I leave duplicate handling as an exercise.
Here is my solution that finds all unique triplets in O(n^2) run-time.
#include <algorithm>
#include <vector>
using namespace std;

class Solution {
public:
    vector<vector<int>> threeSum(vector<int>& nums) {
        int len = nums.size();
        if (len < 3) return {};
        sort(nums.begin(), nums.end());
        vector<vector<int>> retVector;
        int target, begin, end;
        int i = 0;
        while (i < len - 2)
        {
            int dup; // value whose duplicate entries are being skipped
            target = -nums[i];
            begin = i + 1; end = len - 1;
            while (begin < end)
            {
                if (nums[begin] + nums[end] < target) begin++;
                else if (nums[begin] + nums[end] > target) end--;
                else
                {
                    retVector.push_back({nums[i], nums[begin], nums[end]});
                    // it's time to skip duplicates; the bounds checks keep
                    // begin and end from running off the array
                    dup = nums[begin];
                    do begin++; while (begin < end && nums[begin] == dup); // skipping from the front
                    dup = nums[end];
                    do end--; while (begin < end && nums[end] == dup); // skipping from the back
                }
            }
            dup = nums[i];
            do i++; while (i < len - 2 && nums[i] == dup); // skipping all entries equal to nums[i]
        }
        return retVector;
    }
};
The source of the extra complexity is the third loop, which brings the time complexity of your code to O(n³).
The key observation here is that once you have two numbers, the third number is fixed, so you do not need to loop around to find it: use a hash table to see whether it's there in O(1) expected time. For example, if your first loop looks at value 56 and your second loop looks at value -96, the third value must be 40 in order to yield a zero total.
If the range of numbers is reasonably small (say, -10000..10000) you can use an array instead.
This would bring the time complexity to O(n²), which should be a noticeable improvement in running time.
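To make the idea concrete, here is a minimal sketch of the hash-table lookup, not a definitive implementation. The name `threeSumHashed` and the `lastIndex` map are my own choices for illustration. It assumes the sum of any three elements fits in an `int`, and it sorts a copy of the input first so that duplicates can be skipped by comparing neighbours.

```cpp
#include <algorithm>
#include <cstddef>
#include <unordered_map>
#include <vector>

// Hypothetical name; assumes sums of three elements do not overflow int.
std::vector<std::vector<int>> threeSumHashed(std::vector<int> nums) {
    std::sort(nums.begin(), nums.end());

    // Map each value to the last index at which it occurs in the sorted array.
    std::unordered_map<int, std::size_t> lastIndex;
    for (std::size_t k = 0; k < nums.size(); ++k) {
        lastIndex[nums[k]] = k;
    }

    std::vector<std::vector<int>> res;
    for (std::size_t i = 0; i < nums.size(); ++i) {
        if (i > 0 && nums[i] == nums[i - 1]) continue;  // same first value as before
        for (std::size_t j = i + 1; j < nums.size(); ++j) {
            if (j > i + 1 && nums[j] == nums[j - 1]) continue;  // same second value
            const int need = -nums[i] - nums[j];  // the third value is fixed
            const auto it = lastIndex.find(need);
            // Require an index after j so that no element is used twice.
            if (it != lastIndex.end() && it->second > j) {
                res.push_back({nums[i], nums[j], need});
            }
        }
    }
    return res;
}
```

Storing the last index rather than just membership is what lets the lookup reject a "third" value that is really the element at `i` or `j`.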
A couple of possibilities:
First, construct a hash table of all entries in the vector up front, then remove the third loop. Inside the second loop, simply check whether -nums[i] - nums[j] exists in the hash table. That should bring your time complexity from O(n³) back to something closer to O(n²).
Second, function calls aren't free though an optimiser can sometimes improve that considerably. There's no performance reason why you should be calling a function to check if three numbers add to zero so you could replace:
if (sumToZero(i, j, k, nums)) {
with:
if (nums[i] + nums[j] == -nums[k]) {
Of course, this is rendered moot if you adopt the first suggestion.
Third, don't check and insert the possible result every time you get one. Just add it to the vector no matter what. Then, at the end, sort the vector and remove any duplicates. That should hopefully speed things up a bit as well.
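As a rough sketch of that sort-then-deduplicate step, something along these lines could work. The helper name `removeDuplicateTriplets` is hypothetical, and it assumes two triplets count as duplicates when they hold the same three values in any order, which is why each triplet is sorted first.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical helper: normalises each triplet, then sorts the whole list
// and erases adjacent repeats.
void removeDuplicateTriplets(std::vector<std::vector<int>>& triplets) {
    for (auto& t : triplets) {
        std::sort(t.begin(), t.end());  // so {1, 0, -1} and {-1, 0, 1} compare equal
    }
    std::sort(triplets.begin(), triplets.end());  // lexicographic order
    triplets.erase(std::unique(triplets.begin(), triplets.end()),
                   triplets.end());
}
```

You would call this once after the loops finish, instead of searching the result vector on every hit.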
Fourth, there's quite possibly a performance hit for using a vector for the potential result when an int[3] would do just as well. Vectors are ideal if you need something with a variable size but, if the minimum and maximum sizes of an array-type collection are always going to be the same fixed value, raw arrays are fine.
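One caveat if you try this: a raw `int[3]` cannot be used directly as the element type of a `std::vector`, so the sketch below uses `std::array<int, 3>`, which is the same fixed-size storage wrapped in a copyable type. That substitution is mine, not part of the suggestion above, and the `Triplet` alias and `appendIfZeroSum` helper are hypothetical names for illustration.

```cpp
#include <array>
#include <vector>

using Triplet = std::array<int, 3>;  // hypothetical alias: fixed size, no heap allocation per triplet

// Hypothetical helper: appends {a, b, c} to out when the three values sum
// to zero, and reports whether it did. Assumes a + b + c does not overflow.
bool appendIfZeroSum(std::vector<Triplet>& out, int a, int b, int c) {
    if (a + b + c != 0) {
        return false;
    }
    out.push_back(Triplet{{a, b, c}});
    return true;
}
```

Whether this is actually faster than `std::vector<int>` for your data is exactly the kind of thing to measure rather than assume.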
But perhaps the most important advice is measure, don't guess!
Make sure that, after each attempted optimisation, you test to see whether it had a detrimental, negligible, or beneficial effect. A test suite of various data sets, and automating the process, will make this much easier. But, even if you have to do it manually, do so - you can't improve what you can't measure.
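If you have no benchmarking setup yet, a very small timing helper is enough to get started. This is only a sketch: `elapsedMicros` is a hypothetical name, it times a single run with `std::chrono::steady_clock`, and a serious measurement would repeat the run several times and use realistic data sets.

```cpp
#include <chrono>
#include <utility>

// Hypothetical helper: runs fn once and returns the elapsed wall-clock time
// in microseconds.
template <typename Fn>
long long elapsedMicros(Fn&& fn) {
    const auto start = std::chrono::steady_clock::now();
    std::forward<Fn>(fn)();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::microseconds>(stop - start).count();
}
```

For example, `elapsedMicros([&] { threeSum(data); })` would time one call on a vector `data` that you supply. Run it before and after each change and compare the numbers.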