I\'d like to reorder the items in a vector, using another vector to specify the order:
char A[] = { \'a\', \'b\', \'c\' };
size_t ORDER[] = { 1, 0, 2 }
I came up with this solution which has the space complexity of O(max_val - min_val + 1)
, but it can be integrated with std::sort
and benefits from std::sort
's O(n log n)
decent time complexity.
std::vector<int32_t> dense_vec = {1, 2, 3};
std::vector<int32_t> order = {1, 0, 2};
int32_t max_val = *std::max_element(dense_vec.begin(), dense_vec.end());
std::vector<int32_t> sparse_vec(max_val + 1);
int32_t i = 0;
for(int32_t j: dense_vec)
{
sparse_vec[j] = order[i];
i++;
}
std::sort(dense_vec.begin(), dense_vec.end(),
[&sparse_vec](int32_t i1, int32_t i2) {return sparse_vec[i1] < sparse_vec[i2];});
The following assumptions made while writing this code:
std::sort
Never prematurely optimize. Meassure and then determine where you need to optimize and what. You can end with complex code that is hard to maintain and bug-prone in many places where performance is not an issue.
With that being said, do not early pessimize. Without changing the code you can remove half of your copies:
template <typename T>
void reorder( std::vector<T> & data, std::vector<std::size_t> const & order )
{
std::vector<T> tmp; // create an empty vector
tmp.reserve( data.size() ); // ensure memory and avoid moves in the vector
for ( std::size_t i = 0; i < order.size(); ++i ) {
tmp.push_back( data[order[i]] );
}
data.swap( tmp ); // swap vector contents
}
This code creates and empty (big enough) vector in which a single copy is performed in-order. At the end, the ordered and original vectors are swapped. This will reduce the copies, but still requires extra memory.
If you want to perform the moves in-place, a simple algorithm could be:
template <typename T>
void reorder( std::vector<T> & data, std::vector<std::size_t> const & order )
{
for ( std::size_t i = 0; i < order.size(); ++i ) {
std::size_t original = order[i];
while ( i < original ) {
original = order[original];
}
std::swap( data[i], data[original] );
}
}
This code should be checked and debugged. In plain words the algorithm in each step positions the element at the i-th position. First we determine where the original element for that position is now placed in the data vector. If the original position has already been touched by the algorithm (it is before the i-th position) then the original element was swapped to order[original] position. Then again, that element can already have been moved...
This algorithm is roughly O(N^2) in the number of integer operations and thus is theoretically worse in performance time as compare to the initial O(N) algorithm. But it can compensate if the N^2 swap operations (worst case) cost less than the N copy operations or if you are really constrained by memory footprint.
This should avoid copying the vector:
void REORDER(vector<char>& vA, const vector<size_t>& vOrder)
{
assert(vA.size() == vOrder.size());
for(int i = 0; i < vOrder.size(); ++i)
if (i < vOrder[i])
swap(vA[i], vA[vOrder[i]]);
}
It appears to me that vOrder contains a set of indexes in the desired order (for example the output of sorting by index). The code example here follows the "cycles" in vOrder, where following a sub-set (could be all of vOrder) of indexes will cycle through the sub-set, ending back at the first index of the sub-set.
Wiki article on "cycles"
https://en.wikipedia.org/wiki/Cyclic_permutation
In the following example, every swap places at least one element in it's proper place. This code example effectively reorders vA according to vOrder, while "unordering" or "unpermuting" vOrder back to its original state (0 :: n-1). If vA contained the values 0 through n-1 in order, then after reorder, vA would end up where vOrder started.
template <class T>
void reorder(vector<T>& vA, vector<size_t>& vOrder)
{
assert(vA.size() == vOrder.size());
// for all elements to put in place
for( size_t i = 0; i < vA.size(); ++i )
{
// while vOrder[i] is not yet in place
// every swap places at least one element in it's proper place
while( vOrder[i] != vOrder[vOrder[i]] )
{
swap( vA[vOrder[i]], vA[vOrder[vOrder[i]]] );
swap( vOrder[i], vOrder[vOrder[i]] );
}
}
}
This can also be implemented a bit more efficiently using moves instead swaps. A temp object is needed to hold an element during the moves. Example C code, reorders A[] according to indexes in I[], also sorts I[] :
void reorder(int *A, int *I, int n)
{
int i, j, k;
int tA;
/* reorder A according to I */
/* every move puts an element into place */
/* time complexity is O(n) */
for(i = 0; i < n; i++){
if(i != I[i]){
tA = A[i];
j = i;
while(i != (k = I[j])){
A[j] = A[k];
I[j] = j;
j = k;
}
A[j] = tA;
I[j] = j;
}
}
}
Here is the correct code
void REORDER(vector<char>& vA, vector<size_t>& vOrder)
{
assert(vA.size() == vOrder.size());
// for all elements to put in place
for( int i = 0; i < va.size() - 1; ++i )
{
// while the element i is not yet in place
while( i != vOrder[i] )
{
// swap it with the element at its final place
int alt = vOrder[i];
swap( vA[i], vA[alt] );
swap( vOrder[i], vOrder[alt] );
}
}
}
note that you can save one test because if n-1 elements are in place the last nth element is certainly in place.
On exit vA and vOrder are properly ordered.
This algorithm performs at most n-1 swapping because each swap moves the element to its final position. And we'll have to do at most 2N tests on vOrder.
You ask if there is "a more efficient way". But what do you mean by efficient and what are your requirements?
Potatoswatter's answer works in O(N²) time with O(1) additional space and doesn't mutate the reordering vector.
chmike and rcgldr give answers which use O(N) time with O(1) additional space, but they achieve this by mutating the reordering vector.
Your original answer allocates new space and then copies data into it while Tim MB suggests using move semantics. However, moving still requires a place to move things to and an object like an std::string
has both a length variable and a pointer. In other words, a move-based solution requires O(N)
allocations for any objects and O(1)
allocations for the new vector itself. I explain why this is important below.
We might want that reordering vector! Sorting costs O(N log N). But, if you know you'll be sorting several vectors in the same way, such as in a Structure of Arrays (SoA) context, you can sort once and then reuse the results. This can save a lot of time.
You might also want to sort and then unsort data. Having the reordering vector allows you to do this. A use case here is for performing genomic sequencing on GPUs where maximal speed efficiency is obtained by having sequences of similar lengths processed in batches. We cannot rely on the user providing sequences in this order so we sort and then unsort.
So, what if we want the best of all worlds: O(N) processing without the costs of additional allocation but also without mutating our ordering vector (which we might, after all, want to reuse)? To find that world, we need to ask:
There are two reasons you might not want to allocate additional space.
The first is that you don't have much space to work with. This can occur in two situations: you're on an embedded device with limited memory. Usually this means you're working with small datasets, so the O(N²) solution is probably fine here. But it can also happen when you are working with really large datasets. In this case O(N²) is unacceptable and you have to use one of the O(N) mutating solutions.
The other reason extra space is bad is because allocation is expensive. For smaller datasets it can cost more than the actual computation. Thus, one way to achieve efficiency is to eliminate allocation.
When we mutate the ordering vector we are doing so as a way to indicate whether elements are in their permuted positions. Rather than doing this, we could use a bit-vector to indicate that same information. However, if we allocate the bit vector each time that would be expensive.
Instead, we could clear the bit vector each time by resetting it to zero. However, that incurs an additional O(N) cost per function use.
Rather, we can store a "version" value in a vector and increment this on each function use. This gives us O(1) access, O(1) clear, and an amoritzed allocation cost. This works similarly to a persistent data structure. The downside is that if we use an ordering function too often the version counter needs to be reset, though the O(N) cost of doing so is amortized.
This raises the question: what is the optimal data type for the version vector? A bit-vector maximizes cache utilization but requires a full O(N) reset after each use. A 64-bit data type probably never needs to be reset, but has poor cache utilization. Experimenting is the best way to figure this out.
We can view an ordering vector as having two senses: forward and backward. In the forward sense, the vector tell us where elements go to. In the backward sense, the vector tells us where elements are coming from. Since the ordering vector is implicitly a linked list, the backward sense requires O(N)
additional space, but, again, we can amortize the allocation cost. Applying the two senses sequentially brings us back to our original ordering.
Running single-threaded on my "Intel(R) Xeon(R) E-2176M CPU @ 2.70GHz", the following code takes about 0.81ms per reordering for sequences 32,767 elements long.
Fully commented code for both senses with tests:
#include <algorithm>
#include <cassert>
#include <random>
#include <stack>
#include <stdexcept>
#include <vector>
///@brief Reorder a vector by moving its elements to indices indicted by another
/// vector. Takes O(N) time and O(N) space. Allocations are amoritzed.
///
///@param[in,out] values Vector to be reordered
///@param[in] ordering A permutation of the vector
///@param[in,out] visited A black-box vector to be reused between calls and
/// shared with with `backward_reorder()`
template<class ValueType, class OrderingType, class ProgressType>
void forward_reorder(
std::vector<ValueType> &values,
const std::vector<OrderingType> &ordering,
std::vector<ProgressType> &visited
){
if(ordering.size()!=values.size()){
throw std::runtime_error("ordering and values must be the same size!");
}
//Size the visited vector appropriately. Since vectors don't shrink, this will
//shortly become large enough to handle most of the inputs. The vector is 1
//larger than necessary because the first element is special.
if(visited.empty() || visited.size()-1<values.size());
visited.resize(values.size()+1);
//If the visitation indicator becomes too large, we reset everything. This is
//O(N) expensive, but unlikely to occur in most use cases if an appropriate
//data type is chosen for the visited vector. For instance, an unsigned 32-bit
//integer provides ~4B uses before it needs to be reset. We subtract one below
//to avoid having to think too much about off-by-one errors. Note that
//choosing the biggest data type possible is not necessarily a good idea!
//Smaller data types will have better cache utilization.
if(visited.at(0)==std::numeric_limits<ProgressType>::max()-1)
std::fill(visited.begin(), visited.end(), 0);
//We increment the stored visited indicator and make a note of the result. Any
//value in the visited vector less than `visited_indicator` has not been
//visited.
const auto visited_indicator = ++visited.at(0);
//For doing an early exit if we get everything in place
auto remaining = values.size();
//For all elements that need to be placed
for(size_t s=0;s<ordering.size() && remaining>0;s++){
assert(visited[s+1]<=visited_indicator);
//Ignore already-visited elements
if(visited[s+1]==visited_indicator)
continue;
//Don't rearrange if we don't have to
if(s==visited[s])
continue;
//Follow this cycle, putting elements in their places until we get back
//around. Use move semantics for speed.
auto temp = std::move(values[s]);
auto i = s;
for(;s!=(size_t)ordering[i];i=ordering[i],--remaining){
std::swap(temp, values[ordering[i]]);
visited[i+1] = visited_indicator;
}
std::swap(temp, values[s]);
visited[i+1] = visited_indicator;
}
}
///@brief Reorder a vector by moving its elements to indices indicted by another
/// vector. Takes O(2N) time and O(2N) space. Allocations are amoritzed.
///
///@param[in,out] values Vector to be reordered
///@param[in] ordering A permutation of the vector
///@param[in,out] visited A black-box vector to be reused between calls and
/// shared with with `forward_reorder()`
template<class ValueType, class OrderingType, class ProgressType>
void backward_reorder(
std::vector<ValueType> &values,
const std::vector<OrderingType> &ordering,
std::vector<ProgressType> &visited
){
//The orderings form a linked list. We need O(N) memory to reverse a linked
//list. We use `thread_local` so that the function is reentrant.
thread_local std::stack<OrderingType> stack;
if(ordering.size()!=values.size()){
throw std::runtime_error("ordering and values must be the same size!");
}
//Size the visited vector appropriately. Since vectors don't shrink, this will
//shortly become large enough to handle most of the inputs. The vector is 1
//larger than necessary because the first element is special.
if(visited.empty() || visited.size()-1<values.size());
visited.resize(values.size()+1);
//If the visitation indicator becomes too large, we reset everything. This is
//O(N) expensive, but unlikely to occur in most use cases if an appropriate
//data type is chosen for the visited vector. For instance, an unsigned 32-bit
//integer provides ~4B uses before it needs to be reset. We subtract one below
//to avoid having to think too much about off-by-one errors. Note that
//choosing the biggest data type possible is not necessarily a good idea!
//Smaller data types will have better cache utilization.
if(visited.at(0)==std::numeric_limits<ProgressType>::max()-1)
std::fill(visited.begin(), visited.end(), 0);
//We increment the stored visited indicator and make a note of the result. Any
//value in the visited vector less than `visited_indicator` has not been
//visited.
const auto visited_indicator = ++visited.at(0);
//For doing an early exit if we get everything in place
auto remaining = values.size();
//For all elements that need to be placed
for(size_t s=0;s<ordering.size() && remaining>0;s++){
assert(visited[s+1]<=visited_indicator);
//Ignore already-visited elements
if(visited[s+1]==visited_indicator)
continue;
//Don't rearrange if we don't have to
if(s==visited[s])
continue;
//The orderings form a linked list. We need to follow that list to its end
//in order to reverse it.
stack.emplace(s);
for(auto i=s;s!=(size_t)ordering[i];i=ordering[i]){
stack.emplace(ordering[i]);
}
//Now we follow the linked list in reverse to its beginning, putting
//elements in their places. Use move semantics for speed.
auto temp = std::move(values[s]);
while(!stack.empty()){
std::swap(temp, values[stack.top()]);
visited[stack.top()+1] = visited_indicator;
stack.pop();
--remaining;
}
visited[s+1] = visited_indicator;
}
}
int main(){
std::mt19937 gen;
std::uniform_int_distribution<short> value_dist(0,std::numeric_limits<short>::max());
std::uniform_int_distribution<short> len_dist (0,std::numeric_limits<short>::max());
std::vector<short> data;
std::vector<short> ordering;
std::vector<short> original;
std::vector<size_t> progress;
for(int i=0;i<1000;i++){
const int len = len_dist(gen);
data.clear();
ordering.clear();
for(int i=0;i<len;i++){
data.push_back(value_dist(gen));
ordering.push_back(i);
}
original = data;
std::shuffle(ordering.begin(), ordering.end(), gen);
forward_reorder(data, ordering, progress);
assert(original!=data);
backward_reorder(data, ordering, progress);
assert(original==data);
}
}