Algorithm to determine if array contains n…n+m?


I saw this question on Reddit, and there were no positive solutions presented, and I thought it would be a perfect question to ask here. This was in a thread about interview

30 answers

走了就别回头了

2020-11-28 02:05
Assuming you know only the length of the array and are allowed to modify it, this can be done in O(1) space and O(n) time.

The process has two straightforward steps:
1. "Modulo sort" the array: [5,3,2,4] => [4,5,2,3] (O(2n)).
2. Check that each value's neighbor is one higher than itself, modulo the length (O(n)).

All told, you need at most 3 passes through the array.

The modulo sort is the 'tricky' part, but the objective is simple: take each value in the array and store it at its own address (modulo length). This requires one pass through the array, looping over each location and 'evicting' its value by swapping it to its correct location and moving in the value found at that destination. If you ever move in a value which is congruent to the value you just evicted, you have a duplicate and can exit early. Worst case, it's O(2n).

The check is a single pass through the array, comparing each value with its next-highest neighbor. Always O(n).

The combined algorithm is O(n) + O(2n) = O(3n) = O(n).

Pseudocode from my solution:
```
foreach(values[]) 
  while(values[i] not congruent to i)
    to-be-evicted = values[i]
    evict(values[i])   // swap to its 'proper' location
    if(values[i]%length == to-be-evicted%length)
      return false;  // a 'duplicate' arrived when we evicted that number
  end while
end foreach
foreach(values[] except the last)   // values[i+1] must exist
  if((values[i]+1)%length != values[i+1]%length)
    return false
end foreach
```
I've included a Java proof of concept below. It's not pretty, but it passes all the unit tests I made for it. I call these arrays a 'StraightArray' because they correspond to the poker hand of a straight (a contiguous sequence, ignoring suit).
```
public class StraightArray {    
    // Swap a[i] into its 'proper' slot (a[i] % a.length) and return the evicted value.
    static int evict(int[] a, int i) {
        int t = a[i];
        a[i] = a[t%a.length];
        a[t%a.length] = t;
        return t;
    }
    static boolean isStraight(int[] values) {
        for(int i = 0; i < values.length; i++) {
            while(values[i]%values.length != i) {
                int evicted = evict(values, i);
                if(evicted%values.length == values[i]%values.length) {
                    return false;
                }
            }
        }
        for(int i = 0; i < values.length-1; i++) {
            int n = (values[i]%values.length)+1;
            int m = values[(i+1)]%values.length;
            if(n != m) {
                return false;
            }
        }
        return true;
    }
}
```
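For comparison, the same modulo sort can be sketched in Python. This is only a minimal sketch: the name `is_straight` is illustrative, and the range check at the top is an addition that is not in the answer above. It is there because distinct residues alone would also accept an input such as [0, 5, 2, 3].

```python
def is_straight(values):
    """Return True if values holds len(values) consecutive integers in any order.

    Reorders values in place, like the swap-based version above.
    """
    length = len(values)
    if length == 0:
        return True
    # Added assumption (not in the original answer): the values must span
    # exactly `length` integers, otherwise distinct residues prove nothing.
    if max(values) - min(values) != length - 1:
        return False
    for i in range(length):
        # Keep evicting until the value stored at i is congruent to i.
        while values[i] % length != i:
            target = values[i] % length
            if values[target] % length == target:
                return False  # the destination already holds a congruent value
            values[i], values[target] = values[target], values[i]
    return True


data = [5, 3, 2, 4]
print(is_straight(data), data)        # True [4, 5, 2, 3]
print(is_straight([0, 5, 2, 3]))      # False
```

Each swap puts one value into its final slot, so the sorting loop performs at most n swaps in total.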
小蘑菇

2020-11-28 02:06
The product of m consecutive numbers is divisible by m! (m factorial).

So in one pass you can compute the product of the m numbers, also compute m!, and check whether the product modulo m! is zero at the end of the pass.

I might be missing something but this is what comes to my mind ...

Something like this in Python:
```
my_list1 = [9, 5, 8, 7, 6]
my_list2 = [3, 5, 4, 7]

def consecutive(my_list):
    count = 0
    prod = fact = 1
    for num in my_list:
        prod *= num
        count += 1
        fact *= count
    if not prod % fact:
        return 1
    else:
        return 0

print(consecutive(my_list1))
print(consecutive(my_list2))
```
HotPotato ~$ python m_consecutive.py

1

0
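Divisibility by m! is a necessary condition for m consecutive integers, but it does not appear to be sufficient. A quick Python check, assuming the same product test as above (the helper name `passes_product_test` is illustrative), shows inputs that pass without being consecutive:

```python
from math import factorial


def passes_product_test(nums):
    """Return True if the product of nums is divisible by len(nums)!."""
    prod = 1
    for num in nums:
        prod *= num
    return prod % factorial(len(nums)) == 0


print(passes_product_test([9, 5, 8, 7, 6]))  # True, and consecutive
print(passes_product_test([2, 4]))           # True, but not consecutive
print(passes_product_test([1, 2, 4, 6]))     # True, but not consecutive
```

So the product test can rule an array out, but a pass still needs to be confirmed by another check.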
轻奢々

2020-11-28 02:08
MY CURRENT BEST OPTION
```
def uniqueSet( array )
  check_index = 0; 
  check_value = 0; 
  min = array[0];
  array.each_with_index{ |value,index|
         check_index = check_index ^ ( 1 << index );
         check_value = check_value ^ ( 1 << value );
         min = value if value < min
  } 
  check_index =  check_index  << min;
  return check_index == check_value; 
end
```
O(n) time and O(1) space.

I wrote a script to brute-force combinations that could make this fail, and it didn't find any. If you have an array which contravenes this function, do tell. :)

@J.F. Sebastian

It's not a true hashing algorithm. Technically, it's a highly efficient packed boolean array of "seen" values.
```
ci = 0, cv = 0
[5,4,3]{ 
  i = 0 
  v = 5 
  1 << 0 == 000001
  1 << 5 == 100000
  0 ^ 000001  = 000001
  0 ^ 100000  = 100000

  i = 1
  v = 4 
  1 << 1 == 000010
  1 << 4 == 010000
  000001 ^ 000010  = 000011
  100000 ^ 010000  = 110000 

  i = 2
  v = 3 
  1 << 2 == 000100
  1 << 3 == 001000
  000011 ^ 000100  = 000111
  110000 ^ 001000  = 111000 
}
min = 3 
000111 << 3 == 111000
111000 === 111000
```
The point is mostly that, in order to "fake" most of the problem cases, one has to use duplicates. In this system, XOR penalises you for using the same value twice: the value is treated as if it had been used zero times.
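The XOR bookkeeping traced above can be sketched in Python, whose integers grow without bound much like Ruby's. This is a minimal sketch under the assumption that all values are non-negative; the name `unique_set` simply mirrors the Ruby method.

```python
def unique_set(array):
    """XOR-bitmask check: True if array holds len(array) consecutive values.

    Assumes a non-empty list of non-negative integers.
    """
    check_index = 0
    check_value = 0
    lowest = array[0]
    for index, value in enumerate(array):
        check_index ^= 1 << index   # one bit per position: 0 .. m-1
        check_value ^= 1 << value   # one bit per value; duplicates cancel out
        lowest = min(lowest, value)
    # Slide the block of m index bits up so that it starts at the minimum.
    return (check_index << lowest) == check_value


print(unique_set([5, 4, 3]))     # True
print(unique_set([3, 5, 4, 7]))  # False
print(unique_set([2, 2, 3]))     # False
```

Because the integers here can grow to max(array) + 1 bits, the space used is only O(1) under the infinite-precision register assumption discussed in this answer.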

The caveats here, of course, are:
1. Both the input array length and the maximum array value are limited by the maximum value of $x for which ( 1 << $x > 0 ).
2. Ultimate effectiveness depends on how your underlying system implements the abilities to:
  1. shift 1 bit n places left.
  2. xor 2 registers (where 'registers' may, depending on implementation, span several registers).

Edit: noted, the above statements seem confusing. Assume a perfect machine, where an "integer" is a register with infinite precision which can still perform a ^ b in O(1) time.

Failing these assumptions, one has to start asking about the algorithmic complexity of simple math.
- How complex is 1 == 1? Surely that should be O(1) every time, right?
- What about 2^32 == 2^32? O(1)?
- 2^33 == 2^33? Now you've got a question of register size and the underlying implementation.
- Fortunately, XOR and == can be done in parallel, so if one assumes infinite precision and a machine designed to cope with infinite precision, it is safe to assume XOR and == take constant time regardless of their value (because the register is infinitely wide, it will have infinite 0 padding). Obviously this doesn't exist. But also, changing 000000 to 000100 does not increase memory usage.
- Yet on some machines, ( 1 << 32 ) << 1 will consume more memory, but how much is uncertain.
盖世英雄少女心

2020-11-28 02:09
A while back I heard about a very clever sorting algorithm from someone who worked for the phone company. They had to sort a massive number of phone numbers. After going through a bunch of different sort strategies, they finally hit on a very elegant solution: they just created a bit array and treated the offset into the bit array as the phone number. They then swept through their database in a single pass, setting the bit for each number to 1. After that, they swept through the bit array once, spitting out the phone numbers for entries that had the bit set high.
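The bit-array sort described in that story could look roughly like the following Python sketch. The function name `bit_array_sort` and the `limit` parameter (one more than the largest possible number) are illustrative assumptions, and the sketch assumes the inputs are distinct non-negative integers below `limit`.

```python
def bit_array_sort(numbers, limit):
    """Sort distinct integers from range(limit) using one bit per possible value."""
    bits = bytearray((limit + 7) // 8)
    # First sweep: set the bit whose offset equals the number.
    for number in numbers:
        bits[number >> 3] |= 1 << (number & 7)
    # Second sweep: emit every offset whose bit is set, in increasing order.
    return [value for value in range(limit)
            if bits[value >> 3] & (1 << (value & 7))]


print(bit_array_sort([4021, 17, 9999, 256], 10000))  # [17, 256, 4021, 9999]
```

Note that a repeated number would simply set the same bit again, so duplicates collapse into one entry.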

Along those lines, I believe that you can use the data in the array itself as a meta data structure to look for duplicates. Worst case, you could have a separate array, but I'm pretty sure you can use the input array if you don't mind a bit of swapping.

I'm going to leave out the n parameter for the time being, because that just confuses things; adding in an index offset is pretty easy to do.

Consider:
```
for i = 0 to m
  if (a[a[i]]==a[i]) return false; // we have a duplicate
  while (a[a[i]] > a[i]) swapArrayIndexes(a[i], i)
  sum = sum + a[i]
next

if sum = (2*n + m - 1) * m / 2 return true else return false
```
This isn't O(n), it is probably closer to O(n log n), but it does provide for constant space and may provide a different vector of attack for the problem.

If we want O(n), then a bit array and some bit operations will provide the duplication check with roughly m/8 bytes of extra memory (one bit per element, i.e. m/32 ints, assuming 32-bit ints, of course).

EDIT: The above algorithm could be improved further by adding the sum check to the inside of the loop, and checking for:
```
if sum > (2*n + m - 1) * m / 2 return false
```
that way it will fail fast.
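The bit-array variant mentioned in this answer might be sketched in Python as follows. This is only an illustration under assumptions: the function name `is_consecutive_bitset` is made up here, and `n` (the expected smallest value) is passed in by the caller.

```python
def is_consecutive_bitset(a, n):
    """Return True if a is a permutation of n, n+1, ..., n+len(a)-1."""
    m = len(a)
    seen = bytearray((m + 7) // 8)  # one bit per expected value
    for value in a:
        offset = value - n
        if offset < 0 or offset >= m:
            return False  # value lies outside [n, n+m)
        byte, mask = offset >> 3, 1 << (offset & 7)
        if seen[byte] & mask:
            return False  # this value was already seen: duplicate
        seen[byte] |= mask
    return True


print(is_consecutive_bitset([7, 5, 6, 8], 5))  # True
print(is_consecutive_bitset([7, 5, 6, 6], 5))  # False
```

This trades the constant-space goal for a single read-only pass over the input.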

北荒

2020-11-28 02:09

A C version of Kent Fredric's Ruby solution

(to facilitate testing)

Counter-example (for C version): {8, 33, 27, 30, 9, 2, 35, 7, 26, 32, 2, 23, 0, 13, 1, 6, 31, 3, 28, 4, 5, 18, 12, 2, 9, 14, 17, 21, 19, 22, 15, 20, 24, 11, 10, 16, 25}. Here n=0, m=35. This sequence misses 34 and has two 2.

It is an O(m) in time and O(1) in space solution.

Out-of-range values are easily detected in linear time and O(1) space, so the tests concentrate on in-range sequences (all values in the valid range [n, n+m)). Otherwise {1, 34} is a counter-example (for the C version, with sizeof(int)==4 and the standard binary representation of numbers).

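The main difference between the C and Ruby versions: the << operator effectively wraps the shift count in C because of the finite sizeof(int) (strictly speaking, shifting by the width of the type or more is undefined behavior in C; the result shown below is what typical x86 hardware produces), whereas in Ruby numbers grow to accommodate the result, e.g.,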

Ruby: 1 << 100 # -> 1267650600228229401496703205376

C: int n = 100; 1 << n // -> 16

In Ruby: check_index ^= 1 << i; is equivalent to check_index.setbit(i). The same effect could be implemented in C++: vector<bool> v(m); v[i] = true;
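To see why the finite width matters, the wrap-around can be emulated in Python. This is a sketch under a loud assumption: `shl32` mimics hardware that takes the shift count modulo 32 and truncates to 32 bits (as x86 does), which ISO C does not guarantee. Both function names are illustrative.

```python
def shl32(x, count):
    """Emulated 32-bit left shift (assumption: shift count taken modulo 32)."""
    return (x << (count % 32)) & 0xFFFFFFFF


def unique_set_32bit(array):
    """The XOR-bitmask check, restricted to emulated 32-bit integers."""
    check_index = 0
    check_value = 0
    lowest = array[0]
    for index, value in enumerate(array):
        check_index ^= shl32(1, index)
        check_value ^= shl32(1, value)
        lowest = min(lowest, value)
    return shl32(check_index, lowest) == check_value


print(shl32(1, 100))               # 16
print(unique_set_32bit([1, 34]))   # True, although 1 and 34 are not consecutive
print(unique_set_32bit([5, 4, 3])) # True
```

The bit for 34 lands on the same position as the bit for 2, which is how the out-of-range counter-example {1, 34} slips through.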

```
bool isperm_fredric(int m; int a[m], int m, int n)
{
  /**
     O(m) in time (single pass), O(1) in space,
     no restriction on n,
     ?overflow?
     a[] may be readonly
   */
  int check_index = 0;
  int check_value = 0;

  int min = a[0];
  for (int i = 0; i < m; ++i) {

    check_index ^= 1 << i;
    check_value ^= 1 << (a[i] - n); //

    if (a[i] < min)
      min = a[i];
  }
  check_index <<= min - n; // min and n may differ e.g., 
                           //  {1, 1}: min=1, but n may be 0.
  return check_index == check_value;
}
```

Values of the above function were tested against the following code:

```
bool *seen_isperm_trusted  = NULL;
bool isperm_trusted(int m; int a[m], int m, int n)
{
  /** O(m) in time, O(m) in space */

  for (int i = 0; i < m; ++i) // could be memset(s_i_t, 0, m*sizeof(*s_i_t));
    seen_isperm_trusted[i] = false;

  for (int i = 0; i < m; ++i) {

    if (a[i] < n || a[i] >= n + m)
      return false; // out of range

    if (seen_isperm_trusted[a[i]-n])
      return false; // duplicates
    else
      seen_isperm_trusted[a[i]-n] = true;
  }

  return true; // a[] is a permutation of the range: [n, n+m)
}
```

Input arrays are generated with:

```
void backtrack(int m; int a[m], int m, int nitems)
{
  /** generate all permutations with repetition for the range [0, m) */
  if (nitems == m) {
    (void)test_array(a, nitems, 0); // {0, 0}, {0, 1}, {1, 0}, {1, 1}
  }
  else for (int i = 0; i < m; ++i) {
      a[nitems] = i;
      backtrack(a, m, nitems + 1);
    }
}
```
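The same exhaustive comparison can be sketched in Python, where `itertools.product` plays the role of `backtrack`. The function names below are illustrative, the reference check uses sorting for brevity rather than a seen-array, and the XOR check relies on Python's unbounded integers, so it behaves like the Ruby version rather than the C one.

```python
from itertools import product


def is_perm_trusted(a, n):
    """Reference check: a is a permutation of range(n, n + len(a))."""
    return sorted(a) == list(range(n, n + len(a)))


def is_perm_xor(a, n):
    """XOR-bitmask check using unbounded integers."""
    check_index = check_value = 0
    for i, value in enumerate(a):
        check_index ^= 1 << i
        check_value ^= 1 << (value - n)
    return (check_index << (min(a) - n)) == check_value


def find_mismatches(m):
    """Try every length-m array over range(m) and collect the disagreements."""
    return [a for a in product(range(m), repeat=m)
            if is_perm_xor(a, 0) != is_perm_trusted(a, 0)]


print(find_mismatches(4))  # []
```

With unbounded integers the two checks agree on every in-range input of this size; the counter-examples in this answer come from the fixed-width C arithmetic.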


遇见更好的自我

2020-11-28 02:10

```
#include <iostream>
#include <vector>
using namespace std;

bool determineContinuousArray(int *arr, int len)
{
    // Suppose the array is like below:
    //int arr[10] = {7,11,14,9,8,100,12,5,13,6};
    //int len = sizeof(arr)/sizeof(int);

    // Find the minimum first so that every offset (cur - n) uses the same base.
    int n = arr[0];
    for (int i = 1; i < len; i++)
            if (arr[i] < n)
                    n = arr[i];

    // filled[k] records whether the value n + k has already been seen.
    vector<bool> filled(len, false);
    for (int i = 0; i < len; i++)
    {
            int cur = arr[i];
            if (cur - n >= len){
                    cout << "array index out of range: meaning this is not a valid array" << endl;
                    return false;
            }
            if (filled[cur - n]){
                    cout << "found duplicate number " << cur << endl;
                    return false;
            }
            filled[cur - n] = true;
    }
    cout << "this is a valid array" << endl;
    for (int j = 0; j < len; j++)
            cout << n + j << "," ;
    cout << endl;
    return true;
}
```
