So for example, the answer for the array:
1, 11, 3, 95, 23, 8, 1
would be 1, since all the other elements only occur once while 1 occurs twice.
If you want fixed space to find the most common element, you need a maximum number of bits per element. Otherwise, large input arrays could contain numbers so large that the bits needed to represent them exceed your fixed space for storing the result.
Suppose k is the number of bits of the largest number you support. If you naively create an array of 2^k buckets to count occurrences of each number (counting sort), you could receive an input consisting entirely of the same number, in which case your algorithm would need log(n) space just to store that count.[*]
If we look at a simpler version of the problem (determine whether there are more 1's or 0's in the input), I think you need a growing counter to do this (you store how much 1 or 0 is leading by, which can take up to log(n) bits), and so constant space just isn't possible, even if we limit elements to k = 1 bit in size.
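As a quick illustration of that simpler version, here is a sketch of the "how much is 1 leading by" counter in PHP (the function name and details are mine, not from the answer); the point is that $lead can reach +n or -n, so it alone needs about log(n) bits:

<?php
// Decide whether an array of bits contains more 1's than 0's by tracking
// how far ahead the 1's are at any point.
function moreOnesThanZeros(array $bits) {
    $lead = 0; // how much 1 is leading by (negative means 0 is leading)
    foreach ($bits as $b) {
        $lead += ($b === 1) ? 1 : -1;
    }
    return $lead > 0; // ties count as "no"
}

var_dump(moreOnesThanZeros(array(1, 0, 1, 1, 0))); // bool(true)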
Your problem is more general (k > 1, but still fixed) and would also need non-constant space, so it's not possible as the question is worded.
[*] If you assume counters have O(1) space complexity, then you can take the counting sort approach, although doing so places an upper bound on the maximum size of your input array (which may or may not be acceptable). With k the maximum number of bits for an input element and c the maximum number of bits in a counter, your array can have at most 2^k * 2^c elements (otherwise one of the counters would overflow on the next element). To address this, you could add an O(1)-time step after each element is processed: whenever all counters are non-zero, decrement every counter by 1 so that the minimum value is again 0, thereby making the counts relative instead of absolute. This takes O(1) time because, if you perform it on each element, you only ever decrement O(2^k) = O(1) counters by 1. While the algorithm can now process some arbitrarily large inputs, it still fails on any input containing a sub-array in which the counts of two values a and b differ by more than 2^c = max(counter). In particular, a consequence of relying on O(1)-space counters is that any array starting with 2^c + 1 identical elements cannot be handled by this algorithm.
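To make the footnote concrete, here is a rough PHP sketch of the relative-counter variant of counting sort (the function name, the explicit value bound, and the use of ordinary PHP integers rather than c-bit counters are my own simplifications):

<?php
// One counter per possible value 0..$maxValue (i.e. 2^k buckets).
// Whenever every counter is non-zero, shift them all down by 1 so the
// minimum is 0 again; differences between counters (which decide the
// winner) are preserved, but each counter stays smaller for longer.
function mostCommonWithRelativeCounters(array $input, int $maxValue) {
    $counters = array_fill(0, $maxValue + 1, 0);

    foreach ($input as $x) {
        $counters[$x]++;
        if (min($counters) > 0) {
            foreach ($counters as $i => $c) {
                $counters[$i] = $c - 1;
            }
        }
    }

    // The bucket with the largest (relative) count is the most common element.
    return array_search(max($counters), $counters);
}

echo mostCommonWithRelativeCounters(array(1, 11, 3, 95, 23, 8, 1), 127); // prints 1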
This is not a complete answer, but it should help shed some light on why this problem is difficult.
Suppose we want to design an algorithm that does one sweep over the array (in some order) to find the most common element. During its run, the algorithm is allowed to keep some data structure S. Let's see how much information S has to contain, and thus whether it can fit in O(1) memory.
Say our algorithm has processed the first k elements of the array. Now S can tell us the most common element in the range a[0..k]. Moreover, if we then saw the (k+1)-st element, S would also have to let us determine the most common element in the range a[0..k+1]; if it couldn't, our algorithm wouldn't work when n = k+1. More generally, given S and knowledge of the elements a[k..m], we know the most common element in a[0..m].
We can use the above argument to extract information from S. Say we are working with integers in the range [0,u] (there has to be some such range if the original array took space O(n)). If the current most common element is 5, we append 0's until the most common element changes. If that takes c zeroes, a[0..k] must have contained c more 5's than 0's. Repeating this argument gives us a system of linear equations that we can solve to tell exactly how many times each element of [0,u] was present in a[0..k].
This tells us that any data structure used for a single sweep might as well store the counts of all the elements seen so far (in some compressed way). If you're interested in the maths, the amount of information stored after seeing n numbers is log(n+u-1 choose n), which is the log of the number of ways to partition n indistinguishable items into u distinguishable bins. That's more than log(u^n/n!) >= n log u - n log n.
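In case those inequalities aren't obvious, here is the estimate spelled out (my own restatement; it only uses that each multiset of size n corresponds to at most n! orderings and that n! <= n^n):

    log(n+u-1 choose n) >= log(u^n / n!)        (u^n sequences in total, at most n! per multiset)
                         = n log u - log(n!)
                        >= n log u - n log n     (since n! <= n^n)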
Conclusion: Any algorithm that does only one pass over the array will have to use as much memory as it takes to store all the counts seen so far. If n is small compared to u, this corresponds to storing n words of memory. (Well, instead of extra memory we might also overwrite the existing array.)
There's a lot more to explore here, e.g. how multiple passes affect the above arguments, but I think I should stop at this point :). It doesn't seem likely to me that any linear-time algorithm with a large u will be able to get away with O(1) extra memory.
This is my script to find the most common element in an array:
<?php

class TestClass {

    public $keyVal;       // how many times the most common element occurs
    public $keyPlace = 0; // index of its first occurrence in the array

    public function maxused_num($array) {
        $n = count($array);
        $counts = array();

        // For each position, count how many times its value occurs (O(n^2)).
        for ($i = 0; $i < $n; $i++) {
            $r = 0;
            for ($j = 0; $j < $n; $j++) {
                if ($array[$i] == $array[$j]) {
                    $r = $r + 1;
                }
            }
            $counts[$i] = $r;
        }

        // Find the highest count.
        $max = 0;
        for ($i = 0; $i < $n; $i++) {
            if ($counts[$i] > $max) {
                $max = $counts[$i];
            }
        }

        // Find the first position that reaches the highest count.
        for ($i = 0; $i < $n; $i++) {
            if ($counts[$i] == $max) {
                $this->keyVal = $counts[$i]; // number of repeats
                $this->keyPlace = $i;        // place in the array
                break;
            }
        }

        return $array[$this->keyPlace];
    }
}
$catch = new TestClass();
$array = array(1, 1, 1, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 3, 1, 2, 3, 1, 1, 2, 5, 7, 1, 9, 0, 11, 22, 1, 1, 22, 22, 35, 66, 1, 1, 1);
echo $catch->maxused_num($array);
Use the idea from here:
How can we find a repeated number in array in O(n) time and O(1) space complexity
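For reference, here is a small PHP sketch of the bucket-counting approach described in this answer (the bound N on element values and the assumption of non-negative integer inputs are mine):

<?php
// Counting-sort-style buckets: one bin per possible value 0..$N.
// $N is fixed in advance, so the bin array is O(1) space with respect to the
// input length (though it is of course O(N) space in the value bound).
function mostCommonValue(array $input, int $N) {
    $bins = array_fill(0, $N + 1, 0);
    foreach ($input as $value) {
        $bins[$value]++; // one O(n) pass over the input
    }
    // Scan the fixed number of bins for the largest count; its index is the answer.
    return array_search(max($bins), $bins);
}

echo mostCommonValue(array(1, 11, 3, 95, 23, 8, 1), 100); // prints 1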
And apply a technique similar to counting sort. That is, create N bins (an array of size N), where N is the largest integer you expect to encounter. This is still O(1) space. Then, iterate through the original array in O(n) time, and when you encounter a value i, increment your results array at index i by 1. Then, iterate through the results array (again O(1) time), finding the largest single value. The index of that value will be the most common value in the original list.