Yesterday I was pairing the socks from the clean laundry and figured out the way I was doing it is not very efficient. I was doing a naive search — picking one sock and
From your question it is clear you don't have much actual experience with laundry :). You need an algorithm that works well with a small number of non-pairable socks.
The answers so far don't make good use of our human pattern recognition capabilities. The game of Set provides a clue to how to do this well: put all socks in a two-dimensional space so you can both recognize them well and easily reach them with your hands. This limits you to an area of about 120 * 80 cm or so. From there select the pairs you recognize and remove them. Put extra socks in the free space and repeat. If you wash for people with easily recognizable socks (small kids come to mind), you can do a radix sort by selecting those socks first. This algorithm works well only when the number of single socks is low.
When I sort socks, I do an approximate radix sort, dropping socks near other socks of the same colour/pattern type. The exception is when I can see an exact match at or near the spot where I'm about to drop the sock: in that case I extract the pair right away.
Almost all the other algorithms (including the top scoring answer by usr) sort, then remove pairs. I find that, as a human, it is better to minimize the number of socks being considered at one time.
I do this by:
This takes advantage of the human ability to fuzzy-match in O(1) time, which is somewhat equivalent to the establishment of a hash-map on a computing device.
By pulling the distinctive socks first, you leave space to "zoom" in on the features which are less distinctive, to begin with.
After eliminating the fluro coloured, the socks with stripes, and the three pairs of long socks, you might end up with mostly white socks roughly sorted by how worn they are.
At some point, the differences between socks are small enough that other people won't notice the difference, and any further matching effort is not needed.
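As a rough sketch of "pull the distinctive socks first", the snippet below treats each sock as a hashable "look" label and processes the rarest looks first, so the large group of near-identical socks is handled last. The function name `pull_distinctive_first` and the idea of using rarity as a stand-in for distinctiveness are assumptions made for illustration, not part of the answer above.

```python
from collections import Counter

def pull_distinctive_first(pile):
    """Hypothetical helper: pair socks look by look, rarest looks first."""
    counts = Counter(pile)
    # Rarity is used here as a proxy for "distinctive" (an assumption).
    # sorted() is stable, so ties keep their order of first appearance.
    order = sorted(counts, key=lambda look: counts[look])
    pairs, leftovers = [], []
    for look in order:
        n = counts[look]
        pairs.extend([(look, look)] * (n // 2))
        if n % 2:
            leftovers.append(look)  # an odd sock with no partner
    return order, pairs, leftovers

pile = ["white"] * 6 + ["fluoro", "stripes", "fluoro", "stripes", "long"]
order, pairs, leftovers = pull_distinctive_first(pile)
print(order)      # looks in the order they get pulled from the pile
print(leftovers)  # socks that stay unpaired
```

A real pile would need a fuzzier notion of "look" than exact equality; the point is only that the pile being scanned shrinks before the hard-to-distinguish socks are considered.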
My solution does not exactly correspond to your requirements, as it formally requires O(n) "extra" space. However, considering my conditions, it is very efficient in my practical application. Thus I think it should be interesting.
The special condition in my case is that I don't use a drying machine; I just hang my clothes on an ordinary clothes dryer. Hanging clothes requires O(n) operations (by the way, I always consider the bin packing problem here) and the problem by its nature requires the linear "extra" space. When I take a new sock from the bucket, I try to hang it next to its pair if the pair is already hung. If it's a sock from a new pair, I leave some space next to it.
It obviously requires some extra work to check whether the matching sock is already hanging somewhere, and it would render the solution O(n^2) with a coefficient of about 1/2 for a computer. But in this case the "human factor" is actually an advantage -- I can usually identify the matching sock very quickly (almost O(1)) if it was already hung (probably some imperceptible in-brain caching is involved) -- consider it a kind of limited "oracle" as in Oracle Machine ;-) We, the humans, have these advantages over digital machines in some cases ;-)
O(n)!
Thus, connecting the problem of pairing socks with the problem of hanging clothes, I get O(n) "extra space" for free, and have a solution that is about O(n) in time, requires just a little more work than simply hanging clothes, and allows immediate access to a complete pair of socks even on a very bad Monday morning... ;-)
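A minimal sketch of the hanging procedure, assuming each sock can be reduced to a hashable label so that a dictionary can play the role of the almost-O(1) human lookup. The names `hang_and_pair`, `waiting` and `dryer` are made up for this illustration.

```python
def hang_and_pair(bucket):
    """Hang socks one by one, placing each next to its partner if one is waiting."""
    waiting = {}  # sock label -> dryer slot holding an unmatched sock of that label
    dryer = []    # each slot is [first_sock, partner_or_None]
    for sock in bucket:
        if sock in waiting:
            # Partner already hangs somewhere: fill the space left next to it.
            slot = waiting.pop(sock)
            dryer[slot][1] = sock
        else:
            # First sock of a new pair: hang it and leave space beside it.
            waiting[sock] = len(dryer)
            dryer.append([sock, None])
    pairs = [tuple(slot) for slot in dryer if slot[1] is not None]
    singles = [slot[0] for slot in dryer if slot[1] is None]
    return pairs, singles

pairs, singles = hang_and_pair(["a", "b", "a", "c", "b"])
print(pairs)    # [('a', 'a'), ('b', 'b')]
print(singles)  # ['c']
```

Each sock costs one dictionary lookup, so the pass is linear in the number of socks on average, and the dryer itself is the O(n) extra space.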
Case 1: All socks are identical (this is what I do in real life by the way).
Pick any two of them to make a pair. Constant time.
Case 2: There are a constant number of combinations (ownership, color, size, texture, etc.).
Use radix sort. This is only linear time since comparison is not required.
Case 3: The number of combinations is not known in advance (general case).
We have to do comparison to check whether two socks come in pair. Pick one of the O(n log n) comparison-based sorting algorithms.
However in real life when the number of socks is relatively small (constant), these theoretically optimal algorithms wouldn't work well. It might take even more time than sequential search, which theoretically requires quadratic time.
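Cases 2 and 3 can be sketched as follows. This is only an illustration under stated assumptions: in Case 2 each sock is a hashable tuple of attributes, and bucketing by that tuple stands in for the radix sort; in Case 3 a three-way `compare` function defining a total order is assumed to exist. Both function names are hypothetical.

```python
from collections import defaultdict
from functools import cmp_to_key

def pair_by_buckets(socks):
    """Case 2: bucket socks by their attribute tuple, then pair inside each bucket."""
    buckets = defaultdict(list)
    for sock in socks:
        buckets[sock].append(sock)
    pairs, singles = [], []
    for group in buckets.values():
        for i in range(0, len(group) - 1, 2):
            pairs.append((group[i], group[i + 1]))
        if len(group) % 2:
            singles.append(group[-1])
    return pairs, singles

def pair_by_sorting(socks, compare):
    """Case 3: sort with a comparison function, then pair equal neighbours."""
    ordered = sorted(socks, key=cmp_to_key(compare))
    pairs, singles = [], []
    i = 0
    while i < len(ordered):
        if i + 1 < len(ordered) and compare(ordered[i], ordered[i + 1]) == 0:
            pairs.append((ordered[i], ordered[i + 1]))
            i += 2
        else:
            singles.append(ordered[i])
            i += 1
    return pairs, singles

socks = [("red", "M"), ("blue", "L"), ("red", "M"), ("blue", "L"), ("green", "S")]
print(pair_by_buckets(socks))
print(pair_by_sorting(socks, lambda a, b: (a > b) - (a < b)))
```

The bucket version never compares two socks directly, which is why it stays linear; the sorting version only needs `compare`, at the price of O(n log n) comparisons.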
In order to say how efficient it is to pair socks from a pile, we have to define the machine first, because the pairing is done neither by a Turing machine nor by a random-access machine, which are normally used as the basis for an algorithmic analysis.
The machine is an abstraction of a real-world element called a human being. It is able to read from the environment via a pair of eyes. And our machine model is able to manipulate the environment by using 2 arms. Logical and arithmetic operations are calculated using our brain (hopefully ;-)).
We also have to consider the intrinsic runtime of the atomic operations that can be carried out with these instruments. Due to physical constraints, operations which are carried out by an arm or eye have non constant time complexity. This is because we can't move an endlessly large pile of socks with an arm nor can an eye see the top sock on an endlessly large pile of socks.
However, mechanical physics gives us some goodies as well. We are not limited to moving at most one sock with an arm. We can move a whole couple of them at once.
So depending on the previous analysis, the following operations should be used in descending order:
We can also make use of the fact that people only have a very limited number of socks. So an environmental modification can involve all socks in the pile.
So here is my suggestion:
Operation 4 is necessary, because when spreading socks over the floor some socks may hide others. Here is the analysis of the algorithm:
The algorithm terminates with high probability. This is due to the fact that one is unable to find pairs of socks in step number 2.
For the following runtime analysis of pairing n pairs of socks, we suppose that at least half of the 2n socks aren't hidden after step 1. So in the average case we can find n/2 pairs. This means that the loop in step 4 is executed O(log n) times. Step 2 is executed O(n^2) times. So we can conclude:
O(ln n + n) environmental modifications (step 1 O(ln n) plus picking every pair of socks from the floor)
O(n^2) environmental reads from step 2
O(n^2) logical and arithmetic operations for comparing a sock with another in step 2
So we have a total runtime complexity of O(r*n^2 + w*(ln n + n)), where r and w are the factors for environmental read and environmental write operations respectively, for a reasonable amount of socks. The cost of the logical and arithmetical operations is omitted, because we suppose that it takes a constant amount of logical and arithmetical operations to decide whether 2 socks belong to the same pair. This may not be feasible in every scenario.
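The bounds above can be checked with a short worked derivation. It assumes, as a simplification of "at least half of the socks aren't hidden", that every pass of the loop removes half of the remaining pairs; the symbols n_k (pairs left after k passes) and K (number of passes) are introduced here only for this sketch.

```latex
% Pairs left after k passes, assuming each pass pairs up half of what remains:
n_k = \frac{n}{2^k}, \qquad n_K < 1 \iff K > \log_2 n
\quad\Rightarrow\quad K = O(\log n) \text{ passes of the loop.}

% Comparisons in pass k are at most all pairs among the 2 n_k remaining socks:
\binom{2 n_k}{2} \le 2 n_k^2 = \frac{2 n^2}{4^k}

% Summing over all passes gives a geometric series:
\sum_{k=0}^{K} \frac{2 n^2}{4^k} \le 2 n^2 \cdot \frac{1}{1 - \tfrac14} = \frac{8}{3}\, n^2 = O(n^2)
```

So the logarithmic number of passes only affects the w term, while the quadratic read cost is dominated by the first pass.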
I came up with another solution, which promises neither fewer operations nor less time consumption, but it should be tried to see whether it is a good-enough heuristic to save time in huge series of sock pairing.
Preconditions: There is no guarantee that any two socks are the same. If they are of the same color, it doesn't mean they have the same size or pattern. Socks are randomly shuffled. There can be an odd number of socks (some are missing, we don't know how many). Prepare to remember a variable "index" and set it to 0.
The result will have one or two piles: 1. "matched" and 2. "missing"
Heuristic:
A check for damaged socks, along with their removal, could also be added. It could be inserted between 2 and 3, and between 13 and 14.
I'm looking forward to hearing about any experiences or corrections.