An interview question: About Probability

清酒与你 2021-01-29 21:14

An interview question:

Given a function f(x) that returns 0 with probability 1/4 and 1 with probability 3/4, write a function g(x) using f(x) that returns 0 with probability 1/2 and 1 with probability 1/2.

10 answers

天涯浪人 (original poster)

2021-01-29 21:31
A refinement of the same approach used in btilly's answer, achieving an average of ~1.85 calls to f() per g() result (a further refinement documented below achieves ~1.75; btilly's answer needs ~2.6, and Jim Lewis's accepted answer ~5.33). The code appears lower in the answer.

Basically, I generate random integers in the range 0 to 3 with equal probability: the caller can then test bit 0 for the first 50/50 value and bit 1 for a second. The reason: the f() probabilities of 1/4 and 3/4 map onto quarters much more cleanly than onto halves.

Description of algorithm

btilly explained the algorithm, but I'll do so in my own way too...

The algorithm basically generates a random real number x between 0 and 1, then returns a result depending on which "result bucket" that number falls in:
```
result bucket      result
         x < 0.25     0
 0.25 <= x < 0.5      1
 0.5  <= x < 0.75     2
 0.75 <= x            3
```
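A minimal sketch of the bucket lookup in the table above, assuming x is already available as a real number in [0, 1). The name `bucket_of` is made up for illustration and does not appear in the answer.

```cpp
#include <cassert>

// Hypothetical helper: map a real x in [0, 1) to its result bucket 0..3.
int bucket_of(double x)
{
    assert(x >= 0.0 && x < 1.0);       // outside this range the table does not apply
    return static_cast<int>(x * 4.0);  // each bucket covers one quarter of [0, 1)
}
```

For instance, `bucket_of(0.4375)` falls in the 0.25 <= x < 0.5 row and gives 1.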
But generating a random real number given only f() is difficult. We have to start with the knowledge that our x value should be in the range 0..1, which we'll call our initial "possible x" space. We then home in on an actual value for x:
- each time we call f():
  - if f() returns 0 (probability 1 in 4), we consider x to be in the lower quarter of the "possible x" space, and eliminate the upper three quarters from that space
  - if f() returns 1 (probability 3 in 4), we consider x to be in the upper three-quarters of the "possible x" space, and eliminate the lower quarter from that space
  - when the "possible x" space is completely contained by a single result bucket, that means we've narrowed x down to the point where we know which result value it should map to and have no need to get a more specific value for x.
It may or may not help to consider this diagram :-):
```
    "result bucket" cut-offs 0,.25,.5,.75,1

    0=========0.25=========0.5==========0.75=========1 "possible x" 0..1
    |           |           .             .          | f() chooses x < vs >= 0.25
    |  result 0 |------0.4375-------------+----------| "possible x" .25..1
    |           | result 1| .             .          | f() chooses x < vs >= 0.4375
    |           |         | .  ~0.58      .          | "possible x" .4375..1
    |           |         | .    |        .          | f() chooses < vs >= ~.58
    |           |         ||.    |    |   .          | 4 distinct "possible x" ranges
```
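To make the narrowing concrete, here is a small sketch that replays a fixed sequence of f() results and reports which bucket, if any, the "possible x" space has settled into. The function name `resolve` and the -1 "undecided" return value are assumptions for illustration; the space is treated as the half-open interval [low, high).

```cpp
#include <cassert>
#include <vector>

// Hypothetical tracer: apply recorded f() results (0 or 1) in order and return
// the bucket 0..3 as soon as [low, high) fits inside one, or -1 if the
// recorded results run out first.
int resolve(const std::vector<int>& outcomes)
{
    double low = 0.0, high = 1.0;              // initial "possible x" space
    for (int r : outcomes)
    {
        assert(r == 0 || r == 1);
        double cutoff = low + 0.25 * (high - low);
        if (r == 0) high = cutoff;             // keep the lower quarter
        else        low = cutoff;              // keep the upper three quarters

        for (int b = 0; b < 4; ++b)            // bucket b covers [b/4, (b+1)/4)
            if (low >= 0.25 * b && high <= 0.25 * (b + 1))
                return b;
    }
    return -1;                                 // still spans a bucket cut-off
}
```

For instance, `resolve({1, 1})` leaves the space at 0.4375..1, which still spans the 0.5 and 0.75 cut-offs, so it returns -1; `resolve({1, 0})` narrows to 0.25..0.4375 and returns 1.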
Code
```
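int f(); // provided: returns 0 with probability 1/4, 1 with probability 3/4

int g() // return 0, 1, 2, or 3, each with probability 1/4
{
    if (f() == 0) return 0;  // x in [0, 0.25): bucket 0
    if (f() == 0) return 1;  // x in [0.25, 0.4375): inside bucket 1
    double low = 0.25 + 0.25 * (1.0 - 0.25);  // 0.4375 after two f() == 1 results
    double high = 1.0;

    while (true)
    {
        // split the remaining "possible x" space at its lower quarter
        double cutoff = low + 0.25 * (high - low);
        if (f() == 0)
            high = cutoff;  // keep the lower quarter
        else
            low = cutoff;   // keep the upper three quarters

        // stop once the space lies inside a single result bucket
        if (high < 0.50) return 1;
        if (low >= 0.75) return 3;
        if (low >= 0.50 && high < 0.75) return 2;
    }
}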
```
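One way to sanity-check g() is to run it against a simulated f() and look at the result frequencies and the number of f() calls. The sketch below is only a test harness under assumptions: `BiasedSource`, `Stats` and `simulate` are hypothetical names, f() is simulated with `std::mt19937`, and g() is copied from above with the source passed in as a parameter.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <random>

// Hypothetical stand-in for the interview's f(): 0 with probability 1/4,
// otherwise 1. It also counts how often it is called.
struct BiasedSource
{
    std::mt19937 rng;
    long calls = 0;
    explicit BiasedSource(std::uint32_t seed) : rng(seed) {}
    int f()
    {
        ++calls;
        return (rng() & 3u) == 0 ? 0 : 1;  // low two bits are 00 one time in four
    }
};

// Same logic as g() above, reading from the stand-in source.
int g(BiasedSource& src)
{
    if (src.f() == 0) return 0;
    if (src.f() == 0) return 1;
    double low = 0.25 + 0.25 * (1.0 - 0.25);
    double high = 1.0;
    while (true)
    {
        double cutoff = low + 0.25 * (high - low);
        if (src.f() == 0) high = cutoff; else low = cutoff;
        if (high < 0.50) return 1;
        if (low >= 0.75) return 3;
        if (low >= 0.50 && high < 0.75) return 2;
    }
}

struct Stats
{
    std::array<double, 4> freq;  // observed share of each result 0..3
    double calls_per_result;     // average f() calls per g() call
};

Stats simulate(std::uint32_t seed, int trials)
{
    assert(trials > 0);
    BiasedSource src(seed);
    std::array<long, 4> count{};
    for (int t = 0; t < trials; ++t) ++count[g(src)];
    Stats s{};
    for (int b = 0; b < 4; ++b) s.freq[b] = double(count[b]) / trials;
    s.calls_per_result = double(src.calls) / trials;
    return s;
}
```

With a few hundred thousand trials, each of the four shares should sit close to 0.25. Note that `calls_per_result` counts f() calls per g() call, and each g() result carries two 50/50 values.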
If helpful, an intermediary to feed out 50/50 results one at a time:
```
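int h() // return 0 or 1 with probability 1/2 each
{
    static int i;  // 0 when empty; otherwise the last g() result with bit 2 set as a "pending" flag
    if (!i)
    {
        int x = g();
        i = x | 4;     // remember x; the flag keeps i nonzero even when x == 0
        return x & 1;  // hand out bit 0 first
    }
    else
    {
        int x = i & 2; // bit 1 of the remembered result
        i = 0;         // buffer is empty again
        return x ? 1 : 0;
    }
}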
```
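h() keeps its pending bit in a function-level static, so there is only one shared buffer per program. If that is a concern (for example in tests, or with several independent streams), the same idea can be sketched as a small object. `BitSplitter` and its `next()` method are hypothetical names; the 0..3 source is passed in as a callable, so g() or a test stub can be plugged in.

```cpp
#include <cassert>
#include <functional>
#include <utility>

// Hypothetical alternative to h(): hands out bit 0, then bit 1, of each
// value produced by a 0..3 source, keeping the pending bit per object.
class BitSplitter
{
public:
    explicit BitSplitter(std::function<int()> source) : source_(std::move(source)) {}

    int next()
    {
        if (!has_pending_)
        {
            int x = source_();
            assert(x >= 0 && x <= 3);
            pending_ = (x >> 1) & 1;  // save bit 1 for the following call
            has_pending_ = true;
            return x & 1;             // bit 0 goes out first, as in h()
        }
        has_pending_ = false;
        return pending_;
    }

private:
    std::function<int()> source_;
    int pending_ = 0;
    bool has_pending_ = false;
};
```

A caller could construct it as `BitSplitter bits([] { return g(); });` and then call `bits.next()` for each 50/50 value.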
NOTE: This can be further tweaked by having the algorithm switch from treating an f()==0 result as homing in on the lower quarter to treating it as homing in on the upper quarter instead, based on which on average resolves to a result bucket more quickly. Superficially, this seemed useful on the third call to f(), when an upper-quarter result would indicate an immediate result of 3, while a lower-quarter result still spans probability point 0.5 and hence results 1 and 2. When I tried it, the results were actually worse. A more complex tuning was needed to see actual benefits, and I ended up writing a brute-force comparison of lower vs upper cutoff for the second through eleventh calls to f(). The best result I found was an average of ~1.75, resulting from the 1st, 2nd, 5th and 8th calls to f() seeking low (i.e. setting low = cutoff).
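A brute-force comparison along these lines can be sketched by computing the expected number of f() calls exactly instead of sampling. The sketch below is one reading of the note, not the answer's actual tuning code: `zero_takes_upper[k]` (a made-up name) says whether, on call k (0-based), an f() == 0 result selects the upper quarter of the "possible x" space instead of the lower one, and calls beyond the end of the vector use the lower quarter. The recursion is cut off at `max_depth`, which slightly underestimates the true average.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// True when [low, high) lies inside a single bucket [b/4, (b+1)/4).
static bool resolved(double low, double high)
{
    for (int b = 0; b < 4; ++b)
        if (low >= 0.25 * b && high <= 0.25 * (b + 1)) return true;
    return false;
}

static double expected_from(double low, double high, std::size_t depth,
                            const std::vector<bool>& zero_takes_upper,
                            std::size_t max_depth)
{
    if (resolved(low, high) || depth == max_depth) return 0.0;
    double w = high - low;
    bool upper = depth < zero_takes_upper.size() && zero_takes_upper[depth];
    // f() == 0 (probability 1/4) takes a quarter of the space; which end
    // that quarter sits at depends on the policy for this call.
    double cutoff = upper ? high - 0.25 * w : low + 0.25 * w;
    double p_below = upper ? 0.75 : 0.25;
    return 1.0
         + p_below * expected_from(low, cutoff, depth + 1, zero_takes_upper, max_depth)
         + (1.0 - p_below) * expected_from(cutoff, high, depth + 1, zero_takes_upper, max_depth);
}

// Expected f() calls needed to produce one result in 0..3 under the policy.
double expected_calls(const std::vector<bool>& zero_takes_upper,
                      std::size_t max_depth = 40)
{
    assert(max_depth > 0);
    return expected_from(0.0, 1.0, 0, zero_takes_upper, max_depth);
}
```

With an empty policy (always the lower quarter) this should come out at roughly 3.7 calls per 0..3 result, which matches the ~1.85 figure if each result is counted as two 50/50 values. Candidate policies such as `expected_calls({false, false, true})` can then be compared against that baseline.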