Question
Algorithm challenge:
Problem statement: How would you design a logging system for something like Google? You should be able to query the number of times a URL was opened between two timestamps.
Input: start_time, end_time, URL1. Output: the number of times URL1 was opened between start_time and end_time.
Some specs: A database is not an optimal solution. A URL might have been opened multiple times at a given timestamp. A URL might have been opened a large number of times between two timestamps. start_time and end_time can be a month apart. Time can be granular to the second.
Answer 1:
One solution: a hash of hashes.
Key: URL. Value: a hash that maps each timestamp T to the cumulative frequency (CumFreq) of opens up to and including T.
Example:
Amazon Hash -->
T          CumFreq
11:00 am   3    (opened 3 times at 11:00 am)
11:15 am   4    (opened 1 time at 11:15 am, so CumFreq is 3+1=4)
11:30 am   11   (opened 7 times at 11:30 am, so CumFreq is 3+1+7=11)
Input: 11:10 am, 11:37 am, Amazon
The output is obtained by subtracting two cumulative frequencies: the one at the last timestamp before 11:10 am, which is 11:00 am (CumFreq 3), from the one at the last active timestamp before 11:37 am, which is 11:30 am (CumFreq 11). Hence the result is 11 - 3 = 8.
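The cumulative-frequency lookup above could be sketched roughly as follows. This is only an illustration under some assumptions that are not in the original post: the class name UrlHitLog and its methods are hypothetical, timestamps are plain integers (here, minutes since midnight), hits for a URL arrive in non-decreasing time order, and the query range is inclusive at both ends. The inner "hash" is kept as two parallel sorted lists so that the "last timestamp before X" lookup can use binary search.

```python
from bisect import bisect_left, bisect_right
from collections import defaultdict


class UrlHitLog:
    """Hypothetical sketch: per-URL sorted timestamps with cumulative hit counts."""

    def __init__(self):
        # url -> sorted list of distinct timestamps
        self._times = defaultdict(list)
        # url -> cumulative hit count at the matching index of _times[url]
        self._cum = defaultdict(list)

    def record(self, url, ts, hits=1):
        """Record hits at ts. Assumes ts is non-decreasing per URL."""
        times, cum = self._times[url], self._cum[url]
        if times and ts < times[-1]:
            raise ValueError("timestamps must be non-decreasing per URL")
        if times and times[-1] == ts:
            cum[-1] += hits  # same timestamp seen again: bump its total
        else:
            times.append(ts)
            cum.append((cum[-1] if cum else 0) + hits)

    def count(self, url, start, end):
        """Return the number of hits with start <= ts <= end."""
        times = self._times.get(url)
        if not times or start > end:
            return 0
        cum = self._cum[url]
        hi = bisect_right(times, end)   # entries with ts <= end
        lo = bisect_left(times, start)  # entries with ts < start
        upto_end = cum[hi - 1] if hi else 0
        before_start = cum[lo - 1] if lo else 0
        return upto_end - before_start


# The Amazon example, with times written as minutes since midnight.
log = UrlHitLog()
log.record("Amazon", 11 * 60 + 0, hits=3)   # 11:00 am
log.record("Amazon", 11 * 60 + 15, hits=1)  # 11:15 am
log.record("Amazon", 11 * 60 + 30, hits=7)  # 11:30 am
print(log.count("Amazon", 11 * 60 + 10, 11 * 60 + 37))  # 11 - 3 = 8
```

In this sketch each query costs two binary searches, so it is logarithmic in the number of distinct timestamps stored for the URL, regardless of how many hits fall inside the range.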
Can we do better?
Source: https://stackoverflow.com/questions/14824189/information-retrieval-url-hits-in-a-time-frame