I was recently asked to create a program to find the best matches in a text fragment. I have successfully written this program, but I do have a question about its time complexity.
No, you can't. In Big-O notation, a function `m` is an upper bound on the actual time your algorithm takes to run only if there is a constant `M` such that the real running time is always less than or equal to `M*m`. Take a case where the document has size zero (an empty document) but someone queries it with a positive number of characters. The upper bound in this case would be 0 (plus a constant), but the actual time the program takes to run may be greater than that. So your program cannot be said to be `O(m)`.
In other words, "most cases" isn't enough: you must prove that your algorithm will perform within that upper bound in all cases.
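For illustration, here is a minimal sketch of what such a matcher might look like (`find_matches` is a hypothetical stand-in, not code from the question): even when the document is empty, the loop over the search terms still executes `n` times, so the running time cannot be bounded by `M*m` alone.

```python
def find_matches(document, terms):
    """Hypothetical matcher: returns (term, position) pairs, overlaps allowed."""
    matches = []
    for term in terms:  # runs n times even when the document is empty (m = 0)
        start = document.find(term)
        while start != -1:
            matches.append((term, start))
            start = document.find(term, start + 1)  # advance by 1 to allow overlaps
    return matches
```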
Update: The same can be said for `p`: common sense says `p` is always smaller than `m`, but that's only true if the search terms don't overlap. Take for instance the document `aaaaaa` (`m = 6`) and the search terms `a`, `aa`, and `aaa` (`n = 3`). In this case there are 6 occurrences of `a`, 5 of `aa`, and 4 of `aaa`, so `p = 15`. Even though it's a very unlikely scenario (the same goes for the empty document), you still have to take `p` into account in your complexity analysis. So your program really must be described as `O(m + n + p)`, as you originally stated.
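To make the overlap point concrete, a small sketch (using a hypothetical `count_overlapping` helper, not code from the question) reproduces the numbers above:

```python
def count_overlapping(document, term):
    """Count occurrences of `term` in `document`, allowing overlaps."""
    count, start = 0, document.find(term)
    while start != -1:
        count += 1
        start = document.find(term, start + 1)
    return count

document = "aaaaaa"         # m = 6
terms = ["a", "aa", "aaa"]  # n = 3
p = sum(count_overlapping(document, t) for t in terms)
print(p)  # 6 + 5 + 4 = 15, so p can exceed m
```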