What's the rationale behind the formula used in the hive_trend_mapper.py
program of this Hadoop tutorial on calculating Wikipedia trends?
There are actuall
The code implements a statistical measure (in this case the "baseline trend"). Reading up on the underlying statistics should make everything clearer; Wikibooks has a good introduction.
The algorithm takes into account that new pages are, by definition, less popular than existing ones (for example, because relatively few other pages link to them yet) and assumes that those new pages will grow in popularity over time.
error is the error margin the system expects for its predictions. The higher error is, the less likely the trend is to continue as expected.
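
To make this concrete, here is a minimal sketch of what a baseline trend of this kind could look like. It is not copied from the tutorial: the function name baseline_trend, the log weighting by total pageviews, and the 1/sqrt(total) error term are my assumptions about how such a formula is typically built, so check them against the actual script.

```python
from math import log, sqrt


def baseline_trend(pageviews, total_pageviews):
    """Hypothetical baseline trend for one page (an illustration, not the tutorial's code).

    pageviews: daily pageview counts, oldest first.
    total_pageviews: all-time pageview count for the page.
    """
    if len(pageviews) < 2:
        raise ValueError("need at least two days of pageviews")
    if total_pageviews <= 0:
        raise ValueError("total_pageviews must be positive")

    # Day-over-day change between the two most recent days.
    slope = pageviews[-1] - pageviews[-2]

    # Assumed weighting: scale the change by the log of the page's total
    # traffic, so the same jump counts for more on a page with more history.
    trend = slope * log(1.0 + total_pageviews)

    # Assumed error margin: relative uncertainty of a count shrinks as
    # 1/sqrt(N), so pages with few total views get a large error.
    error = 1.0 / sqrt(total_pageviews)
    return trend, error


if __name__ == "__main__":
    # An established page versus a brand-new one.
    print(baseline_trend([100, 150], 10000))
    print(baseline_trend([1, 3], 4))
```

Under these assumptions the new page ends up with a much larger error than the established one, which matches the idea above that a trend based on very little data is less likely to continue as predicted.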