I was under the impression that combiners are just like reducers that act on the local map task. That is, it aggregates the results of an individual map task in order to reduce
The main function of a combiner is optimization. In most cases it acts like a mini-reducer. From page 206 of the same book, in the chapter "How MapReduce Works" (section "The Map Side"):
Running the combiner function makes for a more compact map output, so there is less data to write to local disk and to transfer to the reducer.
And the quote from your question:
If a combiner is specified it will be run during the merge to reduce the amount of data written to disk.
Both quotes indicate that a combiner is run primarily for compactness. Reducing the network bandwidth needed to transfer the map output is one advantage of this optimization.
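To make the compactness point concrete, here is a minimal sketch in plain Java with no Hadoop dependency. It only simulates the idea: the class `CombinerCompaction` and its `map` and `combine` methods are hypothetical names for illustration, not Hadoop APIs. In a real job the combiner would be registered on the job (for example with `job.setCombinerClass(...)`), and the framework would decide when to call it.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical simulation of a word-count map task, not a Hadoop API.
class CombinerCompaction {

    // Simulated map step: emit one (word, 1) record per word in the line.
    public static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> out = new ArrayList<>();
        for (String word : line.split("\\s+")) {
            out.add(new AbstractMap.SimpleEntry<>(word, 1));
        }
        return out;
    }

    // Simulated combiner: sum the values of each key locally, so that at
    // most one record per distinct key is left to spill and transfer.
    public static List<Map.Entry<String, Integer>> combine(
            List<Map.Entry<String, Integer>> records) {
        Map<String, Integer> sums = new LinkedHashMap<>();
        for (Map.Entry<String, Integer> r : records) {
            sums.merge(r.getKey(), r.getValue(), Integer::sum);
        }
        return new ArrayList<>(sums.entrySet());
    }

    public static void main(String[] args) {
        List<Map.Entry<String, Integer>> raw = map("to be or not to be");
        List<Map.Entry<String, Integer>> combined = combine(raw);
        // Prints: 6 records before combining, 4 after
        System.out.println(raw.size() + " records before combining, "
                + combined.size() + " after");
    }
}
```

The reducer would still see the same totals per word. The only difference is that fewer records are written to local disk and sent over the network.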
Also, from the same book:
Recall that combiners may be run repeatedly over the input without affecting the final result. If there are only one or two spills, then the potential reduction in map output size is not worth the overhead in invoking the combiner, so it is not run again for this map output.
This means that Hadoop doesn't guarantee how many times a combiner is run (it could even be zero).
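Because the number of invocations is not fixed, a combiner has to give the same final result whether it runs zero, one, or several times. The following plain-Java sketch (again a simulation, with the hypothetical class name `CombinerRepeatability` and made-up spill contents) shows a function that satisfies this, a sum, and one that does not, an arithmetic mean.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical simulation, not a Hadoop API: the values of a single key
// are assumed to be split across two spills, [1, 2, 3] and [4, 5].
class CombinerRepeatability {

    // Sum is associative and commutative, so partial sums can be re-summed.
    public static int sum(List<Integer> values) {
        int total = 0;
        for (int v : values) {
            total += v;
        }
        return total;
    }

    // A plain mean is not safe as a combiner: a mean of means differs from
    // the mean of all values when the groups have different sizes.
    public static double mean(List<Double> values) {
        double total = 0.0;
        for (double v : values) {
            total += v;
        }
        return total / values.size();
    }

    public static void main(String[] args) {
        // Combiner never runs: the reducer sees all raw values.
        int never = sum(Arrays.asList(1, 2, 3, 4, 5));
        // Combiner runs once per spill, then the reducer sums the partials.
        int once = sum(Arrays.asList(sum(Arrays.asList(1, 2, 3)),
                                     sum(Arrays.asList(4, 5))));
        // Combiner runs again during the merge before the reducer.
        int twice = sum(Arrays.asList(sum(Arrays.asList(
                sum(Arrays.asList(1, 2, 3)), sum(Arrays.asList(4, 5))))));
        System.out.println(never + " " + once + " " + twice); // 15 15 15

        double trueMean = mean(Arrays.asList(1.0, 2.0, 3.0, 4.0, 5.0));
        double meanOfMeans = mean(Arrays.asList(
                mean(Arrays.asList(1.0, 2.0, 3.0)),
                mean(Arrays.asList(4.0, 5.0))));
        System.out.println(trueMean + " vs " + meanOfMeans); // 3.0 vs 3.25
    }
}
```

A common workaround for the mean is to have the combiner emit partial (sum, count) pairs and let the reducer do the final division.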
A combiner is never run for map-only jobs. This makes sense because a combiner changes the map output, and in a map-only job the map output is the final output of the job. Also, since Hadoop doesn't guarantee the number of times the combiner is called, the map output would not be guaranteed to be the same either.