How can I find the number of commits and the current offset in each partition of a known Kafka topic? I am using Kafka v0.8.1.1.
It is not clear from your question what kind of offset you're interested in. There are actually three types of offsets:
In addition to the command-line utility, the offset information for #1 and #2 is also available via SimpleConsumer.earliestOrLatestOffset().
If the number of messages is not too large, you can pass a large --offsets value to GetOffsetShell and then count the number of lines returned by the tool. Otherwise, you can write a simple loop in Scala/Java that iterates over all available offsets, starting from the earliest.
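As an alternative to counting, here is a minimal Python sketch that estimates the number of messages per partition by subtracting the earliest offset from the latest one. It assumes GetOffsetShell prints one `topic:partition:offset` line per partition, that you ran the tool twice (once with `--time -2` for earliest and once with `--time -1` for latest, the conventional values for that option), and that offsets in the partition are contiguous, which does not hold for compacted topics. The topic name `my-topic`, the sample output, and the function names are made up for illustration.

```python
def parse_offsets(output):
    """Parse GetOffsetShell output lines of the assumed form
    'topic:partition:offset[,offset...]' into {partition: offset}."""
    offsets = {}
    for line in output.splitlines():
        line = line.strip()
        if not line:
            continue
        # The last two ':'-separated fields are the partition id and the offsets.
        _topic, partition, values = line.rsplit(":", 2)
        if not values:
            continue  # no offset was returned for this partition
        # With the default --offsets 1 there is exactly one value; take the first.
        offsets[int(partition)] = int(values.split(",")[0])
    return offsets


def message_counts(earliest_output, latest_output):
    """Return {partition: latest - earliest} for partitions present in both outputs."""
    earliest = parse_offsets(earliest_output)
    latest = parse_offsets(latest_output)
    return {p: latest[p] - earliest[p] for p in sorted(latest) if p in earliest}


# Hypothetical sample output for a two-partition topic named "my-topic".
earliest_sample = "my-topic:0:100\nmy-topic:1:0\n"
latest_sample = "my-topic:0:250\nmy-topic:1:42\n"

print(message_counts(earliest_sample, latest_sample))
```

The difference is only an estimate of what is currently retained on the broker; it says nothing about how far any particular consumer has read.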
From the Kafka documentation:
Get Offset Shell
get offsets for a topic
bin/kafka-run-class.sh kafka.tools.GetOffsetShell
required argument [broker-list], [topic]
Option          Description
------          -----------
--broker-list   port of the server to connect to.
--max-wait-ms   The max amount of time each fetch request waits. (default: 1000)
--offsets       number of offsets returned (default: 1)
--partitions    comma separated list of partition ids. If not specified, will find offsets for all partitions (default)
--time
--topic         REQUIRED: The topic to get offsets from.
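To show how these options fit together, here is a small Python sketch that assembles a GetOffsetShell command line and can run it via `subprocess`. It uses only the options listed above. The broker address `localhost:9092`, the topic `my-topic`, the `kafka_home` parameter, and the helper names are assumptions for illustration, and the `--time` values -1 (latest) and -2 (earliest) are the conventional ones, which the listing above does not spell out.

```python
import subprocess


def get_offset_shell_command(broker_list, topic, time=-1, partitions=None,
                             kafka_home="."):
    """Build the argument list for kafka.tools.GetOffsetShell.

    time: assumed convention is -1 for the latest offset, -2 for the earliest.
    partitions: optional iterable of partition ids; omitted means all partitions.
    kafka_home: hypothetical path to the Kafka installation directory.
    """
    cmd = [
        kafka_home + "/bin/kafka-run-class.sh",
        "kafka.tools.GetOffsetShell",
        "--broker-list", broker_list,
        "--topic", topic,
        "--time", str(time),
    ]
    if partitions:
        cmd += ["--partitions", ",".join(str(p) for p in partitions)]
    return cmd


def run_get_offset_shell(*args, **kwargs):
    """Run the tool and return its standard output as text (needs a Kafka install)."""
    return subprocess.check_output(get_offset_shell_command(*args, **kwargs)).decode()


# Print the command that would fetch the earliest offsets for every partition.
print(" ".join(get_offset_shell_command("localhost:9092", "my-topic", time=-2)))
```

Calling `run_get_offset_shell("localhost:9092", "my-topic")` against a reachable broker should return the tool's output for the latest offsets; building the argument list separately keeps the command easy to inspect before it is executed.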