Context: I'm not necessarily referring to a KCL-based application, just pure Kinesis API calls.
Does using the TRIM_HORIZON
shard iter
It's at the TRIM HORIZON, that is, the HORIZON where the stream TRIMming happens.
The shard iterator may return 0 records when called, so you'll need to keep iterating to reach the part of the shard where the oldest record is (this is common if you push to the stream infrequently or with time gaps). Each GetRecords call returns the next shard iterator, which you use for the following call.
From the docs: http://docs.aws.amazon.com/kinesis/latest/APIReference/API_GetRecords.html
If there are no records available in the portion of the shard that the iterator points to, GetRecords returns an empty list. Note that it might take multiple calls to get to a portion of the shard that contains records.
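A minimal sketch of that loop in Python, assuming a boto3-style Kinesis client (`get_shard_iterator` / `get_records`, with CamelCase response keys such as `NextShardIterator`). The function name `first_records_from_trim_horizon` and the `max_calls` and `pause_s` parameters are made up for illustration, not part of any SDK.

```python
import time


def first_records_from_trim_horizon(client, stream_name, shard_id,
                                    max_calls=100, pause_s=0.2):
    """Follow NextShardIterator from TRIM_HORIZON until a call returns records.

    `client` is assumed to behave like boto3.client("kinesis").
    Returns (records, next_iterator). `records` is [] if nothing was found
    within max_calls, or if the shard is closed and fully read.
    """
    iterator = client.get_shard_iterator(
        StreamName=stream_name,
        ShardId=shard_id,
        ShardIteratorType="TRIM_HORIZON",
    )["ShardIterator"]

    for _ in range(max_calls):
        response = client.get_records(ShardIterator=iterator, Limit=100)
        # Always move on to the iterator returned by the latest call.
        iterator = response.get("NextShardIterator")
        if response["Records"]:
            return response["Records"], iterator
        if iterator is None:
            # No next iterator: the shard is closed and has been read to the end.
            break
        # GetRecords is limited to 5 calls per second per shard.
        time.sleep(pause_s)
    return [], iterator
```

An empty list from one call is therefore not a signal to stop; only a missing next iterator (or your own call budget) is.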
TRIM_HORIZON points at the oldest record still in the stream.
Sometimes, though, a call with TRIM_HORIZON as the shard_iterator_type returns nothing. For example:
Suppose the value of "millis_behind_latest" in the Kinesis response is ~86399000 and your stream retention period is 24 hours (86400000 ms). The oldest record is then only about one second away from expiring.
By the time you use the shard_iterator to retrieve the record, its retention period may have been exceeded, so it is no longer in the stream. You get an empty result because the oldest record has expired, and the shard_iterator now points to an empty position in the shard.
When that happens, take the value of "next_shard_iterator" and call get_records again to fetch the Kinesis data records.
Also, we do not know exactly how AWS manages each shard in the data stream, or how data is erased from and added to it. Maybe the data is not stored in contiguous memory blocks, and that is why we get empty results in between retrievals of data.
Keep taking the value of "next_shard_iterator" and use get_records until you get a value of 0 for "millis_behind_latest".
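As a rough sketch, the "keep going until millis_behind_latest is 0" rule could look like this in Python. It assumes a boto3-style client, where the response keys are spelled `MillisBehindLatest` and `NextShardIterator` (other SDKs use the snake_case names quoted above). `read_until_caught_up` and `pause_s` are hypothetical names chosen for this example.

```python
import time


def read_until_caught_up(client, iterator, pause_s=0.2):
    """Yield records, following NextShardIterator until MillisBehindLatest is 0.

    `client` is assumed to behave like boto3.client("kinesis"), and
    `iterator` is a shard iterator obtained with TRIM_HORIZON.
    """
    while iterator is not None:
        response = client.get_records(ShardIterator=iterator, Limit=1000)
        # Empty batches are normal; they simply yield nothing.
        yield from response["Records"]
        if response["MillisBehindLatest"] == 0:
            break  # reached the tip of the shard
        iterator = response.get("NextShardIterator")
        # GetRecords is limited to 5 calls per second per shard.
        time.sleep(pause_s)
```

Because it is a generator, you can consume it with a plain `for record in read_until_caught_up(client, iterator):` loop and stop early if you only need the oldest few records.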
Hope this answer helps. :)