Max number of tuple replays on Storm Kafka Spout

别跟我提以往 2021-01-17 17:34

We’re using Storm with the Kafka Spout. When we fail messages, we’d like to replay them, but in some cases bad data or code errors will cause messages to always fail a Bolt,

5 Answers
  •  爱一瞬间的悲伤
    2021-01-17 18:14

    Basically it works like this:

    1. Topologies you deploy should be production grade (that is, a certain level of quality is expected, and the number of failed tuples should be low).
    2. If a tuple fails, check whether the tuple is actually valid.
    3. If the tuple is valid (for example, it could not be inserted because the connection to an external database failed, or something similar), replay it.
    4. If the tuple is malformed and can never be handled (for example, a database id that is text when the database expects an integer), it should be acked; you will never be able to fix such a tuple or insert it into the database.
    5. New kinds of exceptions should be logged, along with the contents of the tuple itself. Check these logs and derive rules for validating tuples in the future, and eventually add code (ETL) to process them correctly.
    6. Don't log everything, otherwise your log files will be huge. Be very selective about what you log: the contents of the log files should be useful, not a pile of rubbish.
    7. Keep doing this, and eventually you will cover all cases.
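
The steps above could be sketched roughly as follows. This is only an illustration in plain Java with no Storm dependency: `TupleTriage`, `Decision`, `Sink` and `LOG` are hypothetical names, and treating `IOException` as "the external database is unreachable" is an assumption. In a real bolt, `ACK` and `DROP` would both end in `collector.ack(tuple)` and `REPLAY` in `collector.fail(tuple)`. Acking a tuple that hit an unknown exception (after logging it) is also an assumption; you may prefer to fail it instead.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Storm: decides what to do with one tuple.
final class TupleTriage {

    // ACK: processed. REPLAY: fail the tuple so the spout replays it.
    // DROP: ack the tuple even though it was not processed.
    enum Decision { ACK, REPLAY, DROP }

    // Stand-in for the external database write (assumption).
    interface Sink {
        void insert(long id) throws Exception;
    }

    // Stand-in for a real logger, so the sketch stays self-contained.
    static final List<String> LOG = new ArrayList<>();

    static Decision handle(String rawId, Sink sink) {
        final long id;
        try {
            // Steps 2 and 4: validate first. A non-numeric id can never be inserted.
            id = Long.parseLong(rawId);
        } catch (NumberFormatException e) {
            // Known bad data: drop it quietly (step 6, be selective about logging).
            return Decision.DROP;
        }
        try {
            sink.insert(id);
            return Decision.ACK;
        } catch (IOException e) {
            // Step 3: valid tuple, transient failure, so replay it.
            return Decision.REPLAY;
        } catch (Exception e) {
            // Step 5: a new kind of exception. Log it together with the tuple contents.
            LOG.add("unhandled " + e.getClass().getSimpleName() + " for id=" + rawId);
            return Decision.DROP;
        }
    }

    public static void main(String[] args) {
        System.out.println(handle("42", id -> { }));
        System.out.println(handle("abc", id -> { }));
        System.out.println(handle("42", id -> { throw new IOException("db down"); }));
    }
}
```

Each new entry that shows up in the log is a candidate for an extra validation rule at the top of `handle`, which is how step 7 plays out over time.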
