How to receive only the recent data in Event Hub

Submitted by 不打扰是莪最后的温柔 on 2019-12-11 14:59:05

Question


In Event Hub, I have a "sender" script and a "receiver" script that communicate with each other.

The issue I am facing is that the receiver returns the dataset I sent yesterday together with the one I just sent. I would like to limit the amount of data received, either by time period or by number of events.

The basic code for the receiver script is as follows:


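import time

from azure.eventhub import EventHubClient, Offset

# ADDRESS, USER and KEY are defined elsewhere in the script
CONSUMER_GROUP = "$default"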
OFFSET = Offset("-1")
PARTITION = "0"

total = 0
last_sn = -1
last_offset = "-1"
client = EventHubClient(ADDRESS, debug=False, username=USER, password=KEY)
try:
    receiver = client.add_receiver(
        CONSUMER_GROUP, PARTITION, prefetch=0, offset=OFFSET)
    client.run()
    start_time = time.time()
    batch = receiver.receive(timeout=100)

    for event_data in batch[-10:]:
        print("Received: {}".format(event_data.body_as_str(encoding='UTF-8')))
        total += 1

    end_time = time.time()
    client.stop()
    run_time = end_time - start_time
    print("Received {} messages in {} seconds".format(total, run_time))

except KeyboardInterrupt:
    pass
finally:
    client.stop()


Answer 1:


I found a solution that uses the offset to control where the reading of event data starts.

The first step is to get the offset of each event.

The code looks like this:

import logging
import time

from azure.eventhub import EventHubClient, Offset

logger = logging.getLogger("azure")

ADDRESS = "amqps://xxx.servicebus.windows.net/xxx"
USER = "RootManageSharedAccessKey"
KEY = "xxx"

CONSUMER_GROUP = "$default"

#first, set offset to -1 to read all the event data
OFFSET = Offset("-1")
PARTITION = "0"

total = 0
last_sn = -1
last_offset = "-1"
client = EventHubClient(ADDRESS, debug=False, username=USER, password=KEY)
try:
    receiver = client.add_receiver(
        CONSUMER_GROUP, PARTITION, prefetch=5000, offset=OFFSET)
    client.run()
    start_time = time.time()
    print("**begin receive**")
    for event_data in receiver.receive(timeout=100):
        last_offset = event_data.offset.value
        last_sn = event_data.sequence_number
        #here, we print out the offset of each event data
        print("Received: {}, last_offset: {}, last_sn: {}".format(event_data.body_as_str(encoding='UTF-8'),last_offset,last_sn))        
        total += 1

    end_time = time.time()
    client.stop()
    run_time = end_time - start_time
    print("Received {} messages in {} seconds".format(total, run_time))

except KeyboardInterrupt:
    pass
finally:
    client.stop()

After running it, the offset of each event is printed.

Now you know the offset of each event. Suppose you want the events from number 40 to number 53. The offset of number 40 is 237080, so change the starting offset to a value just below it, 237079, in this line of code: OFFSET = Offset("237079"). The receiver then starts with the first event after that offset, which is number 40.
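The arithmetic in this step can be captured in a small helper. This is only a sketch: `start_offset_for` is a hypothetical name, not part of the SDK, and it assumes that offsets are decimal strings and that the receiver treats the starting offset as exclusive, as the step above implies.

```python
def start_offset_for(first_wanted_offset):
    """Return the offset string just below the first event we want.

    Assumption: the receiver treats the starting offset as exclusive, so
    the event stored at `first_wanted_offset` is the first one delivered.
    """
    return str(int(first_wanted_offset) - 1)


# Offset of event number 40, taken from the output of the first run.
print(start_offset_for("237080"))
```

The returned string is what would be passed to Offset(...) in the script.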

The modified code looks like this:

import logging
import time

from azure.eventhub import EventHubClient, Offset

logger = logging.getLogger("azure")

ADDRESS = "amqps://xxx.servicebus.windows.net/xx"
USER = "RootManageSharedAccessKey"
KEY = "xxx"

CONSUMER_GROUP = "$default"

#set the offset
OFFSET = Offset("237079")
PARTITION = "0"

total = 0
last_sn = -1
last_offset = "-1"
client = EventHubClient(ADDRESS, debug=False, username=USER, password=KEY)
try:
    receiver = client.add_receiver(
        CONSUMER_GROUP, PARTITION, prefetch=5000, offset=OFFSET)
    client.run()
    start_time = time.time()
    print("**begin receive**")
    for event_data in receiver.receive(timeout=100):
        last_offset = event_data.offset.value
        last_sn = event_data.sequence_number
        print("Received: {}, last_offset: {}, last_sn: {}".format(event_data.body_as_str(encoding='UTF-8'),last_offset,last_sn))        
        total += 1

    end_time = time.time()
    client.stop()
    run_time = end_time - start_time
    print("Received {} messages in {} seconds".format(total, run_time))

except KeyboardInterrupt:
    pass
finally:
    client.stop()

After running this code, only the events after the specified offset are returned.
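Since the question also asks about limiting by time period or by number of events, one further option is to trim the received batch on the client side. The sketch below is not from the original answer. `FakeEvent` and `keep_recent` are hypothetical names used only for illustration, and the sketch assumes each received event exposes its enqueue time as a datetime in an `enqueued_time` attribute; check the attribute name in the SDK version you use.

```python
from collections import namedtuple
from datetime import datetime, timedelta

# Stand-in for a received event; hypothetical, for illustration only.
FakeEvent = namedtuple("FakeEvent", ["sequence_number", "enqueued_time"])


def keep_recent(events, since=None, max_events=None):
    """Keep events enqueued at or after `since`, then only the last `max_events`."""
    kept = [e for e in events if since is None or e.enqueued_time >= since]
    if max_events is not None:
        # A slice of [-0:] would return everything, so handle 0 explicitly.
        kept = kept[-max_events:] if max_events > 0 else []
    return kept


# 48 fake events, one per hour, the oldest 48 hours before `now`.
now = datetime(2019, 10, 22, 12, 0, 0)
batch = [FakeEvent(i, now - timedelta(hours=48 - i)) for i in range(48)]

# Keep at most 10 events from the last 24 hours.
recent = keep_recent(batch, since=now - timedelta(hours=24), max_events=10)
print([e.sequence_number for e in recent])
```

This filters after the events have already been downloaded, so it reduces what the script processes but not what is transferred. Starting from a later offset, as in the answer above, avoids the transfer as well.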



Source: https://stackoverflow.com/questions/58497804/how-to-receive-the-recent-data-only-in-event-hub
