How to add an integer unique id to query results - __efficiently__?

面向向阳花 2021-01-25 17:08

Given a query, select * from ... (which might be part of a CTAS statement)

The goal is to add an additional column, ID, where ID is a

4 answers
  •  南方客 (OP)
    2021-01-25 17:45

    Check this solution from Manoj Kumar: https://github.com/manojkumarvohra/hive-hilo

    • A stateful UDF maintains HI/LO counters to generate the sequence values.
    • The HI value is stored as a distributed atomic long in ZooKeeper.
    • The HI value is incremented and fetched once every n LO iterations (default n = 200).
    • The UDF takes a single String argument, the sequence name, which identifies the zNodes kept in ZooKeeper.

    Usage:

    FunctionName( sequenceName, lowvalue[optional], seedvalue[optional])
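
The HI/LO idea described above can be sketched roughly as follows. This is only an illustration in Python, not the code of the linked project: `InMemoryHiStore` is a hypothetical stand-in for the ZooKeeper-backed atomic long, and `HiLoGenerator`, `lo_size` and `seed` are names assumed here for the example. The point it shows is that each task touches the shared counter only once per block of `lo_size` ids and hands out the rest locally.

```python
import threading


class InMemoryHiStore:
    """Hypothetical stand-in for the shared HI counter (ZooKeeper in the linked project)."""

    def __init__(self):
        self._values = {}
        self._lock = threading.Lock()
        self.fetches = 0  # counts how often the shared counter was touched

    def next_hi(self, sequence_name):
        # Atomically return the current HI for this sequence and advance it by one.
        with self._lock:
            hi = self._values.get(sequence_name, 0)
            self._values[sequence_name] = hi + 1
            self.fetches += 1
            return hi


class HiLoGenerator:
    """Hands out unique integers, fetching a new HI only every lo_size calls."""

    def __init__(self, store, sequence_name, lo_size=200, seed=0):
        self._store = store
        self._name = sequence_name
        self._lo_size = lo_size
        self._seed = seed
        self._hi = None
        self._lo = lo_size  # forces a HI fetch on the first call

    def next_id(self):
        if self._lo >= self._lo_size:
            # Local block exhausted: reserve the next block from the shared store.
            self._hi = self._store.next_hi(self._name)
            self._lo = 0
        value = self._seed + self._hi * self._lo_size + self._lo
        self._lo += 1
        return value


if __name__ == "__main__":
    store = InMemoryHiStore()
    task_a = HiLoGenerator(store, "demo_seq")
    task_b = HiLoGenerator(store, "demo_seq")
    ids = [task_a.next_id() for _ in range(3)] + [task_b.next_id() for _ in range(3)]
    print(ids)
```

With this scheme the ids are unique across tasks but not contiguous: a task that stops partway through its block leaves a gap, which is the price paid for avoiding a round trip to the shared counter on every row.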
    
