Sorted Table in Hive (ORC file format)
Question

I'm having some difficulty making sure I'm leveraging sorted data within a Hive table (using the ORC file format).

I understand we can affect how the data is read from a Hive table by declaring a DISTRIBUTE BY clause in the create DDL.

    CREATE TABLE trades (
        trade_id INT,
        name STRING,
        contract_type STRING,
        ts INT
    )
    PARTITIONED BY (dt STRING)
    CLUSTERED BY (trade_id) SORTED BY (trade_id, ts) INTO 8 BUCKETS
    STORED AS ORC;

This will mean that every time I make a query to this table, the data