问题
I have a table which contains 251M records and size is 2.5gb. I created a partition on two columns which I am doing condition in predicate. But the explain plan is not showing it is reading partition even though I have partitioned. With selecting from partition column, I am inserting to another table.
Is there a particular order I have to mention the condition in predicate ? How should I improve performance.
explain
SELECT
'123' AS run_session_id
, tbl1.transaction_id
, tbl1.src_transaction_id
, tbl1.transaction_created_epoch_time
, tbl1.currency
, tbl1.event_type
, tbl1.event_sub_type
, tbl1.estimated_total_cost
, tbl1.actual_total_cost
, tbl1.tfc_export_created_epoch_time
, tbl1.authorizer
, tbl1.acquirer
, tbl1.processor
, tbl1.company_code
, tbl1.country_of_account
, tbl1.merchant_id
, tbl1.client_id
, tbl1.ft_id
, tbl1.transaction_created_date
, tbl1.event_pst_time
, tbl1.extract_id_seq
, tbl1.src_type
, ROW_NUMBER() OVER(PARTITION by tbl1.transaction_id ORDER BY tbl1.event_pst_time DESC) AS seq_num -- while writing back to the pfit events table, write each event so that event_pst_time populates in right way
FROM db.xx_events tbl1 --<hiveFinalDB>-- -- DB variables wont work, so need to change the DB accrodingly for testing and PROD deployment
WHERE id_seq >= 215
AND id_seq <= 275
AND vent in('SPT','PNR','PNE','PNER','ACT','NTE');
Now how do I improve performance. The partition column is (id_seq,vent)
回答1:
explain dependency select ...
Demo
create table mytable (i int)
partitioned by (dt date)
;
alter table mytable add
partition (dt=date '2017-06-18')
partition (dt=date '2017-06-19')
partition (dt=date '2017-06-20')
partition (dt=date '2017-06-21')
partition (dt=date '2017-06-22')
;
explain dependency
select *
from mytable
where dt >= date '2017-06-20'
;
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Explain |
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| {"input_tables":[{"tablename":"local_db@mytable","tabletype":"MANAGED_TABLE"}],"input_partitions":[{"partitionName":"local_db@mytable@dt=2017-06-20"},{"partitionName":"local_db@mytable@dt=2017-06-21"},{"partitionName":"local_db@mytable@dt=2017-06-22"}]} |
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
"input_partitions":[
{"partitionName":"local_db@mytable@dt=2017-06-20"},{"partitionName":"local_db@mytable@dt=2017-06-21"},{"partitionName":"local_db@mytable@dt=2017-06-22"}]}
来源:https://stackoverflow.com/questions/44712384/hive-explain-plan-not-showing-partition