What is the purpose of $CONDITIONS under --query?

有些话、适合烂在心里 submitted on 2020-01-04 05:51:23

Question


I am using the Cloudera QuickStart edition, CDH 5.7.

I ran the following command in a terminal window:

sqoop import \
  --connect "jdbc:mysql://quickstart.cloudera:3306/retail_db" \
  --username=retail_dba \
  --password=cloudera \
  --query="select * from orders join order_items on orders.order_id = order_items.order_item_order_id where \$CONDITIONS" \
  --target-dir /user/cloudera/order_join \
  --split-by order_id \
  --num-mappers 4

Q: What is the purpose of $CONDITIONS? Why is it used in this query? Can anybody explain it to me?


Answer 1:


Sqoop uses $CONDITIONS internally to rewrite the query, both to fetch metadata and to split the import across tasks.

To fetch metadata, Sqoop replaces $CONDITIONS with 1 = 0, so the query returns no rows but still exposes the column names and types:

select * from table where 1 = 0

To fetch all the data with a single mapper, Sqoop replaces $CONDITIONS with 1 = 1:

select * from table where 1 = 1
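The two substitutions above amount to plain text replacement of the placeholder. Here is a minimal Python sketch of that idea; it is not Sqoop's actual code, and the helper name `substitute_conditions` is hypothetical, made up for illustration.

```python
# Illustrative only: mimics the textual substitution described above.
# substitute_conditions is a hypothetical helper, not part of Sqoop.
PLACEHOLDER = "$CONDITIONS"


def substitute_conditions(query: str, predicate: str) -> str:
    """Return the query with the $CONDITIONS placeholder replaced by predicate."""
    if PLACEHOLDER not in query:
        # A free-form query passed to --query must contain the token.
        raise ValueError("free-form query must contain $CONDITIONS")
    return query.replace(PLACEHOLDER, predicate)


query = "select * from table where $CONDITIONS"

# Always-false predicate: no rows come back, only the result-set metadata.
metadata_query = substitute_conditions(query, "1 = 0")

# Always-true predicate: a single mapper reads every row.
full_query = substitute_conditions(query, "1 = 1")

print(metadata_query)
print(full_query)
```

The same placeholder therefore serves both purposes, which is why the token has to appear in the WHERE clause of the query.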

With multiple mappers, Sqoop replaces $CONDITIONS with a range predicate so that each mapper fetches a subset of the data from the RDBMS.

For example, suppose id ranges from 1 to 100 and we are using 4 mappers. Each mapper then runs one of these queries:

SELECT * FROM table WHERE id >= 1 AND id < 25
SELECT * FROM table WHERE id >= 25 AND id < 50
SELECT * FROM table WHERE id >= 50 AND id < 75
SELECT * FROM table WHERE id >= 75 AND id <= 100
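The per-mapper queries above can be reproduced with a short Python sketch. It takes the split boundaries as given (1, 25, 50, 75, 100, as in the example) and does not try to show how Sqoop computes them. The function name `range_queries` is hypothetical, and the real splitter may pick slightly different boundaries.

```python
# Illustrative only: builds one query per mapper from given split boundaries.
# range_queries is a hypothetical helper, not Sqoop's implementation.
def range_queries(query, column, boundaries):
    """Return one query per consecutive pair of boundaries.

    Every range is half-open (lo <= column < hi) except the last one,
    which includes its upper bound so the maximum value is not lost.
    """
    queries = []
    last = len(boundaries) - 2  # index of the final range
    for i, (lo, hi) in enumerate(zip(boundaries, boundaries[1:])):
        upper_op = "<=" if i == last else "<"
        predicate = f"{column} >= {lo} AND {column} {upper_op} {hi}"
        queries.append(query.replace("$CONDITIONS", predicate))
    return queries


queries = range_queries(
    "SELECT * FROM table WHERE $CONDITIONS",
    "id",
    [1, 25, 50, 75, 100],  # boundaries taken from the example above
)

for q in queries:
    print(q)
```

Because the ranges do not overlap and together cover every id from 1 to 100, the four mappers import each row exactly once.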


Source: https://stackoverflow.com/questions/42330986/what-is-the-purpose-of-conditions-under-query
