Airflow BigQueryOperator: how to save query result in a partitioned Table?

Backend · Unresolved · 4 answers · 2021
长情又很酷 2021-01-01 00:23

I have a simple DAG

from airflow import DAG
from airflow.contrib.operators.bigquery_operator import BigQueryOperator

with DAG(dag_id='my_dags.my_dag') as dag:
    ...

How can I make the BigQueryOperator write the query result into a specific partition of a partitioned table?

4 Answers
  •  孤城傲影
    2021-01-01 01:08

    You first need to create an Empty partitioned destination table. Follow instructions here: link to create an empty partitioned table
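    The linked instructions amount to creating the destination table with daily, ingestion-time partitioning before the DAG runs. A DDL sketch (the table name and schema are placeholders; match the columns to what your SELECT produces):

    ```sql
    -- Placeholder schema: replace with the columns your query outputs.
    CREATE TABLE my_dataset.my_table (
      id INT64,
      payload STRING
    )
    PARTITION BY _PARTITIONDATE;  -- daily, ingestion-time partitioning
    ```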

    Then run the Airflow pipeline below again. You can try this code:

    import datetime

    from airflow import DAG
    from airflow.contrib.operators.bigquery_operator import BigQueryOperator
    from airflow.operators.dummy_operator import DummyOperator

    # Evaluated at DAG-parse time; yields e.g. my_dataset.my_table$20210101,
    # where the '$YYYYMMDD' decorator targets a single daily partition.
    today_date = datetime.datetime.now().strftime("%Y%m%d")
    table_name = 'my_dataset.my_table' + '$' + today_date

    with DAG(dag_id='my_dags.my_dag') as dag:
        start = DummyOperator(task_id='start')
        end = DummyOperator(task_id='end')
        sql = """
             SELECT *
             FROM `another_dataset.another_table`
              """
        bq_query = BigQueryOperator(
            task_id='bq_query',
            bql=sql,
            # destination_dataset_table is a templated field; the Jinja
            # expression below is rendered from params at runtime.
            destination_dataset_table='{{ params.t_name }}',
            params={'t_name': table_name},
            bigquery_conn_id='my_bq_connection',
            use_legacy_sql=False,
            write_disposition='WRITE_TRUNCATE',
            create_disposition='CREATE_IF_NEEDED',
        )
        start >> bq_query >> end
    

    So what I did was create a dynamic table-name variable and pass it to the BigQuery operator.
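    The table-name trick can be sketched on its own. One caveat worth noting: `datetime.now()` runs when the DAG file is parsed, not per task run (the table name below is a placeholder):

    ```python
    import datetime

    # Placeholder table name; the '$YYYYMMDD' decorator addresses a single
    # daily (ingestion-time) partition of the destination table.
    today = datetime.datetime.now().strftime("%Y%m%d")
    table_name = "my_dataset.my_table" + "$" + today
    print(table_name)

    # Caveat: computed at DAG-parse time, so backfilled runs would still
    # target "today's" partition. Airflow's templated {{ ds_nodash }} macro
    # tracks the execution date instead, e.g.
    # destination_dataset_table="my_dataset.my_table${{ ds_nodash }}"
    ```

    Since `destination_dataset_table` is a templated field on `BigQueryOperator`, the `{{ ds_nodash }}` form needs no `params` at all.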
