Question
Is there any way to schedule a DAG so that it is triggered right after a Google Sheet is updated?
I could not find an answer in this doc: https://airflow.readthedocs.io/en/latest/_api/airflow/providers/google/suite/hooks/sheets/index.html
Answer 1:
@Alejandro's direction is right; this just expands on his answer. You can use HttpSensor to send a GET request for the sheet's file through the Google Drive API:
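from airflow.providers.http.sensors.http import HttpSensor  # Airflow 2 provider path

HttpSensor(
    task_id='http_sensor_check',
    http_conn_id='http_default',
    # Replace fileId with the ID of the sheet's file.
    endpoint='https://www.googleapis.com/drive/v3/files/fileId',
    # Drive v3 returns modifiedTime only when it is requested via fields.
    request_params={'fields': 'modifiedTime'},
    # last_time_stored is the previously seen modifiedTime, defined elsewhere.
    response_check=lambda response: response.json()['modifiedTime'] > last_time_stored,
    poke_interval=5,  # seconds between checks
    dag=dag,
)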
Per the files resource documentation, the response can include modifiedTime (Drive v3 returns it when it is requested through the fields parameter), which you can read inside response_check:
response_check=lambda response: response.json()['modifiedTime'] > last_time_stored
You can replace this lambda with a function that reads the last stored value from your DB, cache, etc.
Trigger right after: chain the next operator after this sensor so that it runs only once the sensor succeeds.
Note: poke_interval (in seconds) depends on the use case, i.e. how often you want to check for modifications.
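One way to avoid a hard-coded last_time_stored is to build response_check from a loader function. The following is a minimal sketch, assuming the last seen timestamp is kept somewhere you can read (a database row, a cache, an Airflow Variable). The names make_response_check, parse_rfc3339 and load_last_seen are hypothetical and not part of Airflow. It parses the timestamps instead of comparing raw strings, so values with and without fractional seconds still compare correctly.

```python
from datetime import datetime, timezone


def parse_rfc3339(value):
    """Parse a timestamp such as '2020-10-01T12:34:56.789Z' into an aware datetime."""
    # datetime.fromisoformat() rejects a trailing 'Z' before Python 3.11.
    if value.endswith("Z"):
        value = value[:-1] + "+00:00"
    parsed = datetime.fromisoformat(value)
    # Treat a timestamp without an offset as UTC (an assumption of this sketch).
    if parsed.tzinfo is None:
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed


def make_response_check(load_last_seen):
    """Return a callable that can be passed as response_check.

    load_last_seen is a hypothetical zero-argument function returning the
    previously stored modifiedTime string, or None if nothing is stored yet.
    """
    def response_check(response):
        modified = parse_rfc3339(response.json()["modifiedTime"])
        last_seen = load_last_seen()
        if last_seen is None:
            # First run: nothing to compare against, so report a change.
            return True
        return modified > parse_rfc3339(last_seen)

    return response_check
```

It would be passed to the sensor as response_check=make_response_check(my_loader), where my_loader is your own function. The sensor does not store the new timestamp itself, so a downstream task would need to write it back after a successful run.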
Answer 2:
You can use the HTTP operator together with the Google Drive API files.get endpoint: https://developers.google.com/drive/api/v3/reference/files/get
You can also write your own implementation; see WebHDFSHook and WebHDFSSensor for reference.
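The core of such a custom implementation is small. The sketch below is plain Python rather than an Airflow class: it builds the files.get request and decides whether the sheet changed, and the same logic could be moved into the poke method of a BaseSensorOperator subclass. The class name SheetModifiedCheck, the injectable fetch callable and the token handling are assumptions for illustration; in a real deployment the OAuth access token would come from an Airflow connection or hook.

```python
import json
import urllib.request
from urllib.parse import quote

DRIVE_FILES_URL = "https://www.googleapis.com/drive/v3/files/"


def build_request(file_id, access_token):
    """Build a files.get request that asks only for modifiedTime."""
    url = DRIVE_FILES_URL + quote(file_id, safe="") + "?fields=modifiedTime"
    return urllib.request.Request(
        url, headers={"Authorization": "Bearer " + access_token}
    )


def fetch_json(request):
    """Send the request and decode the JSON body (performs network I/O)."""
    with urllib.request.urlopen(request) as resp:
        return json.loads(resp.read().decode("utf-8"))


class SheetModifiedCheck:
    """Hypothetical helper: True once modifiedTime moves past last_seen."""

    def __init__(self, file_id, access_token, last_seen, fetch=fetch_json):
        self.file_id = file_id
        self.access_token = access_token
        self.last_seen = last_seen  # timestamp string saved by a previous run
        self.fetch = fetch          # injectable so the logic can run offline

    def poke(self):
        payload = self.fetch(build_request(self.file_id, self.access_token))
        # Assumes both timestamps came from Drive in the same UTC format,
        # so string order matches chronological order.
        return payload["modifiedTime"] > self.last_seen
```

Keeping the network call behind the fetch argument means the comparison can be exercised with a stub before it is wired into a sensor.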
Source: https://stackoverflow.com/questions/64151674/airflow-trigger-dag-anytime-after-a-google-sheet-is-being-updated