Question
I need to build a Data Studio dashboard that uses data from a BigQuery (BQ) dataset.
I imported my data into BQ from an on-premises MS SQL Server using Data Fusion. The requirement is that I delete the last 5 days of records and then import the updated records for the same time range on top of the existing records in the BQ dataset.
So far I have been able to do all of this work with the pipeline, but each time I run the pipeline it appends the data to the BQ table again, and I end up with duplicate data.
I am looking for a way to manipulate the data in BQ before it receives new data from the pipeline. Is there anything available in Data Fusion that can help with this?
Regards
Answer 1:
We recently added this functionality to the google-cloud plugins. You can check the changes in Google-Cloud-Plugin PR#140. You can either wait for the next version of the google-cloud plugins to be released, or build the plugin locally and install it in the Data Fusion instance you are testing with.
Hope this helps.
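Until a plugin release with that change is available, one possible workaround is to delete the last 5 days of rows directly in BigQuery before the pipeline runs. This is not taken from the PR above; it is a minimal Python sketch under stated assumptions. The table name `my_project.my_dataset.orders` and the column `order_date` are hypothetical, the column is assumed to be of type DATE, and `delete_recent_rows` assumes the `google-cloud-bigquery` client library is installed and credentials are configured.

```python
import re

# Conservative patterns for identifiers, so the values can be safely
# interpolated into the SQL text.
_TABLE = re.compile(r"^[A-Za-z0-9_\-]+(\.[A-Za-z0-9_\-]+){1,2}$")
_COLUMN = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def build_delete_recent_sql(table: str, date_column: str, days: int = 5) -> str:
    """Build a BigQuery Standard SQL DELETE for rows from the last `days` days.

    Assumes `date_column` is a DATE column (an assumption, not from the post).
    """
    if not _TABLE.match(table):
        raise ValueError(f"unexpected table name: {table!r}")
    if not _COLUMN.match(date_column):
        raise ValueError(f"unexpected column name: {date_column!r}")
    if not isinstance(days, int) or days <= 0:
        raise ValueError("days must be a positive integer")
    return (
        f"DELETE FROM `{table}` "
        f"WHERE {date_column} >= DATE_SUB(CURRENT_DATE(), INTERVAL {days} DAY)"
    )


def delete_recent_rows(table: str, date_column: str, days: int = 5) -> None:
    """Run the DELETE statement and wait for the query job to finish."""
    # Imported here so the SQL builder can be used without the client library.
    from google.cloud import bigquery

    client = bigquery.Client()
    client.query(build_delete_recent_sql(table, date_column, days)).result()


if __name__ == "__main__":
    # Hypothetical table and column names, for illustration only.
    print(build_delete_recent_sql("my_project.my_dataset.orders", "order_date"))
```

Calling `delete_recent_rows(...)` just before each pipeline run would clear the 5-day window, so the appended rows replace the deleted ones instead of duplicating them.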
Source: https://stackoverflow.com/questions/57697074/possible-to-modify-or-delete-rows-from-a-table-in-bigquery-dataset-with-a-cloud