How to append files in GCS with the same schema?

Posted by 不问归期 on 2021-01-28 10:33:07

Question


Is there a way to append two files in GCS? Suppose the first file is a full load and the second file is an incremental load. How can the two be appended?
Also, gsutil compose appends the two files as they are, including the attribute names. In the final file I want only the data of the two files.


Answer 1:


You can append two separate files with gsutil compose from Google Cloud Shell, writing the output back under the name of the first file, like this:

gsutil compose gs://bucket/obj1 [gs://bucket/obj2 ...] gs://bucket/obj1

This command is meant for parallel uploads, in which you split a large file into smaller objects, upload them to Google Cloud Storage, and then compose them to get the original file back. Because compose concatenates the objects as they are, it does not remove the repeated attribute names. You can find more information under Composite Objects and Parallel Uploads.
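To make the behaviour described above concrete, here is a minimal local sketch in Python. It does not call GCS at all: `split_into_parts` and `compose` are hypothetical helper names, and `compose` only imitates the ordered concatenation that gsutil compose performs, assuming the objects are plain CSV text with a header row.

```python
def split_into_parts(data: bytes, part_size: int) -> list[bytes]:
    """Split data into consecutive parts of at most part_size bytes."""
    if part_size <= 0:
        raise ValueError("part_size must be positive")
    return [data[i:i + part_size] for i in range(0, len(data), part_size)]


def compose(parts: list[bytes]) -> bytes:
    """Local stand-in for compose: join the parts in the order given."""
    return b"".join(parts)


# Parallel-upload case: the parts of one file compose back into that file.
original = b"id,name\n1,alice\n2,bob\n3,carol\n"
parts = split_into_parts(original, part_size=8)
restored = compose(parts)

# Full load plus incremental load: the header row ends up in the result twice.
full_load = b"id,name\n1,alice\n"
incremental_load = b"id,name\n2,bob\n"
combined = compose([full_load, incremental_load])
header_count = combined.count(b"id,name\n")
```

The second case is the situation from the question: the composed object is valid as a byte stream, but it is no longer a clean table, which is why the solutions below read and rewrite the data instead.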

I've come up with two possible solutions:

Google Cloud Function solution

The option I would go for is a Cloud Function, along these lines:

  1. Create an empty bucket like append_bucket.
  2. Upload the first file.
  3. Create a Cloud Function that is triggered by newly uploaded files in the bucket.
  4. Upload the second file.
  5. Read the first and the second file (you will have to download them as strings first).
  6. Perform the append operation.
  7. Upload the result to the bucket.
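Steps 5 and 6 above could look roughly like the following sketch. It assumes the files are CSV with a single header row holding the attribute names, which the question suggests but does not state outright. The function name `append_csv` is hypothetical, and the download and upload around it are left out so that the snippet runs locally.

```python
import csv
import io


def append_csv(full_text: str, incremental_text: str) -> str:
    """Append the incremental rows to the full load, keeping one header row.

    Assumption: both inputs are CSV text whose first row is the header.
    """
    full_rows = [r for r in csv.reader(io.StringIO(full_text, newline="")) if r]
    inc_rows = [r for r in csv.reader(io.StringIO(incremental_text, newline="")) if r]

    if not full_rows:
        return incremental_text
    if not inc_rows:
        return full_text
    if full_rows[0] != inc_rows[0]:
        # Same schema is a precondition in the question; fail loudly otherwise.
        raise ValueError("header rows differ, refusing to append")

    out = io.StringIO(newline="")
    writer = csv.writer(out, lineterminator="\n")
    writer.writerows(full_rows)      # header plus the full-load rows
    writer.writerows(inc_rows[1:])   # incremental rows without their header
    return out.getvalue()


merged = append_csv("id,name\n1,alice\n", "id,name\n2,bob\n")
```

Inside a Cloud Function the two strings would come from the bucket, for example via `download_as_text()` on the blobs in the google-cloud-storage client, and the result would go back with `upload_from_string()`. Note that writing the result into the same bucket fires the upload trigger again, so the function would need to recognise and skip its own output object.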

Google Dataflow solution

You can also do it with Dataflow for BigQuery (keep in mind that, at the time of writing, it was still in beta).

  1. Create a BigQuery dataset and table.
  2. Create a Dataflow job from the Cloud Storage Text to BigQuery template.
  3. Create a JavaScript file with the logic to transform the text.
  4. Upload your files in JSON format to the bucket.
  5. Dataflow will read the JSON files, execute the JavaScript code and append the new data to the BigQuery table.
  6. Finally, export the BigQuery query result to Cloud Storage.


Source: https://stackoverflow.com/questions/53487432/how-to-append-files-in-gcs-with-the-same-schema
