Is there a BigQuery feature that can move a whole dataset to another project without copying the data?

无人及你 2021-02-08 16:04

As our project grew, we realized at some point that we needed to create new projects and reorganize our datasets. One case is that we need to isolate one dataset from others in

4 Answers
  • 2021-02-08 16:23

    There's no built-in feature, but I helped write an open-source tool that will do this for you: https://github.com/uswitch/big-replicate.

    It lets you synchronise or copy tables between projects, or between datasets within the same project. Most of the details are in the project's README, but for reference an invocation looks a little like:

    java -cp big-replicate-standalone.jar \
      uswitch.big_replicate.sync \
      --source-project source-project-id \
      --source-dataset 98909919 \
      --destination-project destination-project-id \
      --destination-dataset 98909919
    

    You can set options that control how many tables to copy, how many jobs run concurrently, and where to store the intermediate data in Cloud Storage. The destination dataset must already exist, but this means you can also copy data between locations (US, EU, Asia, etc.).

    Binaries are built on CircleCI and published to GitHub releases.
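Since the destination dataset must already exist, a small pre-flight step along these lines may help. This is only a sketch and not part of big-replicate: `ensure_dataset` is a hypothetical helper, the dataset ID and location are placeholders, and it assumes the `bq` CLI is installed and authenticated.

```shell
# Hypothetical helper: create the destination dataset only if it is missing.
ensure_dataset() {
  dataset=$1    # placeholder, e.g. destination-project-id:my_dataset
  location=$2   # placeholder, e.g. EU
  # "bq show" exits non-zero when the dataset cannot be found.
  if bq show --dataset "$dataset" >/dev/null 2>&1; then
    echo "exists: $dataset"
  else
    # --location is a global flag, so it goes before the "mk" command.
    bq --location="$location" mk --dataset "$dataset"
  fi
}
```

It could be called as `ensure_dataset destination-project-id:my_dataset EU` before starting the sync.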

  • 2021-02-08 16:29

    A short shell script that copies all tables from one dataset to another:

    SOURCE_DATASET=$1  # e.g. project1:dataset
    DEST_PREFIX=$2     # e.g. project2:dataset2.any_prefix_
    # Keep only rows whose Type column is TABLE; the first column is the table ID.
    for f in $(bq ls "$SOURCE_DATASET" | awk '$2 == "TABLE" {print $1}')
    do
      # Print each command before running it.
      echo "bq cp $SOURCE_DATASET.$f $DEST_PREFIX$f"
      bq cp "$SOURCE_DATASET.$f" "$DEST_PREFIX$f"
    done
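After a copy loop like this, it may be worth checking that every source table actually has a counterpart before touching the original. The following is a rough sketch under a few assumptions: `list_tables` and `missing_tables` are hypothetical helpers, the destination dataset and the table prefix are passed separately (unlike the combined `DEST_PREFIX` above), and the `--max_results` value is an arbitrary ceiling. It compares table names only, not contents.

```shell
# Hypothetical helper: print one table ID per line for a dataset
# such as project1:dataset.
list_tables() {
  # --max_results raises the listing limit; 10000 is an arbitrary ceiling.
  bq ls --max_results=10000 "$1" | awk '$2 == "TABLE" {print $1}'
}

# Hypothetical helper: print each source table that has no
# correspondingly named table in the destination dataset.
missing_tables() {
  src=$1      # e.g. project1:dataset
  dst=$2      # e.g. project2:dataset2
  prefix=$3   # e.g. any_prefix_ (may be empty)
  dst_tables=$(list_tables "$dst")
  for t in $(list_tables "$src"); do
    # -x matches whole lines, -F treats the name as a fixed string.
    printf '%s\n' "$dst_tables" | grep -qxF "$prefix$t" || echo "$t"
  done
}
```

An empty result from `missing_tables project1:dataset project2:dataset2 any_prefix_` would suggest that all tables were at least created on the destination side.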
    
  • 2021-02-08 16:32

    Nope, there's currently no move or rename operation in BigQuery. The best way to move your data is to copy it and delete the original.

    Follow-up answer: Your batch request created the copy jobs, but you need to wait for them to complete and then observe the result. You can use the BigQuery web UI or run "bq ls -j" from the command line to see recent jobs.
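Waiting for the copy jobs can also be scripted. The sketch below assumes you already have the job IDs (for example from `bq ls -j`); `wait_for_jobs` is a hypothetical helper, and it relies on `bq wait` returning a non-zero status when a job fails, which I believe is the default behaviour.

```shell
# Hypothetical helper: block on each job ID in turn and stop at the
# first job that bq reports as failed.
wait_for_jobs() {
  for job_id in "$@"; do
    if bq wait "$job_id"; then
      echo "done: $job_id"
    else
      echo "failed: $job_id" >&2
      return 1
    fi
  done
}
```

Only once every job has finished and the copies look right would it make sense to delete the original, for instance with `bq rm -r -d project1:dataset` (placeholder dataset ID).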

  • 2021-02-08 16:32

    You can first copy the BigQuery dataset to the new project, then delete the original dataset.

    The copy dataset UI is similar to the one for copying a table. Click the "Copy dataset" button on the source dataset and specify the destination dataset in the pop-up form. See the screenshots below, and check the public documentation for more use cases.

    [Screenshot: Copy dataset button]

    [Screenshot: Copy dataset form]
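The same dataset copy can, as far as I know, also be started from the command line through the BigQuery Data Transfer Service. Treat the following as a sketch to check against the current documentation: `copy_dataset` is a hypothetical wrapper, all IDs are placeholders, the destination dataset is assumed to exist already, and the flag and parameter names are given as I understand the documented form of `bq mk --transfer_config`.

```shell
# Hypothetical wrapper around a dataset-copy transfer configuration.
copy_dataset() {
  src_project=$1   # placeholder, e.g. project1
  src_dataset=$2   # placeholder, e.g. dataset
  dst_project=$3   # placeholder, e.g. project2
  dst_dataset=$4   # placeholder, e.g. dataset2 (must already exist)
  # cross_region_copy is the data source used for dataset copies.
  bq mk --transfer_config \
    --project_id="$dst_project" \
    --data_source=cross_region_copy \
    --target_dataset="$dst_dataset" \
    --display_name="copy $src_dataset" \
    --params="{\"source_project_id\":\"$src_project\",\"source_dataset_id\":\"$src_dataset\"}"
}
```

A call such as `copy_dataset project1 dataset project2 dataset2` would then create the transfer; deleting the original remains a separate, manual step.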
