BigQuery - remove unused column from schema

囚心锁ツ 2021-02-13 17:45

I accidentally added a wrong column to my BigQuery table schema.

Instead of reloading the complete table (millions of rows), I would like to know if the following is poss

3 answers
  •  悲&欢浪女
    2021-02-13 18:30

    If your table does not contain RECORD or REPEATED fields, the simplest option is:

    1. Select only the valid columns into a new temp table, filtering out bad records along the way:

      SELECT < list of original columns >
      FROM YourTable
      WHERE < filter to remove bad entries here >

      Write the result to a temp table, YourTable_Temp

    2. Make a backup copy of the "broken" table as YourTable_Backup
    3. Delete YourTable
    4. Copy YourTable_Temp to YourTable
    5. Check that everything looks as expected, and if so, delete the temp and backup tables
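
    The steps above can be sketched as a small Python helper that builds the query for step 1 and the matching `bq` command lines for steps 1 to 4. This is only an illustration: the dataset name `mydataset`, the column names, the filter, and the helper functions themselves are hypothetical, and the script only builds strings, it does not run anything against BigQuery.

```python
import shlex


def build_cleanup_query(table, columns, bad_column, row_filter=None):
    """Build a SELECT that keeps every column except the unwanted one."""
    kept = [c for c in columns if c != bad_column]
    if len(kept) == len(columns):
        raise ValueError(f"{bad_column!r} is not in the column list")
    select_list = ", ".join(f"`{c}`" for c in kept)
    query = f"SELECT {select_list} FROM `{table}`"
    if row_filter:
        # Optional filter to drop bad rows at the same time.
        query += f" WHERE {row_filter}"
    return query


def plan_commands(dataset, table, query):
    """Return bq command lines for steps 1-4 (step 5 is left to a manual check)."""
    full = f"{dataset}.{table}"
    return [
        # Step 1: write the cleaned rows to the temp table.
        "bq query --use_legacy_sql=false "
        f"--destination_table={full}_Temp {shlex.quote(query)}",
        # Step 2: back up the "broken" table.
        f"bq cp {full} {full}_Backup",
        # Step 3: delete the original table.
        f"bq rm -t {full}",
        # Step 4: copy the temp table back under the original name.
        f"bq cp {full}_Temp {full}",
    ]


# Hypothetical table layout, for illustration only.
query = build_cleanup_query(
    table="mydataset.YourTable",
    columns=["id", "name", "wrong_col", "created_at"],
    bad_column="wrong_col",
    row_filter="id IS NOT NULL",
)
commands = plan_commands("mydataset", "YourTable", query)

for command in commands:
    print(command)
```

    Printing the commands instead of executing them leaves room to review each one, which matters because step 3 is destructive. The temp and backup tables should only be removed after the check in step 5.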

    Please note: step 1 costs exactly the same as the action in the first bullet of your question. The remaining actions (the copies) are free.

    If you do have REPEATED/RECORD fields, you can still follow the plan above, but in step 1 you will need some BigQuery User-Defined Functions to produce the proper schema in the output.
    See the examples below. This will of course require some extra development, but if you are in a critical situation it should work for you:

    Create a table with Record type column
    create a table with a column type RECORD

    I hope that at some point the Google BigQuery team will add better support for cases like yours, where you need to manipulate and output repeated/record data. For now, this is the best workaround I have found, at least for myself.
