BigQuery - remove unused column from schema

后端未结

关注

 3  781

囚心锁ツ

I accidentally added a wrong column to my BigQuery table schema.

Instead of reloading the complete table (million of rows), I would like to know if the following is poss

相关标签:

3条回答

鱼传尺愫

2021-02-13 18:16
Below is the code to do it. Lets say c is the column that you wants to delete.
```
CREATE OR REPLACE TABLE transactions.test_table AS
SELECT * EXCEPT (c) FROM  transactions.test_table;
```
Or second method and my favorite is by following below steps.
1. Write Select query with the columns you want to exclude.
2. Go to Query Settings Query Settings
3. In Destination setting Set destination table for query results, enter project name, Dataset name and table name exactly same as you entered in Step 1.
4. In Destination table write preference select Overwrite table. Destination table settings
5. Save the Query Setting and run the query.
0 讨论(0)
发布评论:

提交评论
- 加载中...
小鲜肉

2021-02-13 18:19

Save results to table is your way to go. Try on the big table with the selected columns you are interested, and you can apply a limit to make it small.

0 讨论(0)
发布评论:

提交评论
- 加载中...
悲&欢浪女

2021-02-13 18:30
If your table does not consist of record/repeated type fields - your simple option is:
1. Select valid columns while filtering out bad records into new temp table
  
  SELECT < list of original columns >
  FROM YourTable
  WHERE < filter to remove bad entries here >
  
  Write above to temp table - YourTable_Temp
2. Make a backup copy of "broken" table - YourTable_Backup
3. Delete YourTable
4. Copy YourTable_Temp to YourTable
5. Check if all looks as expected and if so - get rid of temp and backup tables
Please note: the cost of above #1 is exactly the same as action in first bullet in your question. The rest of actions (copy) are free

In case if you have repeated/record fields - you still can execute above plan, but in #1 you will need to use some BigQuery User-Defined Functions to have proper schema in output
You can see below for examples - of course this will require some extra dev - but if you are in critical situation - this should work for you

Create a table with Record type column
create a table with a column type RECORD

I hope, at some point Google BigQuery Team will add better support for cases like yours when you need to manipulate and output repeated/record data, but for now this is a best workaround I found - at least for myself
0 讨论(0)
发布评论:

提交评论
- 加载中...