Automate file upload from Google Cloud Storage to BigQuery

暖寄归人 2021-01-13 10:56

We have an automated FTP process set up which imports a data file into Google Cloud Storage daily.

I would like to set up a daily automated job that uploads this CSV file into a BigQuery table.

2 Answers
  • 2021-01-13 11:15

    A background Cloud Function with a Cloud Storage trigger is your best choice!

    You can set it to monitor a specific bucket for new files and execute a load script whenever the trigger fires.

    Forgot to mention: Cloud Functions support (as of now) only Node.js for scripting, which is usually not a problem, but I just wanted to mention it :o)
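
    A minimal sketch of such a load function, assuming current versions of the @google-cloud/bigquery and @google-cloud/storage client libraries and the newer (file, context) background-function signature; the dataset and table names are hypothetical placeholders:

    const { BigQuery } = require('@google-cloud/bigquery');
    const { Storage } = require('@google-cloud/storage');

    const bigquery = new BigQuery();
    const storage = new Storage();

    // Fires once per new (or overwritten) object in the trigger bucket.
    exports.loadFile = async (file, context) => {
      // 'my_dataset' and 'my_table' are hypothetical placeholders.
      const [job] = await bigquery
        .dataset('my_dataset')
        .table('my_table')
        .load(storage.bucket(file.bucket).file(file.name), {
          sourceFormat: 'CSV',
          skipLeadingRows: 1,
          autodetect: true,          // infer the schema from the file
          writeDisposition: 'WRITE_APPEND'
        });
      console.log(`Load job ${job.id} finished for gs://${file.bucket}/${file.name}`);
    };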

  • 2021-01-13 11:17

    Similar approach to Mikhail's answer, but with more code:

    With Google Cloud Functions you can automate a BigQuery load each time you receive a new file:

    • Create a new function at https://console.cloud.google.com/functions/add
    • Point "bucket" to the one receiving files.

    Codewise, declare the BigQuery client library as a dependency in package.json:

    {
      "dependencies": {
        "@google-cloud/bigquery": "^0.9.6"
      } 
    }
    

    And in index.js you can act on the new file in any appropriate way:

    var BigQuery = require('@google-cloud/bigquery');
    var bigQuery = BigQuery({ projectId: 'your-project-id' });
    
    // Background function entry point: fires once per new file in the bucket.
    exports.processFile = (event, callback) => {
      console.log('Processing: ' + JSON.stringify(event.data));
      query(event.data);
      callback();
    };
    
    function query(data) {
        const filename = data.name.split('/').pop();
        const full_filename = `gs://${data.bucket}/${data.name}`;
    
        // if you want to run a query:
        const sqlQuery = '...';
        bigQuery.query({
            query: sqlQuery,
            useLegacySql: false
        });
    }
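
    The '...' query above is left for you to fill in. As a hedged illustration only, one way to use full_filename inside that function is a parameterized query that records each arriving file in an audit table (the my_dataset.file_audit table is hypothetical, and named query parameters require a recent client version and standard SQL):

    // Hypothetical audit query; assumes my_dataset.file_audit exists.
    bigQuery.query({
        query: 'INSERT INTO my_dataset.file_audit (uri, loaded_at) ' +
               'VALUES (@uri, CURRENT_TIMESTAMP())',
        params: { uri: full_filename },
        useLegacySql: false
    });

    If the goal is simply to load the file into a table rather than query it, the same call can be swapped for a load job, as sketched under the first answer.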
    