Using external data sources in BQ with specific generation from Google Storage

梦想的初衷 submitted on 2021-01-28 05:40:30

Question


I want to use an external data source in a BigQuery SELECT statement that reads a specific generation of a file in Google Cloud Storage rather than the latest one.

I currently use the following:

val sourceFile = "gs://test-bucket/flights.csv"
val queryConfig = QueryJobConfiguration.newBuilder(query)
    .addTableDefinition(
        "tmpTable",
        ExternalTableDefinition.newBuilder(sourceFile, schema, format)
            .setCompression("GZIP")
            .build()
    )
    .build()
bigQuery.query(queryConfig)

I tried to set the sourceFile variable as follows:

val sourceFile = "gs://test-bucket/flights.csv#123456789"

But that leads to the following error:

Caused by: com.google.api.client.googleapis.json.GoogleJsonResponseException: 400 Bad Request
{
  "code" : 400,
  "errors" : [ {
    "domain" : "global",
    "message" : "The query specified one or more federated data sources but not all of them were scanned. It usually indicates incorrect uri specification or a 'limit' clause over a union of federated data sources that was satisfied without having to read all sources.",
    "reason" : "invalid"
  } ],
  "message" : "The query specified one or more federated data sources but not all of them were scanned. It usually indicates incorrect uri specification or a 'limit' clause over a union of federated data sources that was satisfied without having to read all sources.",
  "status" : "INVALID_ARGUMENT"
}

When I do not use a generation, it works fine. I also verified with gsutil stat gs://test-bucket/flights.csv#123456789 that the file exists with this generation.

Is it possible to specify a generation here?

Source: https://stackoverflow.com/questions/57492661/using-external-data-sources-in-bq-with-specific-generation-from-google-storage
