Question
I have tried two approaches with the bigrquery package. The first is
library(bigrquery)
library(DBI)
con <- dbConnect(
  bigrquery::bigquery(),
  project = "YOUR PROJECT ID HERE",
  dataset = "YOUR DATASET"
)
test <- dbGetQuery(con, sql, n = 10000, max_pages = Inf)
and
sql <- "YOUR LARGE QUERY HERE" # the long query is saved as a view; this selects from it
tb <- bigrquery::bq_project_query(project, sql) # project holds the project ID
bq_table_download(tb, max_results = 1000)
Both fail with the error "Error: Requested Resource Too Large to Return [responseTooLarge]".
There is a potentially related issue here, but I am interested in any tool that gets the job done. I have already tried the solutions outlined here, and they failed.
How can I load large datasets to R from BigQuery?
Answer 1:
As @hrbrmstr suggested, the documentation specifically mentions:
> #' @param page_size The number of rows returned per page. Make this smaller
> #'   if you have many fields or large records and you are seeing a
> #'   'responseTooLarge' error.
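A minimal sketch of that advice, assuming a bigrquery version whose bq_table_download() accepts the page_size argument quoted above. The helper name download_in_small_pages and the default of 500 rows per page are illustrative choices, not values from the documentation; tune the page size to how wide your rows are.

```r
# Hypothetical helper: run a query and download the result in small pages.
# `project` is the project ID to bill and `sql` is the query text.
download_in_small_pages <- function(project, sql, page_size = 500) {
  # Run the query; BigQuery stores the result in a temporary table.
  tb <- bigrquery::bq_project_query(project, sql)
  # Fewer rows per page keeps each API response smaller, which is what
  # the documentation recommends when 'responseTooLarge' appears.
  bigrquery::bq_table_download(tb, page_size = page_size)
}

# Usage (needs BigQuery credentials, so it is left commented out):
# df <- download_in_small_pages("YOUR PROJECT ID HERE", "YOUR LARGE QUERY HERE")
```

If the error persists, lowering page_size further is the first thing to try before switching to an export.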
In this documentation from r-project.org you will find different advice in the explanation of this function (page 13):
This retrieves rows in chunks of page_size. It is most suitable for results of smaller queries (<100 MB, say). For larger queries, it is better to export the results to a CSV file stored on google cloud and use the bq command line tool to download locally.
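The export route could be sketched roughly as follows. This is an assumption-heavy outline rather than a tested recipe: it assumes bigrquery's bq_table_save() for the export step, a Cloud Storage bucket you can write to, and the gsutil command-line tool on your PATH for the local copy (the quoted text mentions the bq tool; gsutil is swapped in here for copying files out of a bucket). The helper name export_and_read and the bq-export- file prefix are hypothetical.

```r
# Hypothetical helper: export a query result to Cloud Storage as CSV shards,
# copy the shards to a local directory, and read them into one data frame.
export_and_read <- function(project, sql, bucket, local_dir = tempdir()) {
  # Run the query; BigQuery stores the result in a temporary table.
  tb <- bigrquery::bq_project_query(project, sql)
  # The "*" wildcard lets BigQuery split a large result across several files.
  uri <- sprintf("gs://%s/bq-export-*.csv", bucket)
  bigrquery::bq_table_save(tb, destination_uris = uri, destination_format = "CSV")
  # Copy the exported shards to the local machine (assumes gsutil is installed).
  status <- system2("gsutil", c("cp", shQuote(uri), shQuote(local_dir)))
  stopifnot(status == 0)
  files <- list.files(local_dir, pattern = "^bq-export-.*\\.csv$", full.names = TRUE)
  # Read each shard and stack the pieces row-wise.
  do.call(rbind, lapply(files, read.csv))
}

# Usage (needs credentials and a bucket, so it is left commented out):
# df <- export_and_read("YOUR PROJECT ID HERE", "YOUR LARGE QUERY HERE", "YOUR BUCKET")
```

Reading the shards with read.csv keeps the sketch dependency-free; for very large exports a faster CSV reader may be preferable.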
Answer 2:
I just started using BigQuery too. I think it should be something like this.
The current bigrquery release can be installed from CRAN:
install.packages("bigrquery")
The newest development release can be installed from GitHub:
install.packages('devtools')
devtools::install_github("r-dbi/bigrquery")
Usage
Low-level API
library(bigrquery)
billing <- bq_test_project() # replace this with your project ID
sql <- "SELECT year, month, day, weight_pounds FROM `publicdata.samples.natality`"
tb <- bq_project_query(billing, sql)
#> Auto-refreshing stale OAuth token.
bq_table_download(tb, max_results = 10)
DBI
library(DBI)
con <- dbConnect(
  bigrquery::bigquery(),
  project = "publicdata",
  dataset = "samples",
  billing = billing
)
con
#> <BigQueryConnection>
#> Dataset: publicdata.samples
#> Billing: bigrquery-examples
dbListTables(con)
#> [1] "github_nested" "github_timeline" "gsod" "natality"
#> [5] "shakespeare" "trigrams" "wikipedia"
dbGetQuery(con, sql, n = 10)
dplyr
library(dplyr)
natality <- tbl(con, "natality")
natality %>%
  select(year, month, day, weight_pounds) %>%
  head(10) %>%
  collect()
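For the 'responseTooLarge' error from the question, the DBI route may also benefit from a smaller page size. The sketch below assumes that your bigrquery version's dbConnect() method accepts a page_size argument; the helper name connect_small_pages and the default of 1000 rows are illustrative only.

```r
# Hypothetical helper: open a DBI connection that fetches results in small pages.
connect_small_pages <- function(project, dataset, billing = project,
                                page_size = 1000L) {
  DBI::dbConnect(
    bigrquery::bigquery(),
    project = project,
    dataset = dataset,
    billing = billing,
    page_size = page_size  # rows per page; lower it if responses are too large
  )
}

# Usage (needs BigQuery credentials, so it is left commented out):
# con <- connect_small_pages("publicdata", "samples", billing = billing)
# DBI::dbGetQuery(con, sql)
```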
Source: https://stackoverflow.com/questions/52138048/how-to-load-large-datasets-to-r-from-bigquery