问题
There's a multitude of ways to load Excel files in R, as reported here: http://www.milanor.net/blog/?p=779
However, I've tried most of the options out there (RODBC, xlsx, gdata, XLConnect) and just cannot make R load this particular file, published by the British government in 2013:
http://www.ons.gov.uk/ons/rel/npp/national-population-projections/2012-based-projections/rft-table-a3-4-principal-projection---england-population-single-year-of-age.xls
Here's an example of my failed attempts:
# save the file
download.file("http://www.ons.gov.uk/ons/rel/npp/national-population-projections/2012-based-projections/rft-table-a3-4-principal-projection---england-population-single-year-of-age.xls", destfile = "input-data/future-pop-ons.xls")
library(RODBC)
XLConnect::readWorksheetFromFile(file = "input-data/future-pop-ons.xls", sheet = 3)
## Error: OldExcelFormatException (Java): The supplied spreadsheet seems to be Excel 5.0/7.0 (BIFF5) format. POI only supports BIFF8 format (from Excel versions 97/2000/XP/2003)
library(XLConnect)
XLConnect::readWorksheet("input-data/future-pop-ons.xls", sheet = 3)
Error in (function (classes, fdef, mtable) :
unable to find an inherited method for function ‘readWorksheet’ for signature ‘"character", "numeric"’
library(gdata)
read.xls(xls = "input-data/future-pop-ons.xls", sheet = 3)
Use of uninitialized value $format in substitution (s///) at /home/robin/R/i686-pc-linux-gnu-library/3.1/gdata/perl/Spreadsheet/ParseExcel/Utility.pm line 183.
Would be great to be able to load this type of file directly, ensuring reproducibility.
回答1:
This works for me on a rocker/rstudio container, which I'm using for a high level of isolation and reproducibility:
download.file("http://www.ons.gov.uk/ons/rel/npp/national-population-projections/2012-based-projections/rft-table-a3-4-principal-projection---england-population-single-year-of-age.xls",
destfile = "future-pop-ons.xls",
method = "wget")
library(gdata)
xx <- read.xls(xls = "future-pop-ons.xls", sheet = 3, fileEncoding="latin1")
There's a lot of warning-like output in the console, but the full sheet reads in, and that's the main thing.
来源:https://stackoverflow.com/questions/26724923/impossibility-of-loading-a-excel-5-0-7-0-biff5-format-file-in-r