Question
When I use the following np.loadtxt code to load data in this format:
2017-07-26,153.3500,153.9300,153.0600,153.5000,153.5000,12778195.00
the data loads just fine. The loadtxt code:
a, b, c, d, e, f, g = np.loadtxt("goog.csv",
dtype={'names': ("b'Date", 'Open', 'High', 'Low', 'Close', 'Adjusted_close', 'Volume'),
'formats': ('U10', np.float, np.float, np.float, np.float, np.float, np.float)},
delimiter=',',
skiprows=1,
unpack=True)
print(a)
Output->
['2017-07-26' '2017-07-25' '2017-07-24' ..., '2000-01-05' '2000-01-04'
'2000-01-03']
Process finished with exit code 0
But the corresponding np.genfromtxt call raises ValueError: too many values to unpack. I used the following genfromtxt code:
a, b, c, d, e, f, g = np.genfromtxt('goog.csv',
dtype={'names': ("b'Date", 'Open', 'High', 'Low', 'Close', 'Adjusted_close', 'Volume'),
'formats': ('U10', np.float, np.float, np.float, np.float, np.float, np.float)},
delimiter=',',
skip_header=1,
unpack=True)
print(a)
Output->
Traceback (most recent call last):
File "C:/Users/sonika jha/PycharmProjects/csvCheck/csvCheck.py", line 84, in <module>
download_stock_data()
File "C:/Users/sonika jha/PycharmProjects/csvCheck/csvCheck.py", line 66, in download_stock_data
unpack=True)
ValueError: too many values to unpack (expected 7)
Process finished with exit code 1
My goal is to load the date as a string and the remaining columns as floats using genfromtxt.
Answer 1:
loadtxt and genfromtxt handle unpacking from structured data differently.
The loadtxt docs:
unpack : bool, optional
If True, the returned array is transposed, so that arguments may be unpacked using x, y, z = loadtxt(...). When used with a structured data-type, arrays are returned for each field. Default is False.
The genfromtxt docs:
unpack : bool, optional
If True, the returned array is transposed, so that arguments may be unpacked using x, y, z = loadtxt(...)
The loadtxt in this last quote is a typo. Note that this entry, unlike the loadtxt one, does not mention structured data-types.
If I replicate your sample line 3 times and run genfromtxt (with unpack=False), I get a (3,) array with the defined dtype:
In [327]: data
Out[327]:
array([('2017-07-26', 153.35, 153.93, 153.06, 153.5, 153.5, 12778195.),
('2017-07-26', 153.35, 153.93, 153.06, 153.5, 153.5, 12778195.),
('2017-07-26', 153.35, 153.93, 153.06, 153.5, 153.5, 12778195.)],
dtype=[('bDate', '<U10'), ('Open', '<f8'), ('High', '<f8'), ('Low', '<f8'), ('Close', '<f8'), ('Adjusted_close', '<f8'), ('Volume', '<f8')])
loadtxt produces the same thing.
But loadtxt with unpack ends up doing
a = data['bDate']
b = data['Open']
etc.
That is, it assigns one field to each of the variables.
But genfromtxt does
a = data[0]
b = data[1]
etc.
That is, it assigns one row (one element of the 1-D array) to each variable. With many more rows than your 7 variables, it complains about too many values to unpack.
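The row-versus-field distinction can be reproduced without reading any file. The following is a minimal sketch, assuming a hand-built structured array with a shortened, hypothetical three-field dtype (Date, Open, Close) and the sample row repeated three times. It only illustrates how plain Python unpacking treats a 1-D structured array; it is not the internal code of either loader.

```python
import numpy as np

# Hypothetical, shortened dtype standing in for the full 7-column layout.
dt = np.dtype([('Date', 'U10'), ('Open', 'f8'), ('Close', 'f8')])

# The sample row from the question, repeated three times.
row = ('2017-07-26', 153.35, 153.5)
data = np.array([row, row, row], dtype=dt)

# Plain unpacking iterates over the first axis, so each name gets one row.
r0, r1, r2 = data
print(r0)

# Unpacking by field instead gives one array per column.
date, open_, close = (data[name] for name in data.dtype.names)
print(date)

# With more rows than target names, row-wise unpacking raises ValueError.
message = None
try:
    a, b = data
except ValueError as exc:
    message = str(exc)
print(message)
```

The generator expression over data.dtype.names is one way to get the per-field behaviour explicitly, whichever loader produced the array.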
So either stick with loadtxt, or don't use unpack with genfromtxt.
I think loading the structured array without unpack gives you more options for further processing.
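As a rough sketch of that approach, the snippet below reads the data with genfromtxt and no unpack, then pulls columns out by field name. Several things here are assumptions for illustration: an in-memory io.StringIO stands in for goog.csv, the header row text is made up, the first field is named 'Date' rather than "b'Date", the formats use 'f8' instead of np.float (which recent NumPy releases no longer provide), and encoding='utf-8' is passed so the date column comes back as ordinary strings.

```python
import io
import numpy as np

# Stand-in for goog.csv: an assumed header row plus the sample line
# from the question, repeated three times.
header = "Date,Open,High,Low,Close,Adjusted_close,Volume\n"
line = "2017-07-26,153.3500,153.9300,153.0600,153.5000,153.5000,12778195.00\n"
csv_file = io.StringIO(header + line * 3)

dt = np.dtype([('Date', 'U10'), ('Open', 'f8'), ('High', 'f8'), ('Low', 'f8'),
               ('Close', 'f8'), ('Adjusted_close', 'f8'), ('Volume', 'f8')])

# No unpack: the result is a 1-D structured array, one record per row.
data = np.genfromtxt(csv_file, dtype=dt, delimiter=',',
                     skip_header=1, encoding='utf-8')
print(data.shape)

# Columns are addressed by field name.
dates = data['Date']
close = data['Close']
print(dates, close.mean())

# If seven separate variables are still wanted, unpack by field explicitly.
a, b, c, d, e, f, g = (data[name] for name in data.dtype.names)
```

Because the fields are split out explicitly, this should behave the same regardless of how the installed NumPy version treats unpack in genfromtxt.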
Source: https://stackoverflow.com/questions/51076408/np-loadtxt-vs-np-genfromtxt