Question
I'm learning how to download data from a URL and then plot it.
For this tutorial, I am using the Yahoo dataset for a stock.
The code is as follows:
import matplotlib.pyplot as plt
import numpy as np
import urllib.request  # urlopen lives in the urllib.request submodule
import matplotlib.dates as mdates
import datetime

def bytespdate2num(fmt, encoding='utf-8'):
    # Wrap the string-to-date-number converter so it accepts bytes
    strconverter = mdates.strpdate2num(fmt)
    def bytesconverter(b):
        s = b.decode(encoding)
        return strconverter(s)
    return bytesconverter

def graph_data(stock):
    stock_price_url = 'https://pythonprogramming.net/yahoo_finance_replacement'
    source_code = urllib.request.urlopen(stock_price_url).read().decode()
    stock_data = []
    split_source = source_code.split('\n')
    print(len(split_source))
    for line in split_source:
        split_line = line.split(',')
        # Keep only rows that have all 7 comma-separated fields
        if len(split_line) == 7:
            stock_data.append(line)
    date, openn, closep, highp, lowp, openp, volume = np.loadtxt(stock_data, delimiter=',', unpack=True, converters={0: bytespdate2num('%Y-%m-%d')})
    plt.plot_date(date, closep)
    plt.xlabel('x')
    plt.ylabel('y')
    plt.title('Graph')
    plt.show()

graph_data('TSLA')
The code is easy to follow except for the part that converts the date strings into a date format with the bytespdate2num function.
Is there an easier way to convert strings read from a URL into dates while loading them with numpy, or is there another method I can use?
Thank you
Answer 1:
Guessing at the CSV format, I can use numpy's 'native' datetime dtype:
In [183]: txt = ['2020-10-23 1 2.3']*3
In [184]: txt
Out[184]: ['2020-10-23 1 2.3', '2020-10-23 1 2.3', '2020-10-23 1 2.3']
If I let genfromtxt do its own dtype conversions:
In [187]: np.genfromtxt(txt, dtype=None, encoding=None)
Out[187]:
array([('2020-10-23', 1, 2.3), ('2020-10-23', 1, 2.3),
       ('2020-10-23', 1, 2.3)],
      dtype=[('f0', '<U10'), ('f1', '<i8'), ('f2', '<f8')])
The date column is rendered as a string.
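As a side note, a string column like this can also be converted after loading. The following is only a minimal sketch using the same made-up `txt` rows as above; the variable names `arr` and `dates` are introduced here for illustration.

```python
import numpy as np

# Same guessed sample rows as in the session above (whitespace-delimited).
txt = ['2020-10-23 1 2.3'] * 3

# Let genfromtxt infer the dtypes; the first field comes back as strings.
arr = np.genfromtxt(txt, dtype=None, encoding=None)

# Cast the ISO-formatted date strings to numpy's day-resolution datetime type.
dates = arr['f0'].astype('datetime64[D]')
print(dates.dtype)
```

This relies on the dates being in ISO `YYYY-MM-DD` form, which is what the `'%Y-%m-%d'` format in the question suggests.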
If I specify a datetime64 format:
In [188]: np.array('2020-10-23', dtype='datetime64[D]')
Out[188]: array('2020-10-23', dtype='datetime64[D]')
In [189]: np.genfromtxt(txt, dtype=['datetime64[D]',int,float], encoding=None)
Out[189]:
array([('2020-10-23', 1, 2.3), ('2020-10-23', 1, 2.3),
       ('2020-10-23', 1, 2.3)],
      dtype=[('f0', '<M8[D]'), ('f1', '<i8'), ('f2', '<f8')])
This date column appears to work with plt:
In [190]: plt.plot_date(_['f0'], _['f1'])
I used genfromtxt because I'm more familiar with its ability to handle dtypes.
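Applied to comma-separated rows like the ones collected in the question, the same idea might look roughly like the sketch below. The three-column layout, the field names, and the sample values are all assumptions made for illustration; the real feed has seven columns, so the dtype list would need one entry per column.

```python
import numpy as np

# Hypothetical "date,closep,volume" rows with made-up values.
rows = [
    '2020-10-21,1.5,100',
    '2020-10-22,2.5,200',
    '2020-10-23,3.5,300',
]

# Naming each field and giving the first one a datetime64 dtype lets
# genfromtxt build the date column itself, with no custom converter.
data = np.genfromtxt(
    rows,
    delimiter=',',
    dtype=[('date', 'datetime64[D]'), ('closep', float), ('volume', float)],
    encoding=None,
)

print(data['date'])
print(data['closep'])
```

The resulting `data['date']` and `data['closep']` columns can then be passed to the plotting call in the same way as `_['f0']` and `_['f1']` above.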
Source: https://stackoverflow.com/questions/60718336/converting-string-to-date-in-numpy-unpack