Converting string to date in numpy unpack

Submitted by 喜夏-厌秋 on 2020-04-17 20:27:40

Question


I'm learning how to extract data from URLs and then graph it.

For this tutorial, I am using a Yahoo dataset for a stock.

The code is as follows:


import matplotlib.pyplot as plt
import numpy as np
import urllib.request  # urlopen lives in the urllib.request submodule
import matplotlib.dates as mdates
import datetime

def bytespdate2num(fmt, encoding='utf-8'):
    strconverter = mdates.strpdate2num(fmt)
    def bytesconverter(b):
        s = b.decode(encoding)
        return strconverter(s)
    return bytesconverter


def graph_data(stock):
    stock_price_url = 'https://pythonprogramming.net/yahoo_finance_replacement'
    source_code = urllib.request.urlopen(stock_price_url).read().decode()

    stock_data = []
    split_source=source_code.split('\n')

    print(len(split_source))

    for line in split_source:
        split_line=line.split(',')
        if (len(split_line)==7):
            stock_data.append(line)


    date,openn,closep,highp,lowp,openp,volume=np.loadtxt(stock_data,delimiter=',',unpack=True,converters={0:bytespdate2num('%Y-%m-%d')})

    plt.plot_date(date,closep)
    plt.xlabel('x')
    plt.ylabel('y')
    plt.title('Graph')
    plt.show()

graph_data('TSLA')

The whole code is pretty easy to understand except for the part that converts the string data into a date format using the bytespdate2num function.

Is there an easier way to convert strings read from a URL into dates while numpy loads the data, or is there another method I can use?

Thank you


Answer 1:


Guessing at the CSV format, I can use numpy's 'native' datetime dtype:

In [183]: txt = ['2020-10-23 1 2.3']*3                                                                               
In [184]: txt                                                                                                        
Out[184]: ['2020-10-23 1 2.3', '2020-10-23 1 2.3', '2020-10-23 1 2.3']

If I let genfromtxt do its own dtype conversions:

In [187]: np.genfromtxt(txt, dtype=None, encoding=None)                                                              
Out[187]: 
array([('2020-10-23', 1, 2.3), ('2020-10-23', 1, 2.3),
       ('2020-10-23', 1, 2.3)],
      dtype=[('f0', '<U10'), ('f1', '<i8'), ('f2', '<f8')])

the date column is rendered as a string.
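As a minimal sketch of one way to deal with that string column after loading, it can be cast to datetime64 with astype. This reuses the made-up txt rows from above; the variable names arr and dates are only illustrative.

```python
import numpy as np

# Same made-up, whitespace-separated rows as in the transcript above.
txt = ['2020-10-23 1 2.3'] * 3

# Let genfromtxt infer the dtypes; the date field comes back as a string.
arr = np.genfromtxt(txt, dtype=None, encoding=None)

# Cast the string field to numpy's day-resolution datetime type.
dates = arr['f0'].astype('datetime64[D]')
print(dates.dtype)
```

This works for ISO-style dates such as '2020-10-23'; other date formats would likely need a converter instead.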

If I specify a datetime64 format:

In [188]: np.array('2020-10-23', dtype='datetime64[D]')                                                              
Out[188]: array('2020-10-23', dtype='datetime64[D]')

In [189]: np.genfromtxt(txt, dtype=['datetime64[D]',int,float], encoding=None)                                       
Out[189]: 
array([('2020-10-23', 1, 2.3), ('2020-10-23', 1, 2.3),
       ('2020-10-23', 1, 2.3)],
      dtype=[('f0', '<M8[D]'), ('f1', '<i8'), ('f2', '<f8')])

This date column appears to work with plt.plot_date:

In [190]: plt.plot_date(_['f0'], _['f1'])       

I used genfromtxt because I'm more familiar with its ability to handle dtypes.
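The same idea can be sketched for comma-separated rows with seven fields, which is the shape the question filters for. The sample values and the field names below (date, open, high, low, close, adj_close, volume) are assumptions for illustration, not the real layout of the data source.

```python
import numpy as np

# Hypothetical rows in a seven-column, comma-separated layout.
# The numbers are invented for this example.
lines = [
    "2020-01-02,10.0,11.0,9.5,10.5,10.5,1000",
    "2020-01-03,10.5,12.0,10.0,11.5,11.5,1500",
    "2020-01-06,11.5,11.8,10.9,11.0,11.0,1200",
]

# Assumed field names; only the first column is treated as a date.
dtype = [
    ("date", "datetime64[D]"),
    ("open", float),
    ("high", float),
    ("low", float),
    ("close", float),
    ("adj_close", float),
    ("volume", float),
]

# genfromtxt returns a structured array, so columns are picked by name
# rather than unpacked positionally.
data = np.genfromtxt(lines, delimiter=",", dtype=dtype, encoding=None)

dates = data["date"]
closes = data["close"]
print(dates, closes)
```

With real data, stock_data from the question would take the place of lines, and dates and closes could be passed to the plotting call in place of date and closep. Any header row would need to be skipped or filtered out first.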



Source: https://stackoverflow.com/questions/60718336/converting-string-to-date-in-numpy-unpack
