问题
So, I have a .csv file which updates itself. I would like to do some things with it and am not sure how to approach it, hope you can help me. The data in the csv looks like this: There is no headers. I can join the date and time to be in same column without a delimiter too.
07/12/2017,23:50,113.179,113.182,113.168,113.180,113.187,113.189,113.176,113.186,144
07/12/2017,23:51,113.180,113.190,113.180,113.187,113.186,113.196,113.186,113.193,175
07/12/2017,23:52,113.187,113.188,113.174,113.186,113.193,113.194,113.181,113.192,340
07/12/2017,23:53,113.186,113.192,113.175,113.181,113.192,113.199,113.182,113.188,282
07/12/2017,23:54,113.181,113.183,113.170,113.171,113.188,113.188,113.176,113.179,74
07/12/2017,23:55,113.171,113.181,113.170,113.179,113.179,113.188,113.176,113.186,329
07/12/2017,23:56,113.179,113.189,113.174,113.181,113.186,113.195,113.181,113.187,148
07/12/2017,23:57,113.181,113.181,113.169,113.169,113.187,113.187,113.175,113.175,55
07/12/2017,23:58,113.169,113.183,113.169,113.182,113.175,113.188,113.175,113.187,246
07/12/2017,23:59,113.182,113.210,113.175,113.203,113.187,113.215,113.181,113.209,378
08/12/2017,00:00,113.203,113.213,113.180,113.183,113.209,113.220,113.187,113.190,651
08/12/2017,00:01,113.183,113.190,113.164,113.167,113.190,113.196,113.171,113.174,333
08/12/2017,00:02,113.167,113.182,113.156,113.156,113.174,113.188,113.162,113.163,265
08/12/2017,00:03,113.156,113.165,113.151,113.163,113.163,113.172,113.158,113.170,222
08/12/2017,00:04,113.163,113.163,113.154,113.159,113.170,113.170,113.159,113.166,148
08/12/2017,00:05,113.159,113.163,113.153,113.154,113.166,113.168,113.159,113.162,162
For starters I would be interested in using just the first two (or 3 if date and time are separate) columns for this exercise. So for example:
07/12/2017,21:54,113.098
07/12/2017,21:55,113.096
07/12/2017,21:56,113.087
07/12/2017,21:57,113.075
07/12/2017,21:58,113.087
07/12/2017,21:59,113.079
New rows are being added with more recent date time every second or so. I can do something like
df = pd.read_csv("C:\\Users\\xxx\\Desktop\\csvexport\\thefile.csv")
print(df[-1:])
To see the last row (tail) from the dataframe
Now, I can't see how to do the following and appreciate your help:
Update the dataframe so that I have the most recent version up to date available to make calculations on when new rows appear (without using sleep timer?)
Be able to plot the data with the newly updating data being reflected in the plot automatically as new data arrives (datetime on x axis, float on y)
The output I see in the command window from the program generating the .csv file is like this, if that matters
asset 08/12/2017 05:16:37 float:113.336 floattwo:113.328 digit:20
asset 08/12/2017 05:16:40 float:113.334 floattwo:113.328 digit:21
asset 08/12/2017 05:16:40 float:113.335 floattwo:113.323 digit:22
asset 08/12/2017 05:16:41 float:113.331 floattwo:113.328 digit:23
asset 08/12/2017 05:16:43 float:113.334 floattwo:113.327 digit:24
asset 08/12/2017 05:16:47 float:113.332 floattwo:113.328 digit:25
So you can see the updates are not exactly one second apart, they can have gaps, and can sometimes occur within the same second too (05:16:40 twice)
Therefore, what I would like to happen is keep the plot at equal time intervals actually (1 minute, or 5 minutes, etc) but keep changing the most recent point according to the float vlaue in the .csv belonging to that minute. When a row with the next minute arrives, only then should the plot move to the right (but constantly fluctuate in value as the float number is changing)... Hope you get the idea. I would like to use pyqtgraph for the plot.
I managed to code this much... but it is not the greatest example, excuse me. Of course the plot is not meant to look like this. Just illustrating what I would like to see. So the green bar should be changing value constantly until the next time step is added to the csv
import pyqtgraph as pg
from pyqtgraph import QtCore, QtGui
import pandas as pd
import datetime
x = pd.read_csv("C:\\Users\\xxx\\Desktop\\csvexport\\thefile.csv")
z = x[-1:]
def getlastrow():
for a in z.iterrows():
d = ((int(((a[1][0]).split("/")[0]))))
m = ((int(((a[1][0]).split("/")[1]))))
y = ((int(((a[1][0]).split("/")[2]))))
hh = ((int(((a[1][1]).split(":")[0]))))
mm = ((int(((a[1][1]).split(":")[1]))))
#ss = ((int(((a[1][1]).split(":")[2]))))
thedate = datetime.date(y, m, d)
thetime = datetime.time(hh, mm)
p = (a[1][2])
return ((thedate,thetime,p))
# print(str(getlastrow()[0]).replace("-",""))
# print(getlastrow()[1])
# print(getlastrow()[2])
class CandlestickItem(pg.GraphicsObject):
def __init__(self):
pg.GraphicsObject.__init__(self)
self.flagHasData = False
def set_data(self, data):
self.data = data
self.flagHasData = True
self.generatePicture()
self.informViewBoundsChanged()
def generatePicture(self):
self.picture = QtGui.QPicture()
p = QtGui.QPainter(self.picture)
p.setPen(pg.mkPen('w'))
w = (self.data[1][0] - self.data[0][0]) / 2.
for (t, open) in self.data:
p.drawLine(QtCore.QPointF(t, open), QtCore.QPointF(t, open))
p.setBrush(pg.mkBrush('r'))
if open > 122.8:
p.setBrush(pg.mkBrush('g'))
p.drawRect(QtCore.QRectF(t-w, open, w*2, open))
p.end()
def paint(self, p, *args):
if self.flagHasData:
p.drawPicture(0, 0, self.picture)
def boundingRect(self):
return QtCore.QRectF(self.picture.boundingRect())
app = QtGui.QApplication([])
data = [
[(int(str(getlastrow()[0]).replace("-",""))), (getlastrow()[2])],
[(int(str(getlastrow()[0]).replace("-","")))+1, (getlastrow()[2])+0.1],
[(int(str(getlastrow()[0]).replace("-","")))+2, (getlastrow()[2])+0.2],
]
item = CandlestickItem()
item.set_data(data)
plt = pg.plot()
plt.addItem(item)
plt.setWindowTitle('pyqtgraph example: customGraphicsItem')
def update():
global item, data
new_bar = (int(str(getlastrow()[0]).replace("-","")))+3, ((getlastrow()[2])+10)
data.append(new_bar)
item.set_data(data)
app.processEvents()
timer = QtCore.QTimer()
timer.timeout.connect(update)
timer.start(100)
if __name__ == '__main__':
import sys
if (sys.flags.interactive != 1) or not hasattr(QtCore, 'PYQT_VERSION'):
QtGui.QApplication.instance().exec_()
回答1:
Hopefully the code below will help with point(1). I realise this is a partial answer. I tested using Linux. The code should be OS agnostic, but I have not tested this.
The code monitors the directory defined in TEST_DIR using the watchdog library. If the file defined in TEST_FILE is changed, then a message is sent from the event handling class called MyHandler to the main function. I put in some ugly time checking as each time a file is altered, multiple events are triggered. So only a single dispatch will be triggered for events occurring within THRESHOLD time. I set this to 0.01 s.
Add code to the dispatcher_receiver function to read in the updated file.
import ntpath
# pip3 install pydispatcher --user
from pydispatch import dispatcher
import sys
import time
from watchdog.observers import Observer
from watchdog.events import FileSystemEventHandler
MYHANDLER_SENDER = 'myhandler_sender'
MYHANDLER_SIGNAL = 'myhandler_signal'
TEST_FILE = 'test_data.csv'
TEST_DIR = '/home/bill/data/documents/infolab2/progs/jupyter_notebooks/pyqtgraph/test_data/'
THRESHOLD_TIME = 0.01
class MyHandler(FileSystemEventHandler):
''' handle events from the file system '''
def __init__(self):
self.start_time = time.time()
def on_modified(self, event):
now_time = time.time()
# filter out multiple modified events occuring for a single file operation
if (now_time - self.start_time) < THRESHOLD_TIME:
print('repeated event, not triggering')
return
changed_file = ntpath.basename(event.src_path)
if changed_file == TEST_FILE:
print('changed file: {}'.format(changed_file))
print('event type: {}'.format(event.event_type))
print('do something...')
# print(event)
message = '{} changed'.format(changed_file)
dispatcher.send(message=message, signal=MYHANDLER_SIGNAL, sender=MYHANDLER_SENDER)
self.start_time = now_time
def main():
dispatcher.connect(dispatcher_receive, signal=MYHANDLER_SIGNAL, sender=MYHANDLER_SENDER)
observer = Observer()
observer.schedule(event_handler, path=TEST_DIR, recursive=False)
observer.start()
try:
while True:
time.sleep(1)
except KeyboardInterrupt:
observer.stop()
observer.join()
def dispatcher_receive(message):
print('received dispatch: {}'.format(message))
# read in the altered file
if __name__ == "__main__":
event_handler = MyHandler()
main()
来源:https://stackoverflow.com/questions/47709010/read-and-plot-real-time-live-self-updating-csv-file