问题
I'm quite stuck with a code I'm writing in Python, I'm a beginner and maybe is really easy, but I just can't see it. Any help would be appreciated. So thank you in advance :)
Here is the problem: I have to read some special data files with an special extension .fen into a pandas DataFrame.This .fen files are inside a zipped file .fenx that contains the .fen file and a .cfg configuration file.
In the code I've written I use zipfile library in order to unzip the files, and then get them in the DataFrame. This code is the following
import zipfile
import numpy as np
import pandas as pd
def readfenxfile(Directory,File):
fenxzip = zipfile.ZipFile(Directory+ '\\' + File, 'r')
fenxzip.extractall()
fenxzip.close()
cfgGeneral,cfgDevice,cfgChannels,cfgDtypes=readCfgFile(Directory,File[:-5]+'.CFG')
#readCfgFile redas the .cfg file and returns some important data.
#Here only the cfgDtypes would be important as it contains the type of data inside the .fen and that will become the column index in the final DataFrame.
if cfgChannels!=None:
dtDtype=eval('np.dtype([' + cfgDtypes + '])')
dt=np.fromfile(Directory+'\\'+File[:-5]+'.fen',dtype=dtDtype)
dt=pd.DataFrame(dt)
else:
dt=[]
return dt,cfgChannels,cfgDtypes
Now, the extract() method saves the unzipped file in the hard drive. The .fenx files can be quite big so this need of storing (and afterwards deleting them) is really slow. I would like to do the same I do now, but getting the .fen and .cfg files into the memory, not the hard drive.
I have tried things like fenxzip.read('whateverthenameofthefileis.fen')
and some other methods like .open()
from the zipfile library. But I can't get what .read()
returns into a numpy array in anyway i tried.
I know it can be a difficult question to answer, because you don't have the files to try and see what happens. But if someone would have any ideas I would be glad of reading them. :) Thank you very much!
回答1:
Here is the solution I finally found in case it can be helpful for anyone. It uses the tempfile library to create a temporal object in memory.
import zipfile
import tempfile
import numpy as np
import pandas as pd
def readfenxfile(Directory,File,ExtractDirectory):
fenxzip = zipfile.ZipFile(Directory+ r'\\' + File, 'r')
fenfile=tempfile.SpooledTemporaryFile(max_size=10000000000,mode='w+b')
fenfile.write(fenxzip.read(File[:-5]+'.fen'))
cfgGeneral,cfgDevice,cfgChannels,cfgDtypes=readCfgFile(fenxzip,File[:-5]+'.CFG')
if cfgChannels!=None:
dtDtype=eval('np.dtype([' + cfgDtypes + '])')
fenfile.seek(0)
dt=np.fromfile(fenfile,dtype=dtDtype)
dt=pd.DataFrame(dt)
else:
dt=[]
fenfile.close()
fenxzip.close()
return dt,cfgChannels,cfgDtypes
来源:https://stackoverflow.com/questions/43258102/python-unziping-special-files-into-memory-and-getting-them-into-a-dataframe