genfromtxt | 易学教程

NumPy genfromtxt: using filling_missing correctly

Question: I am attempting to process data saved to CSV that may have missing values in an unknown number of columns (up to around 30). I am attempting to set those missing values to '0' using genfromtxt's filling_missing argument. Here is a minimal working example for numpy 1.6.2 running in ActiveState ActivePython 2.7 32 bit on Win 7. import numpy text = "a,b,c,d\n1,2,3,4\n5,,7,8" a = numpy.genfromtxt('test.txt',delimiter=',',names=True) b = open('test.txt','w') b.write(text) b.close() a = numpy
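A note on the keyword: the NumPy documentation lists the parameter as filling_values; there is no filling_missing argument. The following is a minimal sketch, assuming the same four-column sample text as in the question and reading it from an in-memory buffer (io.StringIO) instead of test.txt.

```python
import io

import numpy as np

# Same sample text as in the question; StringIO stands in for test.txt.
text = "a,b,c,d\n1,2,3,4\n5,,7,8"

# Empty fields are treated as missing by default;
# filling_values supplies the value that replaces them.
filled = np.genfromtxt(
    io.StringIO(text),
    delimiter=",",
    names=True,
    filling_values=0,
)

print(filled["b"])  # the empty field in the second data row is now 0
```

Because names=True is used, the result is a structured array, so columns are addressed by header name rather than by index.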

“Got 1 columns instead of …” error in numpy

Question: I'm working on the following code for performing Random Forest classification on train and test sets: from sklearn.ensemble import RandomForestClassifier from numpy import genfromtxt, savetxt def main(): dataset = genfromtxt(open('filepath','r'), delimiter=' ', dtype='f8') target = [x[0] for x in dataset] train = [x[1:] for x in dataset] test = genfromtxt(open('filepath','r'), delimiter=' ', dtype='f8') rf = RandomForestClassifier(n_estimators=100) rf.fit(train, target) predicted_probs = [
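The excerpt does not show the data file, so the cause in the asker's case is unknown. As a general illustration, genfromtxt raises this kind of error when a row has a different number of fields than the first row. The sketch below uses made-up sample data (an assumption, not the asker's file) to reproduce the error and then shows the invalid_raise=False option, which skips inconsistent rows instead of raising.

```python
import io
import warnings

import numpy as np

# Hypothetical data: the second row has one field, the others have three.
ragged = "1 2 3\n4\n5 6 7\n"

try:
    np.genfromtxt(io.StringIO(ragged), delimiter=" ", dtype="f8")
    raised = False
except ValueError as exc:
    raised = True
    print(exc)  # reports the offending line and its column count

# invalid_raise=False drops the inconsistent rows and emits a warning instead.
with warnings.catch_warnings():
    warnings.simplefilter("ignore")
    kept = np.genfromtxt(
        io.StringIO(ragged), delimiter=" ", dtype="f8", invalid_raise=False
    )

print(kept.shape)
```

If every row is reported as having one column, it is worth checking whether the delimiter passed to genfromtxt matches the one actually used in the file.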

Numpy genfromtxt doesn't seem to work when names=True for Python 3

Question: I am using the Google Colab environment. The file I am using can be found here. It is a CSV file: https://drive.google.com/open?id=1v7Mm6S8BVtou1iIfobY43LRF8MgGdjfU Warning: it has several million rows. This code runs within a minute in a Google Colab Python 3 notebook. I tried this several times with no problem. from numpy import genfromtxt my_data = genfromtxt('DlRefinedRatings.csv', delimiter=',' , dtype=int) print(my_data[0:50]) The code below, on the other hand, runs for several minutes

Importing csv embedding special character with numpy genfromtxt

Question: I have a CSV containing special characters. Some cells are arithmetic operations (like "(10/2)"). I would like to import these cells as strings in numpy by using np.genfromtxt. What I notice is that it actually imports them as UTF-8 (if I understood correctly). For instance, every time I have a division symbol I get this code in the numpy array: \xc3\xb7 How could I import these arithmetic operations as readable strings? Thank you! Answer 1: Looks like the file may have the 'other' divide symbol, the one we learn
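The bytes \xc3\xb7 are the UTF-8 encoding of the division sign ÷ (U+00F7). A minimal sketch of one way to get readable strings, assuming NumPy 1.14 or later (which added the encoding parameter) and a small made-up CSV written to a temporary file:

```python
import os
import tempfile

import numpy as np

# Hypothetical file contents; the second column holds arithmetic expressions.
rows = "id,expr\n1,(10÷2)\n2,(3+4)\n"

with tempfile.NamedTemporaryFile(
    "w", suffix=".csv", encoding="utf-8", delete=False
) as handle:
    handle.write(rows)
    path = handle.name

try:
    # encoding="utf-8" makes genfromtxt decode the file to text,
    # dtype=str keeps every cell as a string.
    cells = np.genfromtxt(
        path, delimiter=",", dtype=str, encoding="utf-8", skip_header=1
    )
finally:
    os.remove(path)

print(cells[0, 1])
print("÷".encode("utf-8"))  # the byte sequence seen in the question
```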

Skip Rows with missing values in genfromtxt

Question: How can I load a CSV file into an array, skipping rows in which at least one cell is empty? My CSV file is large (over 1000 rows and 14 columns): 1;4;3 ;1;3 ;;6 3;4;7 I want to skip rows 2 and 3 because they have missing values (x;1;3) (x;x;6). All the other rows, which have complete information, should be written to a matrix (array): M = np.genfromtxt("file.csv", delimiter=";", dtype=float) Answer 1: It'll probably be easier to read in
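The answer excerpt is cut off, so the following is not necessarily the approach it goes on to describe. One possible sketch: load everything with genfromtxt (empty float fields become nan), then keep only the rows that contain no nan. The sample rows are the four shown in the question, read from an in-memory buffer.

```python
import io

import numpy as np

# The four sample rows from the question; rows 2 and 3 have empty cells.
raw = "1;4;3\n;1;3\n;;6\n3;4;7\n"

# Empty fields are read as nan when dtype is float.
M = np.genfromtxt(io.StringIO(raw), delimiter=";", dtype=float)

# Keep only rows in which no cell is nan.
complete = M[~np.isnan(M).any(axis=1)]

print(complete)
```

This also discards rows that contain a literal nan or an unparseable value, which may or may not be desired.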

-9999 as missing value with numpy.genfromtxt()

Question: Let's say I have a dumb text file with the contents: Year Recon Observed 1505 162.38 23 1506 46.14 -9999 1507 147.49 -9999 Here -9999 is used to denote a missing value (don't ask). So, I should be able to read this into a NumPy array with: import numpy as np x = np.genfromtxt("file.txt", dtype = None, names = True, missing_values = -9999) And have all my little -9999s turn into numpy.nan. But, I get: >>> x array([(1409, 112.38, 23), (1410, 56.14, -9999), (1411, 145.49, -9999)], dtype=[('Year', '
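One point that can be checked without any parsing: with dtype=None the Observed column is inferred as an integer column, and an integer column cannot hold nan. A sketch of a workaround that sidesteps missing_values entirely (a substitute technique, not the one the asker tried): read all columns as float, then replace the sentinel after loading. The text is the sample from the question, read from an in-memory buffer.

```python
import io

import numpy as np

# Sample file contents from the question.
txt = "Year Recon Observed\n1505 162.38 23\n1506 46.14 -9999\n1507 147.49 -9999\n"

# dtype=float so that every column is able to hold nan.
x = np.genfromtxt(io.StringIO(txt), names=True, dtype=float)

# Replace the -9999 sentinel in place, field by field.
for name in x.dtype.names:
    column = x[name]          # a view onto the structured array
    column[column == -9999] = np.nan

print(x["Observed"])
```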

how to load a file with date and time as a datetime object in python?

Question: I need to load this file, which has the date in the first column and HH:MM in the second column. How does this work with numpy.genfromtxt()? Or maybe pandas? My file looks like: 2017-Feb-11 00:00 m 4.87809 1.86737 5.04236 0.27627 1.5995 2017-Feb-11 00:05 m 4.86722 1.86711 5.00023 0.27616 1.5965 2017-Feb-11 00:10 m 4.85641 1.86690 4.95810 0.27604 1.5941 Answer 1: In [32]: df = pd.read_csv(filename, delim_whitespace=True, parse_dates=[0], header=None) In [33]: df[1] = pd.to_timedelta(df[1] + ':00') In [34]: df Out[34]: 0 1 2 3
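For readers who want to stay with NumPy and the standard library, here is a rough sketch that does not use genfromtxt or pandas: split each line by hand, parse the first two fields with datetime.strptime, and collect the numeric columns separately. It assumes the three sample lines from the question and an English locale (the %b directive is locale dependent).

```python
import io
from datetime import datetime

import numpy as np

# The three sample lines from the question.
raw = (
    "2017-Feb-11 00:00 m 4.87809 1.86737 5.04236 0.27627 1.5995\n"
    "2017-Feb-11 00:05 m 4.86722 1.86711 5.00023 0.27616 1.5965\n"
    "2017-Feb-11 00:10 m 4.85641 1.86690 4.95810 0.27604 1.5941\n"
)

stamps, values = [], []
for line in io.StringIO(raw):
    parts = line.split()
    if not parts:
        continue
    # Columns 0 and 1 together form the timestamp; column 2 is the "m" flag.
    stamps.append(datetime.strptime(parts[0] + " " + parts[1], "%Y-%b-%d %H:%M"))
    values.append([float(p) for p in parts[3:]])

stamps = np.array(stamps, dtype="datetime64[m]")  # minute resolution
values = np.array(values)

print(stamps)
print(values.shape)
```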

dtype argument in numpy.genfromtxt

Question: >>> from io import StringIO >>> import numpy as np >>> s = StringIO("1,1.3,abcde") >>> data = np.genfromtxt(s, dtype=[('myint','i8'),('myfloat','f8'), ... ('mystring','S5')], delimiter=",") >>> data array((1, 1.3, 'abcde'), dtype=[('myint', '<i8'), ('myfloat', '<f8'), ('mystring', '|S5')]) My question is related to the dtype argument. I am unable to understand what dtype="i8,f8,|S5" stands for. I can make out that i is an integer, f is a float and S is a string, but what is the 8 in i8? I first
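In these type strings the number is the item size in bytes, so i8 is an 8-byte (64-bit) integer, f8 is an 8-byte float, and S5 is a byte string of length 5. The leading character is the byte order: "<" is little-endian and "|" means byte order does not apply. A short sketch that inspects these properties directly:

```python
import numpy as np

int_type = np.dtype("i8")
float_type = np.dtype("f8")
bytes_type = np.dtype("S5")

# The digit is the size in bytes of one element.
print(int_type.itemsize, float_type.itemsize, bytes_type.itemsize)

# i8 and f8 are the same types as np.int64 and np.float64.
print(int_type == np.int64, float_type == np.float64)

# A comma-separated string builds a structured dtype with default
# field names f0, f1, f2.
record = np.dtype("i8,f8,S5")
print(record.names, record.itemsize)
```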

Filling missing values using numpy.genfromtxt

Question: Despite the advice from the previous questions "-9999 as missing value with numpy.genfromtxt()" and "Using genfromtxt to import csv data with missing values in numpy", I am still unable to process a text file that ends with a missing value, a.txt: 1 2 3 4 5 6 7 8 I've tried multiple arrangements of the missing_values and filling_values options and cannot get this to work: import numpy as np sol = np.genfromtxt("a.txt", dtype=float, invalid_raise=False, missing_values=None, usemask=True, filling_values
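The line layout of a.txt is lost in this excerpt. The sketch below assumes, purely for illustration, three values per row with the last row one value short. With whitespace as the separator, a missing trailing value simply makes the row shorter, so one workaround (a substitute for genfromtxt, not the asker's method) is to split the lines by hand and pad short rows with nan.

```python
import io

import numpy as np

# Assumed layout of a.txt: three columns, last row missing its final value.
raw = "1 2 3\n4 5 6\n7 8\n"

rows = [line.split() for line in io.StringIO(raw) if line.strip()]
width = max(len(row) for row in rows)

# Convert to float and pad every short row with nan up to the widest row.
sol = np.array(
    [[float(v) for v in row] + [np.nan] * (width - len(row)) for row in rows]
)

print(sol)
```

This only handles values missing at the end of a row; a gap in the middle of a whitespace-separated row cannot be located without a fixed-width or explicit-delimiter format.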

numpy genfromtxt issues in Python3

Question: I'm trying to use genfromtxt with Python 3 to read a simple CSV file containing strings and numbers. For example, something like (hereinafter "test.csv"): 1,a 2,b 3,c With Python 2, the following works well: import numpy data=numpy.genfromtxt("test.csv", delimiter=",", dtype=None) # Now data is something like [(1, 'a') (2, 'b') (3, 'c')] In Python 3 the same code returns [(1, b'a') (2, b'b') (3, b'c')]. This is somehow expected due to the different way Python 3 reads the files. Therefore I use a
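The excerpt stops before the asker's workaround, so the following is a separate sketch. Assuming NumPy 1.14 or later, passing an explicit encoding to genfromtxt makes string columns come back as str instead of bytes. The sample rows are those of test.csv from the question, read from an in-memory buffer.

```python
import io

import numpy as np

# Contents of test.csv from the question.
csv_text = "1,a\n2,b\n3,c\n"

# dtype=None infers a type per column; encoding="utf-8" yields str, not bytes.
data = np.genfromtxt(
    io.StringIO(csv_text), delimiter=",", dtype=None, encoding="utf-8"
)

print(data)
print(data.dtype)  # structured dtype with default field names f0 and f1
```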