genfromtxt

NumPy genfromtxt: using filling_missing correctly

谁说胖子不能爱 submitted on 2020-01-02 02:12:06
Question: I am trying to process data saved to CSV that may have missing values in an unknown number of columns (up to around 30). I want to set those missing values to '0' using genfromtxt's filling_missing argument. Here is a minimal working example for NumPy 1.6.2 running in ActiveState ActivePython 2.7 (32-bit) on Windows 7:
    import numpy
    text = "a,b,c,d\n1,2,3,4\n5,,7,8"
    a = numpy.genfromtxt('test.txt',delimiter=',',names=True)
    b = open('test.txt','w')
    b.write(text)
    b.close()
    a = numpy
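For the question above, here is a minimal sketch of zero-filling, assuming a recent NumPy rather than the 1.6.2 release mentioned in the excerpt. The documented genfromtxt parameter is `filling_values`; `filling_missing` is not an argument name. `io.StringIO` stands in for test.txt so the example is self-contained.

```python
import io
import numpy as np

# Same sample data as in the question; the second data row has an empty field.
text = "a,b,c,d\n1,2,3,4\n5,,7,8"

# Empty fields are treated as missing; filling_values=0 replaces them with 0.
arr = np.genfromtxt(io.StringIO(text), delimiter=",", names=True, filling_values=0)

# With names=True the result is a structured array, so columns are read by name.
print(arr["b"])
```

Because the default dtype is float, the filled entry is stored as 0.0 rather than the integer 0.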

“Got 1 columns instead of …” error in numpy

心不动则不痛 submitted on 2019-12-30 17:27:10
Question: I'm working on the following code for performing Random Forest classification on train and test sets:
    from sklearn.ensemble import RandomForestClassifier
    from numpy import genfromtxt, savetxt
    def main():
        dataset = genfromtxt(open('filepath','r'), delimiter=' ', dtype='f8')
        target = [x[0] for x in dataset]
        train = [x[1:] for x in dataset]
        test = genfromtxt(open('filepath','r'), delimiter=' ', dtype='f8')
        rf = RandomForestClassifier(n_estimators=100)
        rf.fit(train, target)
        predicted_probs = [
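A "got N columns instead of M" error from genfromtxt means that some line splits into a different number of fields than the others under the chosen delimiter. The sketch below reproduces this with a hypothetical three-line sample (not the asker's data) and uses a small helper, `column_counts`, introduced here only for illustration, to locate the odd row.

```python
import io
import numpy as np

# Hypothetical sample: the second row is one field short.
sample = "1 2 3\n4 5\n6 7 8\n"

def column_counts(text, delimiter=" "):
    """Return the number of fields on each non-empty line."""
    return [len(line.split(delimiter)) for line in text.splitlines() if line.strip()]

counts = column_counts(sample)
print(counts)  # a row whose count differs from the rest is the culprit

# genfromtxt raises ValueError by default when rows disagree on column count.
try:
    np.genfromtxt(io.StringIO(sample), delimiter=" ", dtype="f8")
    failed = False
except ValueError as exc:
    failed = True
    print(exc)
```

If every line reports a single field, the delimiter passed to genfromtxt probably does not match the one used in the file.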

Numpy genfromtxt doesn't seem to work when names=True for Python 3

余生长醉 submitted on 2019-12-24 21:56:21
Question: I am using the Google Colab environment. The file I am using is a CSV file, available at https://drive.google.com/open?id=1v7Mm6S8BVtou1iIfobY43LRF8MgGdjfU Warning: it has several million rows. This code runs within a minute in a Google Colab Python 3 notebook; I tried it several times with no problem:
    from numpy import genfromtxt
    my_data = genfromtxt('DlRefinedRatings.csv', delimiter=',' , dtype=int)
    print(my_data[0:50])
The code below, on the other hand, runs for several minutes

Importing csv embedding special character with numpy genfromtxt

北城余情 submitted on 2019-12-24 10:25:44
Question: I have a CSV containing special characters. Some cells are arithmetic operations (like "(10/2)"). I would like to import these cells as strings in NumPy using np.genfromtxt. What I notice is that it actually imports them as UTF-8 (if I understood correctly). For instance, every time I have a division symbol I get this code in the NumPy array: \xc3\xb7 How can I import these arithmetic operations as readable strings? Thank you!
Answer 1: Looks like the file may have the 'other' divide symbol, the one we learn
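The bytes \xc3\xb7 are the UTF-8 encoding of the division sign (U+00F7). A minimal sketch of reading such cells as text, assuming Python 3 and a NumPy version that supports the `encoding` argument of genfromtxt; the two-row file written here is hypothetical.

```python
import os
import tempfile
import numpy as np

# The two bytes from the question decode to the division sign U+00F7.
division_sign = b"\xc3\xb7".decode("utf-8")

# Hypothetical two-row file containing arithmetic expressions.
with tempfile.NamedTemporaryFile("w", suffix=".csv", encoding="utf-8",
                                 delete=False) as fh:
    fh.write("1,(10\u00f72)\n2,(3+4)\n")
    path = fh.name

try:
    # dtype=str keeps every cell as text; encoding tells NumPy how to decode the file.
    cells = np.genfromtxt(path, delimiter=",", dtype=str, encoding="utf-8")
finally:
    os.remove(path)

print(cells)
```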

Skip Rows with missing values in genfromtxt

為{幸葍}努か submitted on 2019-12-24 01:38:48
Question: How can I load a CSV file into an array, skipping rows when at least one cell is empty? My CSV file is large (over 1000 rows and 14 columns):
    1;4;3
    ;1;3
    ;;6
    3;4;7
I want to skip rows 2 and 3 because they have missing values (x;1;3) (x;x;6). All the other rows, which are complete, should be written to a matrix (array):
    M = np.genfromtxt("file.csv", delimiter=";",dtype=float)
Answer 1: It'll probably be easier to read in
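One possible approach, sketched here on the four sample rows from the question: let genfromtxt turn empty cells into NaN, then keep only the rows that contain no NaN. This is an illustration, not necessarily what the truncated answer goes on to suggest.

```python
import io
import numpy as np

# The four sample rows from the question.
sample = "1;4;3\n;1;3\n;;6\n3;4;7\n"

# With a float dtype, empty fields are read as NaN.
M = np.genfromtxt(io.StringIO(sample), delimiter=";", dtype=float)

# Keep only the rows in which no cell is NaN.
complete = M[~np.isnan(M).any(axis=1)]
print(complete)
```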

-9999 as missing value with numpy.genfromtxt()

送分小仙女□ submitted on 2019-12-24 01:17:03
Question: Let's say I have a dumb text file with the contents:
    Year Recon Observed
    1505 162.38 23
    1506 46.14 -9999
    1507 147.49 -9999
-9999 is used to denote a missing value (don't ask). So, I should be able to read this into a NumPy array with:
    import numpy as np
    x = np.genfromtxt("file.txt", dtype = None, names = True, missing_values = -9999)
and have all my little -9999s turn into numpy.nan. But I get:
    >>> x
    array([(1409, 112.38, 23), (1410, 56.14, -9999), (1411, 145.49, -9999)], dtype=[('Year', '
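One thing to keep in mind is that an integer column cannot hold NaN, so the data has to be read as float before a sentinel can become numpy.nan. The sketch below sidesteps `missing_values` and replaces the sentinel after loading; it is an alternative technique, shown on the file contents quoted in the question, with the header row skipped instead of used for names.

```python
import io
import numpy as np

# File contents as quoted in the question.
sample = (
    "Year Recon Observed\n"
    "1505 162.38 23\n"
    "1506 46.14 -9999\n"
    "1507 147.49 -9999\n"
)

# Read everything as float so that NaN can be stored in any column.
data = np.genfromtxt(io.StringIO(sample), skip_header=1, dtype=float)

# Replace the -9999 sentinel with NaN after loading.
data[data == -9999] = np.nan
print(data)
```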

how to load a file with date and time as a datetime object in python?

大憨熊 submitted on 2019-12-23 02:54:41
Question: I need to load this file, which has the date in the first column and HH:MM in the second column. How does this work with numpy.genfromtxt()? Maybe pandas? My file looks like:
    2017-Feb-11 00:00 m 4.87809 1.86737 5.04236 0.27627 1.5995
    2017-Feb-11 00:05 m 4.86722 1.86711 5.00023 0.27616 1.5965
    2017-Feb-11 00:10 m 4.85641 1.86690 4.95810 0.27604 1.5941
Answer 1:
    In [32]: df = pd.read_csv(filename, delim_whitespace=True, parse_dates=[0], header=None)
    In [33]: df[1] = pd.to_timedelta(df[1] + ':00')
    In [34]: df
    Out[34]: 0 1 2 3
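For comparison with the pandas answer, here is a sketch that uses only the standard library's `datetime.strptime` plus NumPy, applied to the three sample lines from the question. It assumes an English locale, since `%b` matches locale-dependent month abbreviations such as "Feb".

```python
from datetime import datetime
import numpy as np

# The three sample lines from the question.
sample = (
    "2017-Feb-11 00:00 m 4.87809 1.86737 5.04236 0.27627 1.5995\n"
    "2017-Feb-11 00:05 m 4.86722 1.86711 5.00023 0.27616 1.5965\n"
    "2017-Feb-11 00:10 m 4.85641 1.86690 4.95810 0.27604 1.5941\n"
)

stamps, values = [], []
for line in sample.splitlines():
    parts = line.split()
    # Columns 0 and 1 together form the timestamp; column 2 is the literal "m".
    stamps.append(datetime.strptime(parts[0] + " " + parts[1], "%Y-%b-%d %H:%M"))
    values.append([float(p) for p in parts[3:]])

values = np.array(values)
print(stamps[0], values.shape)
```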

dtype argument in numpy.genfromtxt

|▌冷眼眸甩不掉的悲伤 submitted on 2019-12-23 02:40:00
Question:
    >>> from io import StringIO
    >>> import numpy as np
    >>> s = StringIO("1,1.3,abcde")
    >>> data = np.genfromtxt(s, dtype=[('myint','i8'),('myfloat','f8'),
    ... ('mystring','S5')], delimiter=",")
    >>> data
    array((1, 1.3, 'abcde'), dtype=[('myint', '<i8'), ('myfloat', '<f8'), ('mystring', '|S5')])
My question is related to the dtype argument. I am unable to understand what dtype="i8,f8,|S5" stands for. I can make out that i is an integer, f is a float and S is a string, but what is the 8 in i8? I first
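In these type strings the number is the size of one element in bytes: `i8` is an 8-byte (64-bit) integer, `f8` an 8-byte float, and `S5` a byte string of length 5. A short sketch that checks this through the `itemsize` attribute:

```python
import numpy as np

int_type = np.dtype("i8")    # 8-byte signed integer
float_type = np.dtype("f8")  # 8-byte floating point number
bytes_type = np.dtype("S5")  # byte string of exactly 5 bytes

# itemsize reports the number of bytes used by one element.
sizes = (int_type.itemsize, float_type.itemsize, bytes_type.itemsize)

# The short codes are aliases for the named 64-bit types.
same_as_named = (int_type == np.int64, float_type == np.float64)
print(sizes, same_as_named)
```

The `<` and `|` prefixes in the printed dtype describe byte order: `<` means little-endian and `|` means byte order does not apply.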

Filling missing values using numpy.genfromtxt

杀马特。学长 韩版系。学妹 submitted on 2019-12-21 02:53:28
Question: Despite the advice from these previous questions: "-9999 as missing value with numpy.genfromtxt()" and "Using genfromtxt to import csv data with missing values in numpy", I am still unable to process a text file that ends with a missing value. a.txt: 1 2 3 4 5 6 7 8 I've tried multiple combinations of the missing_values and filling_values options and cannot get this to work:
    import numpy as np
    sol = np.genfromtxt("a.txt", dtype=float, invalid_raise=False, missing_values=None, usemask=True, filling_values
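When whitespace is the delimiter, a missing value at the end of a row leaves no empty field behind, so the row simply looks one column short. A minimal sketch of one workaround is to pad short rows with NaN before building the array. The three-column layout used here is an assumption, because the contents of a.txt are flattened in the excerpt.

```python
import numpy as np

# Assumed layout of a.txt: three columns, with the last row one value short.
sample = "1 2 3\n4 5 6\n7 8\n"

rows = [line.split() for line in sample.splitlines() if line.strip()]
width = max(len(r) for r in rows)

# Convert each row to floats and pad the short ones with NaN on the right.
padded = [[float(v) for v in r] + [np.nan] * (width - len(r)) for r in rows]
sol = np.array(padded)
print(sol)
```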

numpy genfromtxt issues in Python3

那年仲夏 submitted on 2019-12-18 05:15:09
Question: I'm trying to use genfromtxt with Python 3 to read a simple CSV file containing strings and numbers. For example, something like (hereinafter "test.csv"):
    1,a
    2,b
    3,c
With Python 2, the following works well:
    import numpy
    data=numpy.genfromtxt("test.csv", delimiter=",", dtype=None)
    # Now data is something like [(1, 'a') (2, 'b') (3, 'c')]
In Python 3 the same code returns [(1, b'a') (2, b'b') (3, b'c')]. This is somewhat expected due to the different way Python 3 reads files. Therefore I use a
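On NumPy versions that support the `encoding` argument of genfromtxt, passing it explicitly yields text strings instead of bytes. A minimal sketch on the three rows from the question, with `io.StringIO` standing in for test.csv:

```python
import io
import numpy as np

# The three rows of test.csv from the question.
sample = "1,a\n2,b\n3,c\n"

# dtype=None lets NumPy infer a type per column; encoding makes strings come back as str.
data = np.genfromtxt(io.StringIO(sample), delimiter=",", dtype=None, encoding="utf-8")

# Without names, the fields get the default names f0 and f1.
print(data["f0"], data["f1"])
```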