“Got 1 columns instead of …” error in numpy

傲寒 2021-01-17 13:00

I'm working on the following code to perform Random Forest classification on train and test sets:

from sklearn.ensemble import RandomForestClassifier
fr         


        
8 answers
  • 2021-01-17 13:19

    genfromtxt raises this error when the rows of the file do not all have the same number of columns.

    I can think of 3 ways around it:

    1. Use the usecols parameter

    np.genfromtxt('yourfile.txt',delimiter=',',usecols=np.arange(0,1434))
    

    However, this may mean that you lose some data (where rows are longer than 1434 columns); whether or not that matters is up to you.

    2. Adjust your input data file so that it has an equal number of columns.

    3. Use something other than genfromtxt:

    .............like this
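A minimal sketch of options 1 and 3, assuming a small made-up ragged input (`raw` is hypothetical, not the asker's file). The `csv`-based fallback is only one possible alternative to genfromtxt, not necessarily the one the answer had in mind:

```python
import csv
from io import StringIO

import numpy as np

# Hypothetical ragged data: the second row has one extra field.
raw = "1,2,3\n4,5,6,7\n"

# Option 1: keep only the first three columns; the extra field is dropped.
truncated = np.genfromtxt(StringIO(raw), delimiter=",", usecols=np.arange(0, 3))

# Option 3: read the rows with the csv module and pad short rows with NaN,
# so that no data is lost.
rows = list(csv.reader(StringIO(raw)))
width = max(len(row) for row in rows)
padded = np.array(
    [[float(v) for v in row] + [np.nan] * (width - len(row)) for row in rows]
)

print(truncated.shape, padded.shape)
```

Note that usecols only helps with rows that are too long; a row shorter than the requested columns is still reported as invalid.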

  • 2021-01-17 13:19

    It seems that the header row containing the column names has one more column than the data itself (1435 columns in the header vs. 1434 in the data).

    You could either:

    1) Remove the one header column that has no corresponding data

    OR

    2) Use the skip_header parameter of genfromtxt(), for example np.genfromtxt('myfile', skip_header=*how many lines to skip*, delimiter=' '). More information can be found in the documentation.
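As a rough illustration of option 2, assuming a made-up space-delimited input whose header has one more name than the data has columns (`raw` is hypothetical):

```python
from io import StringIO

import numpy as np

# Hypothetical data: four header names, but only three values per row.
raw = "a b c d\n1 2 3\n4 5 6\n"

# Without skip_header, the header sets the expected width and the data rows
# are reported as having the wrong number of columns.
try:
    np.genfromtxt(StringIO(raw), delimiter=" ")
    raised = False
except ValueError:
    raised = True

# Skipping the header line lets the data rows define the width.
data = np.genfromtxt(StringIO(raw), skip_header=1, delimiter=" ")
print(raised, data.shape)
```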

  • 2021-01-17 13:23

    I had this error. The cause was a single entry in my data that contained a space. With whitespace as the delimiter, that space made genfromtxt see an extra field on that line. Make sure the spacing is consistent throughout the data.
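One way to locate such a line is to count the fields on every line before calling genfromtxt. This is a sketch on made-up data (`raw` is hypothetical), and it only approximates how genfromtxt splits lines when no delimiter is given:

```python
from collections import Counter
from io import StringIO

# Hypothetical whitespace-delimited data; line 2 has a stray space ("5 5").
raw = "1 2 3\n4 5 5 6\n7 8 9\n"

# Split each line on runs of whitespace and record the field count.
field_counts = [len(line.split()) for line in StringIO(raw)]

# Treat the most common count as the expected width.
expected = Counter(field_counts).most_common(1)[0][0]

# 1-based line numbers whose field count differs from the expected width.
bad_lines = [n for n, c in enumerate(field_counts, start=1) if c != expected]
print(expected, bad_lines)
```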

  • 2021-01-17 13:29

    In my case, the error arose because a row contained a special symbol.

    Error cause: special characters such as

    • '#' (hash), which genfromtxt treats as the start of a comment by default
    • ',' (comma) inside a field when your delimiter is also ',' (delimiter=',')

    Example csv file

    • 1,hello,#this,fails
    • 1,hello,',this',fails

      -----CODE-----

      import numpy
      data = numpy.genfromtxt(file, delimiter=delimiter)  # Error

    Environment Note:

    OS: Ubuntu

    csv editor: LibreOffice

    IDE: Pycharm
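For the '#' case, one possible workaround is to turn off comment handling with comments=None. The sketch below uses made-up data (`raw` is hypothetical) and reads every field as a string. It does not address the quoted-comma case, since genfromtxt does not interpret quotes:

```python
from io import StringIO

import numpy as np

# Hypothetical CSV where one field starts with '#'.
raw = "1,hello,#this,fails\n2,a,b,c\n"

# Default behaviour: everything after '#' is dropped, so line 1 ends up with
# fewer fields than line 2 and genfromtxt raises.
try:
    np.genfromtxt(StringIO(raw), delimiter=",", dtype=str)
    raised = False
except ValueError:
    raised = True

# With comments=None the '#' is kept as ordinary data.
data = np.genfromtxt(StringIO(raw), delimiter=",", dtype=str, comments=None)
print(raised, data.shape)
```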

  • 2021-01-17 13:32

    An exception is raised if an inconsistency is detected in the number of columns. Several causes and solutions are possible.

    1. Add invalid_raise = False to skip the offending lines.

      dataset = genfromtxt(open('data.csv','r'), delimiter='', invalid_raise = False)

    2. If your data contains names, make sure that no field name contains a space or an invalid character, and that none matches the name of a standard attribute (like size or shape), which would confuse the interpreter. genfromtxt accepts three optional arguments for finer control over names:

    1. deletechars

      Gives a string combining all the characters that must be deleted from the name. By default, invalid characters are ~!@#$%^&*()-=+~\|]}[{';: /?.>,<.

    2. excludelist

      Gives a list of the names to exclude, such as return, file, print… If one of the input names is part of this list, an underscore character ('_') will be appended to it.

    3. case_sensitive

      Whether the names should be case-sensitive (case_sensitive=True), converted to upper case (case_sensitive=False or case_sensitive='upper') or to lower case (case_sensitive='lower').

    data = np.genfromtxt("data.txt", dtype=None, names=True,
                         deletechars=r"~!@#$%^&*()-=+~\|]}[{';: /?.>,<.",
                         case_sensitive=True)
    

    Reference: numpy.genfromtxt
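A small sketch of the invalid_raise=False approach from point 1, on made-up data (`raw` is hypothetical). The offending line is skipped rather than repaired, so check how many rows you lose:

```python
import warnings
from io import StringIO

import numpy as np

# Hypothetical data: the second line is missing a field.
raw = "1,2,3\n4,5\n6,7,8\n"

with warnings.catch_warnings():
    # genfromtxt warns about the lines it skips; silence that for the demo.
    warnings.simplefilter("ignore")
    data = np.genfromtxt(StringIO(raw), delimiter=",", invalid_raise=False)

print(data)
```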

  • 2021-01-17 13:36

    None of the previous answers worked for me, so for future googlers here is another one:

    The error was: "Line #88 (got 1435 columns instead of 1)"

    I discovered that my csv file was a UTF-8 encoded text file with a BOM (a character on the first line of the file that marks the encoding; most text editors hide it).

    I simply opened it in Notepad on Windows, chose "Save As", and selected "ANSI" at the bottom of the save dialog.

    Fixed it for me.
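If re-saving the file is not convenient, another option is to strip the BOM while reading by opening the file with Python's "utf-8-sig" codec. This sketch writes a small temporary file with made-up contents to stand in for the real csv:

```python
import os
import tempfile

import numpy as np

# Write a hypothetical UTF-8 file that starts with a BOM (EF BB BF).
with tempfile.NamedTemporaryFile("wb", suffix=".csv", delete=False) as tmp:
    tmp.write(b"\xef\xbb\xbf1,2,3\n4,5,6\n")
    path = tmp.name

try:
    # "utf-8-sig" decodes UTF-8 and drops a leading BOM if one is present.
    with open(path, encoding="utf-8-sig") as fh:
        data = np.genfromtxt(fh, delimiter=",")
finally:
    os.remove(path)

print(data)
```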
