I have some CSV data that has an empty column at the end of each row. I would like to leave it out of the import or, alternatively, delete it after import. My CSV data's have a v
Another way to delete the last column of a DataFrame df:
df = df.iloc[:, :-1]
As with all index-based operations in Python, you can use -1 to count from the end:
df.drop(df.columns[-1], axis=1, inplace=True)
Here's a one-liner that does not require specifying the column name:
df.drop(df.columns[len(df.columns)-1], axis=1, inplace=True)
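The variants above can be compared on a small frame. This is only a sketch: the column names and values are made up for illustration. It shows that the iloc slice returns a new DataFrame, while drop with inplace=True changes df itself and returns None.

```python
import pandas as pd

# Made-up example data; "empty" stands in for the unwanted last column.
df = pd.DataFrame({"a": [1, 2], "b": [3, 4], "empty": [None, None]})

# Positional slice: returns a new DataFrame and leaves df untouched.
trimmed = df.iloc[:, :-1]

# Both expressions pick the label of the last column.
last = df.columns[-1]
same_label = last == df.columns[len(df.columns) - 1]

# Label-based drop with inplace=True modifies df and returns None.
result = df.drop(last, axis=1, inplace=True)
```

If you want to keep the original frame around, the iloc form (or drop without inplace) is the safer choice.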
Just to complete the accepted answer: be careful if your DataFrame has only two columns and both have the same name. Dropping by label removes every column with that label, so you first need to rename one column and then drop the one you do not want.
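A small sketch of the duplicate-name pitfall, assuming a made-up two-column frame where both columns are called "x" (the name "x_dup" is also just a placeholder):

```python
import pandas as pd

# Two columns that share the same label.
df = pd.DataFrame([[1, 2], [3, 4]], columns=["x", "x"])

# Dropping by label removes every column carrying that label.
by_label = df.drop(df.columns[-1], axis=1)

# Slicing by position removes only the last column.
by_position = df.iloc[:, :-1]

# Renaming first makes the label unique, so drop hits one column only.
renamed = df.copy()
renamed.columns = ["x", "x_dup"]
renamed = renamed.drop("x_dup", axis=1)
```

Here by_label ends up with no columns at all, while by_position and renamed each keep the first "x" column.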
You can specify which columns to import using the usecols parameter of read_csv.
So either create a list of column names or integer values:
cols_to_use = ['col1', 'col2'] # or [0,1,2,3]
df = pd.read_csv('mycsv.csv', usecols=cols_to_use)
Alternatively, drop the column after importing. I prefer the former method (why import data you are not interested in?):
df = df.drop(labels='column_to_delete', axis=1) # axis=1 drops columns; axis=0 drops rows whose index value matches labels
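To try both options on the trailing-comma case from the question, here is a minimal sketch. The CSV text is invented for illustration and is read from memory through io.StringIO rather than from a file.

```python
import io
import pandas as pd

# Made-up data; every line ends with a comma, which yields an empty last column.
csv_text = "a,b,\n1,2,\n3,4,\n"

# Plain import: the empty trailing field becomes an extra, unnamed column.
raw = pd.read_csv(io.StringIO(csv_text))

# Option 1: import only the first two columns by position.
by_usecols = pd.read_csv(io.StringIO(csv_text), usecols=[0, 1])

# Option 2: import everything, then drop the last column by its label.
by_drop = raw.drop(labels=raw.columns[-1], axis=1)
```

Both options should leave you with just the columns a and b.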
Note also that you misunderstand what tail does: it returns the last n rows (default is 5) of a DataFrame, not the last columns.
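A quick illustration of tail, using a made-up ten-row frame:

```python
import pandas as pd

# Ten rows, one column, values 0..9 (illustrative data only).
df = pd.DataFrame({"n": range(10)})

last_five = df.tail()   # no argument: the last 5 rows
last_two = df.tail(2)   # the last 2 rows
```

In both cases every column is kept; only the rows are cut down.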
Additional
If the columns are of varying length, you can read just the header to get the column names and then read the CSV again properly, leaving out the last column:
import pandas as pd

def df_from_csv(path):
    df = pd.read_csv(path, nrows=1)  # read the header and one row to get the columns
    columns = df.columns.tolist()  # get the column names as a list
    cols_to_use = columns[:-1]  # leave out the last one
    df = pd.read_csv(path, usecols=cols_to_use)  # read again with only the wanted columns
    return df
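To try the two-pass idea end to end, here is a self-contained sketch that writes a small temporary file first. The helper name read_without_last_column, the file name and the data are all hypothetical, and it uses nrows=0 to read only the header, which is a slight variation on the function above.

```python
import os
import tempfile
import pandas as pd

def read_without_last_column(path):
    # First pass: nrows=0 parses only the header line.
    header = pd.read_csv(path, nrows=0)
    cols_to_use = header.columns[:-1].tolist()
    # Second pass: load the data for every column except the last.
    return pd.read_csv(path, usecols=cols_to_use)

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "example.csv")
    with open(path, "w") as fh:
        fh.write("id,name,\n1,ann,\n2,bob,\n")  # made-up data with trailing commas
    df = read_without_last_column(path)
```

Reading the file twice costs a little extra I/O, but the first pass touches only the header line.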
After importing the data, you could drop the last column, whatever it is, with:
employment = employment.drop(columns=employment.columns[-1])