tab-delimited

Pandas seems to ignore first column name when reading tab-delimited data, gives KeyError

两盒软妹~` posted on 2019-11-30 18:59:05
I am using pandas 0.12.0 in ipython3 on Ubuntu 13.10, in order to wrangle large tab-delimited datasets in txt files. Using read_table to create a DataFrame from the txt appears to work, and the first row is read as a header, but attempting to access the first column using its name as an index throws a KeyError. I don't understand why this happens, given that the column names all appear to have been read correctly, and every other column can be indexed in this way. The data looks like this: RECORDING_SESSION_LABEL LEFT_GAZE_X LEFT_GAZE_Y RIGHT_GAZE_X RIGHT_GAZE_Y VIDEO_FRAME_INDEX VIDEO_NAME 73
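One way to investigate a KeyError like this is to look at the exact column labels pandas parsed. The sketch below assumes, purely for illustration, that the first header carries stray leading whitespace. The excerpt does not confirm the actual cause, and the sample values are made up.

```python
import io

import pandas as pd

# Hypothetical sample: the first header has an invisible leading space.
raw = " RECORDING_SESSION_LABEL\tLEFT_GAZE_X\tLEFT_GAZE_Y\n73\t1.5\t2.5\n"

df = pd.read_csv(io.StringIO(raw), sep="\t")

# repr() exposes stray characters that a plain print would hide.
print([repr(name) for name in df.columns])

# Normalise the labels so the first column can be accessed by name.
df.columns = df.columns.str.strip()
first_column = df["RECORDING_SESSION_LABEL"]
```

If the printed labels show anything other than the expected names (extra spaces, quotes, or other invisible characters), the name used for indexing has to match that exact string, or the labels have to be cleaned first as above.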

python pandas read_csv not recognizing \t in tab delimited file

烂漫一生 posted on 2019-11-30 14:15:36
I'm trying to read the following tab-separated data into pandas: test.txt: col_a\tcol_b\tcol_c\tcol_d 4\t3\t2\t1 4\t3\t2\t1 I import test.txt as follows: pd.read_csv('test.txt',sep='\t') The resulting dataframe has 1 column. The \t is not recognized as a tab. If I replace \t with a 'keyboard tab' the file is parsed correctly. I also tried replacing '\t with \t and /t and didn't have any luck. Thanks in advance for your help. Omar PS: Screenshot http://imgur.com/a/nXvW3 The \t in your file is an actual backslash followed by a t. It is not a tab. You're going to have to use some escape
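A minimal sketch of one way to parse such a file, assuming the separator really is the two characters backslash and "t". The answer above is cut off, so this may not be the exact fix it goes on to propose. The data is held in memory here instead of being read from test.txt.

```python
import io

import pandas as pd

# The data contains a literal backslash followed by "t", not a tab character.
raw = "col_a\\tcol_b\\tcol_c\\tcol_d\n4\\t3\\t2\\t1\n4\\t3\\t2\\t1\n"

# r"\\t" is a regular expression matching a backslash followed by "t".
# Regex separators require the Python parsing engine.
df = pd.read_csv(io.StringIO(raw), sep=r"\\t", engine="python")

print(df.columns.tolist())
```

An alternative is to rewrite the file once so that it contains real tab characters, after which the plain sep='\t' call from the question applies.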

Pandas seems to ignore first column name when reading tab-delimited data, gives KeyError

我的梦境 posted on 2019-11-30 03:08:47
Question: I am using pandas 0.12.0 in ipython3 on Ubuntu 13.10, in order to wrangle large tab-delimited datasets in txt files. Using read_table to create a DataFrame from the txt appears to work, and the first row is read as a header, but attempting to access the first column using its name as an index throws a KeyError. I don't understand why this happens, given that the column names all appear to have been read correctly, and every other column can be indexed in this way. The data looks like this:

python pandas read_csv not recognizing \t in tab delimited file

北城以北 posted on 2019-11-29 20:34:53
Question: I'm trying to read the following tab-separated data into pandas: test.txt: col_a\tcol_b\tcol_c\tcol_d 4\t3\t2\t1 4\t3\t2\t1 I import test.txt as follows: pd.read_csv('test.txt',sep='\t') The resulting dataframe has 1 column. The \t is not recognized as a tab. If I replace \t with a 'keyboard tab' the file is parsed correctly. I also tried replacing '\t with \t and /t and didn't have any luck. Thanks in advance for your help. Omar PS: Screenshot http://imgur.com/a/nXvW3 Answer 1: The \t in your

Reading tab-delimited file with Pandas - works on Windows, but not on Mac

旧街凉风 posted on 2019-11-28 17:10:21
I've been reading a tab-delimited data file in Windows with Pandas/Python without any problems. The data file contains notes in the first three lines, followed by a header. df = pd.read_csv(myfile,sep='\t',skiprows=(0,1,2),header=(0)) I'm now trying to read this file on my Mac. (My first time using Python on Mac.) I get the following error. pandas.parser.CParserError: Error tokenizing data. C error: Expected 1 fields in line 8, saw 39 If I set the error_bad_lines argument for read_csv to False, I get the following information, which continues until the end of the last row. Skipping line
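When the same file parses on one platform and not on another, one thing worth ruling out is a difference in line terminators. The excerpt does not confirm this as the cause here. The sketch below counts the terminators in raw bytes. The helper name count_line_endings and the sample bytes are made up for illustration.

```python
def count_line_endings(data: bytes) -> dict:
    """Count CRLF, lone CR, and lone LF terminators in raw file bytes."""
    crlf = data.count(b"\r\n")
    return {
        "CRLF": crlf,
        "CR": data.count(b"\r") - crlf,
        "LF": data.count(b"\n") - crlf,
    }


# Made-up sample that uses bare carriage returns as line terminators.
sample = b"note 1\rnote 2\rnote 3\ra\tb\r1\t2\r"
counts = count_line_endings(sample)
print(counts)
```

For a real file, the bytes would come from opening it in binary mode, for example open(myfile, "rb").read(). If lone carriage returns dominate, read_csv accepts a lineterminator argument (C parser only) that can be set to '\r'.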

Reading tab-delimited file with Pandas - works on Windows, but not on Mac

对着背影说爱祢 posted on 2019-11-27 09:49:21
Question: I've been reading a tab-delimited data file in Windows with Pandas/Python without any problems. The data file contains notes in the first three lines, followed by a header. df = pd.read_csv(myfile,sep='\t',skiprows=(0,1,2),header=(0)) I'm now trying to read this file on my Mac. (My first time using Python on Mac.) I get the following error. pandas.parser.CParserError: Error tokenizing data. C error: Expected 1 fields in line 8, saw 39 If I set the error_bad_lines argument for read_csv to

String parsing in Java with delimiter tab “\t” using split

﹥>﹥吖頭↗ posted on 2019-11-26 18:54:46
I'm processing a string which is tab delimited. I'm accomplishing this using the split function, and it works in most situations. The problem occurs when a field is missing, so instead of getting null in that field I get the next value. I'm storing the parsed values in a string array. String[] columnDetail = new String[11]; columnDetail = column.split("\t"); Any help would be appreciated. If possible I'd like to store the parsed strings into a string array so that I can easily access the parsed data. Filip Ekberg: String.split uses regular expressions; also, you don't need to allocate an extra

Sorting a tab delimited file

非 Y 不嫁゛ posted on 2019-11-26 18:11:20
I have data in the following format: foo<tab>1.00<space>1.33<space>2.00<tab>3 I tried to sort the file by the last field in decreasing order. I tried the following commands, but the output was not sorted as expected. $ sort -k3nr file.txt # apparently this sorts using space as the delimiter $ sort -t"\t" -k3nr file.txt sort: multi-character tab `\\t' $ sort -t "`/bin/echo '\t'`" -k3,3nr file.txt sort: multi-character tab `\\t' What's the right way to do it? Here is the sample data. Using bash, this will do the trick: $ sort -t$'\t' -k3 -nr file.txt Notice the dollar sign in front of the single
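For comparison, the same ordering can be sketched in Python: split each line on a real tab character and sort numerically on the third field, largest first. The helper name sort_by_field and the sample rows are made up for illustration. This is an alternative to the sort command above, not a replacement for it.

```python
def sort_by_field(lines, field_index=2):
    """Sort tab-delimited lines by one numeric field, largest first."""
    return sorted(
        lines,
        key=lambda line: float(line.split("\t")[field_index]),
        reverse=True,
    )


# Made-up rows in the format described above: foo<tab>1.00 1.33 2.00<tab>3
rows = [
    "foo\t1.00 1.33 2.00\t3",
    "bar\t0.50 0.75 1.00\t10",
    "baz\t2.00 2.50 3.00\t7",
]
sorted_rows = sort_by_field(rows)
for row in sorted_rows:
    print(row)
```

Because the split is on "\t" only, the space-separated numbers in the second field stay together as a single field, which is the behavior the question was after.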

String parsing in Java with delimiter tab “\t” using split

心已入冬 posted on 2019-11-26 06:39:36
Question: I'm processing a string which is tab delimited. I'm accomplishing this using the split function, and it works in most situations. The problem occurs when a field is missing, so instead of getting null in that field I get the next value. I'm storing the parsed values in a string array. String[] columnDetail = new String[11]; columnDetail = column.split("\t"); Any help would be appreciated. If possible I'd like to store the parsed strings into a string array so that I can easily access

selecting across multiple columns with python pandas?

拜拜、爱过 posted on 2019-11-26 05:27:28
Question: I have a dataframe df in pandas that was built using pandas.read_table from a csv file. The dataframe has several columns and it is indexed by one of the columns (which is unique, in that each row has a unique value for that column used for indexing.) How can I select rows of my dataframe based on a "complex" filter applied to multiple columns? I can easily select out the slice of the dataframe where column colA is greater than 10 for example: df_greater_than10 = df[df["colA"] > 10] But
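A filter over several columns can be sketched by combining boolean masks. The question is cut off before it states the exact condition, so the filter below is only an illustration. Only colA appears in the question. The columns colB and colC, the index labels, and all values are hypothetical.

```python
import pandas as pd

# Made-up data; colB, colC, and the index labels are hypothetical.
df = pd.DataFrame(
    {"colA": [5, 12, 20, 8], "colB": [1, 0, 3, 7], "colC": ["x", "y", "x", "y"]},
    index=["r1", "r2", "r3", "r4"],
)

# Each comparison yields a boolean Series; combine them with & (and), | (or), ~ (not).
# The parentheses are required because & and | bind tighter than comparisons.
mask = (df["colA"] > 10) & ((df["colB"] > 2) | (df["colC"] == "y"))
selected = df[mask]

print(selected)
```

Building the mask as a separate variable keeps long conditions readable and lets the same mask be reused, for instance with df.loc[mask, "colA"] to pull a single column from the matching rows.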