Organizing Messy Notepad data

Posted by 我只是一个虾纸丫 on 2019-12-13 04:26:36

Question


I have some data in Notepad that is a mess: there are no spaces or other delimiters between the columns, which hold different data. I do know the character positions of each field. For example, columns 1-2 are X, columns 7-10 are Y, and so on.

How can I organize this? Can it be done in R? What is the best way to do this?


Answer 1:


read.fwf (see ?read.fwf) may be a good bet here, since it reads fixed-width files.

Set the path to the file:

# Backslashes must be doubled inside R strings (forward slashes also work)
temp <- "\\pathto\\file.txt"

Then set the widths of the variables within the file, as demonstrated below.

# Columns 1-2 = X, columns 3-10 = Y
widths <- c(2, 8)

Then set the names of the columns.

cols <- c("X", "Y")

Finally, import the data into a new variable in your session:

dataset <- read.fwf(temp, widths = widths, header = FALSE, col.names = cols)
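The question mentions a gap between the fields (columns 1-2 are X, columns 7-10 are Y). A minimal sketch of how that could be handled, assuming made-up sample records written to a temporary file: a negative entry in widths tells read.fwf to skip that many characters.

```r
# Hypothetical sample records: cols 1-2 = X, cols 3-6 = unwanted, cols 7-10 = Y
sample_lines <- c(paste0("12", "abcd", "3456"),
                  paste0("34", "efgh", "7890"))

# Write them to a temporary file so the example is self-contained
tmp <- tempfile(fileext = ".txt")
writeLines(sample_lines, tmp)

# Read 2 characters, skip 4 (negative width), then read 4 characters
dataset <- read.fwf(tmp, widths = c(2, -4, 4), header = FALSE,
                    col.names = c("X", "Y"))
print(dataset)
```

Only the kept fields need a name in col.names; skipped fields never appear in the result.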



Answer 2:


Something I've done in the past to handle that kind of mess is to import it into Excel as fixed-width text, then save it as a CSV.

Just a suggestion. If it's a one-off project, that should be fine, with no coding at all. But if it's a repeat offender, you might look at regular expressions.

For example, ^(.{6})(.{7})(.{2})(.{5})$ captures 4 fields of 6, 7, 2 and 5 characters in width, in that order.
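Since the question asks about R, here is a rough sketch of how that pattern might be applied there using base R's regexec and regmatches. The sample records and the column names A to D are invented for illustration, and the sketch assumes every line matches the pattern (a line of the wrong length would yield an empty match and needs separate handling).

```r
# Hypothetical records with fields of width 6, 7, 2 and 5 (20 characters each)
sample_lines <- c(paste0("ABCDEF", "GHIJKLM", "NO", "PQRST"),
                  paste0("000001", "Widget ", "NY", "  120"))

pattern <- "^(.{6})(.{7})(.{2})(.{5})$"

# For each line, regmatches() returns the full match followed by the 4 groups
matches <- regmatches(sample_lines, regexec(pattern, sample_lines))

# Drop the full match (first element) and stack the groups row by row
fields <- do.call(rbind, lapply(matches, function(m) m[-1]))
dataset <- as.data.frame(fields, stringsAsFactors = FALSE)
names(dataset) <- c("A", "B", "C", "D")

# Fixed-width fields are usually space-padded, so trim each column
dataset[] <- lapply(dataset, trimws)
print(dataset)
```

All columns come back as character; convert with as.numeric or similar where a field holds numbers.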



Source: https://stackoverflow.com/questions/11571148/organizing-messy-notepad-data
