Organizing Messy Notepad data

Question

I have some data in Notepad that is a mess. There is basically no space between the columns, each of which holds different data. I do know the character positions of each field: for example, columns 1-2 are X, columns 7-10 are Y, and so on.

How can I organize this? Can it be done in R? What is the best way to do this?

Answer 1:

read.fwf (see ?read.fwf) is a good bet here, since it reads files whose fields sit at fixed character widths.

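Set the path to the file. Use forward slashes or doubled backslashes, because a single backslash starts an escape sequence inside an R string:

temp <- "/pathto/file.txt"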

Then set the widths of the variables within the file, as demonstrated below.

# columns 1-2 = X, columns 3-10 = Y
widths <- c(2, 8)

Then set the names of the columns.

cols <- c("X", "Y")

Finally, import the data into a new variable in your session:

dataset <- read.fwf(temp, widths = widths, header = FALSE, col.names = cols)
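The question mentions a gap between the fields (columns 1-2 are X, columns 7-10 are Y). A minimal sketch of how that layout could be read, assuming columns 3-6 are filler to be discarded: read.fwf treats a negative width as "skip this many characters". The sample lines and the temporary file are made up for illustration.

```r
# Hypothetical sample: columns 1-2 hold X, 3-6 are filler, 7-10 hold Y
sample_file <- tempfile(fileext = ".txt")
writeLines(c("12abcd3456", "34efgh7890"), sample_file)

# A negative width tells read.fwf to skip that many characters
dataset <- read.fwf(sample_file, widths = c(2, -4, 4),
                    header = FALSE, col.names = c("X", "Y"))
print(dataset)
```

Only the X and Y columns end up in the data frame; the four skipped characters are dropped.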

Answer 2:

Something I've done in the past to handle that kind of mess is to import it into Excel as fixed-width text, then save it as a CSV.

Just a suggestion. If it's a one-off project, that should be fine, with no coding at all. But if it's a repeat offender, you might look at regular expressions.

For example, ^(.{6})(.{7})(.{2})(.{5})$ captures 4 fields of 6, 7, 2 and 5 characters, in that order.
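Since the question asks about R, here is a rough sketch of applying that pattern there with base R's regexec and regmatches. The input lines and the column names A to D are hypothetical, and the sketch assumes every line matches the pattern exactly.

```r
# Hypothetical lines with four fields of widths 6, 7, 2 and 5
raw_lines <- c("ABCDEF1234567XY98765", "GHIJKL7654321ZW12345")
pattern <- "^(.{6})(.{7})(.{2})(.{5})$"

# regexec + regmatches return the full match followed by each capture group
matches <- regmatches(raw_lines, regexec(pattern, raw_lines))

# Drop the full match (first element) and bind the groups into rows
fields <- do.call(rbind, lapply(matches, function(m) m[-1]))
parsed <- as.data.frame(fields, stringsAsFactors = FALSE)
names(parsed) <- c("A", "B", "C", "D")
print(parsed)
```

All columns come back as character strings, so numeric fields would still need converting, for instance with as.numeric.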

Source: https://stackoverflow.com/questions/11571148/organizing-messy-notepad-data

Tags

sorting

text-files

notepad
