I know my title is confusing, since the tokenize command operates on a string.
I have many folders that contain a large number of separate, badly named Excel files (most of them scraped from a website). Selecting them manually is inconvenient, so I need to rely on Stata's extended macro function local : dir to read them.
My code looks as follows:
foreach file of local filelist {
    import excel "`file'", clear
    sxpose, clear
    save "`file'.dta", replace
}
Such code generates many new dta files, and the directory fills up with them. I would prefer to create a single new data file from the first xlsx file and then append the others to it inside the foreach loop. So essentially, there is an if-else inside the loop.
We need an index into the macro filelist just created, so that we can write something like:
token `filelist' // filelist is created in the former code
if "`i'" == `1' {
import excel "`file'",clear
}
else {
append using `i',clear
}
I know my code is inefficient and error-prone; the syntax of the expression token `filelist' is incorrect too (given that filelist is not a string). However, I still want to figure out the basic structure behind my pseudocode.
How could I correct my code and make it work?
A more efficient approach would also be very welcome.
Various techniques spring to mind, none of which entails tokenizing.
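// first approach: count the passes through the loop
local count = 1
foreach file of local filelist {
    import excel "`file'", clear
    sxpose, clear
    if `count' == 1 {
        save alldata
    }
    else {
        append using alldata
        // write the combined data back so the next pass appends to all of it
        save alldata, replace
    }
    local ++count
}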
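// second approach: switch commands on and off with local macros
local allothers "*"
foreach file of local filelist {
    import excel "`file'", clear
    sxpose, clear
    `firstonly' save alldata
    `allothers' append using alldata
    // write the combined data back so the next pass appends to all of it
    `allothers' save alldata, replace
    local firstonly "*"
    local allothers
}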
In the second block, the point is that lines beginning with * are treated as comments, so any command that * precedes is ignored ("commented out"). The append statement is commented out the first time round the loop, while the save statement prefixed by `firstonly' is preceded by an undefined local macro, which evaluates to an empty string, so it is not ignored. After the first time round the loop, the commenting out is removed from append and placed on that save instead.
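The commenting-out device can be seen in isolation in a minimal sketch like the following. The macro names skip and run are hypothetical, chosen only for illustration, and the snippet is meant to be run from a do-file.

```stata
// Hypothetical macro names, for illustration only.
local skip "*"

// The macro skip holds *, so after expansion Stata sees a comment line
// and the display command is never executed.
`skip' display "not shown"

// The macro run was never defined, so it expands to an empty string
// and the display command executes as usual.
`run' display "shown"
```

Only the second display command produces output, because an undefined local macro leaves the command untouched.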
I don't think either of these approaches is more efficient than what you have in mind (works faster, uses less memory, is shorter, or whatever "efficient" means for you). The code clearly does presuppose that you have set up the file list correctly.
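For completeness, one way the file list might be set up is sketched below. It assumes that the workbooks sit in the current working directory and carry the .xlsx extension; neither detail is stated in the question, so adjust the directory and the pattern to your own layout.

```stata
// Assumption: the Excel files are in the current working directory
// and match the pattern *.xlsx.
local filelist : dir "." files "*.xlsx"

// The list holds bare file names, each in double quotes;
// foreach treats every quoted name as one element.
foreach file of local filelist {
    display "`file'"
}
```

Because the names come back without a path, either change to the folder first with cd or prefix each name with the directory when importing.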
Source: https://stackoverflow.com/questions/34872508/how-to-tokenize-a-extended-macro-local-dir