I'm trying to write a function in R that drops columns from a data frame and returns the new data with a name specified as an argument of the function:
drop <
Use the assign() function.
assign("new.data", my.data[,-col], envir = .GlobalEnv)
The first argument should be a string. In this case, the resulting global variable will be named "new.data". If new.data is instead a variable that holds the desired name, drop the quotes in the function call.
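To make the difference between the two cases concrete, here is a minimal sketch. The helper name drop_and_assign, the variable target, and the sample data frame are made up for illustration and are not from the question.

```r
# Hypothetical helper: drops columns by index and stores the result
# in the global environment under the name given as a string.
drop_and_assign <- function(data, cols, new_name) {
  # drop = FALSE keeps a data frame even when only one column remains
  assign(new_name, data[, -cols, drop = FALSE], envir = .GlobalEnv)
  invisible(new_name)
}

my.data <- data.frame(a = 1:3, b = 4:6, c = 7:9)

# Case 1: the name is given as a literal string
drop_and_assign(my.data, 2, "new.data")

# Case 2: the name is held in a variable, so it is passed without quotes
target <- "other.data"
drop_and_assign(my.data, c(1, 2), target)

names(new.data)    # "a" "c"
names(other.data)  # "c"
```

In both calls the third argument evaluates to a character string, which is what assign() expects.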
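<<- does not always assign to the global environment. It modifies the first existing binding it finds in the enclosing environments and only creates a global variable when no such binding exists.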
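A small sketch of that behaviour, using made-up function and variable names (outer_fn, inner_fn, state): the nested function changes a variable in its enclosing function, and nothing is created in the global environment.

```r
outer_fn <- function() {
  state <- "local to outer_fn"
  inner_fn <- function() {
    # <<- looks through the enclosing environments and modifies the first
    # existing binding of `state`, which lives in outer_fn, not globally
    state <<- "changed by inner_fn"
  }
  inner_fn()
  state
}

result <- outer_fn()
result  # "changed by inner_fn"

# FALSE, assuming no global `state` was defined earlier in the session
exists("state", envir = .GlobalEnv, inherits = FALSE)
```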
In general, however, it is better to return things from a function than set global variables from inside a function. The latter is a lot harder to debug.
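As a rough sketch of the return-based approach, assuming a hypothetical helper called drop_cols and a toy data frame: the function only returns the reduced data, and the caller chooses the name at the point of assignment.

```r
# Hypothetical helper: returns a copy of `data` without the given columns
drop_cols <- function(data, cols) {
  # drop = FALSE keeps a data frame even when only one column remains
  data[, -cols, drop = FALSE]
}

my.data <- data.frame(a = 1:3, b = 4:6, c = 7:9)

# The caller picks the name; no global state is touched inside the function
new.data <- drop_cols(my.data, 2)

names(new.data)  # "a" "c"
names(my.data)   # "a" "b" "c" (the original is unchanged)
```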
One reason to need this is when working a great deal in the RStudio console to do lots of text mining. For example, if you have a large corpus and want to break it up into sub-corpora based on themes, wrapping the processing in a function that hands back a cleaned corpus can be much faster. An example is below:
library(tm)  # provides Corpus, VectorSource, tm_map, stopwords, etc.

processText <- function(inputText, corpName){
  # Build a corpus from the text vector, then clean it step by step
  outputName <- Corpus(VectorSource(inputText))
  outputName <- tm_map(outputName, PlainTextDocument)
  outputName <- tm_map(outputName, removeWords, stopwords("english"))
  outputName <- tm_map(outputName, removePunctuation)
  outputName <- tm_map(outputName, removeNumbers)
  outputName <- tm_map(outputName, stripWhitespace)
  # Store the cleaned corpus globally under the name held in corpName
  assign(corpName, outputName, envir = .GlobalEnv)
  return(corpName)
}
In the case above, I enter the column from the data frame as the inputText and the desired output corpus name as corpName. This allows the simple task of the following to process a bunch of text data:
processText(retail$Essay,"retailCorp")
Then the new corpus "retailCorp" shows up in the global environment for further work such as plotting word clouds, etc. Also, I can send lists through the function and get lots of corpora back.
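For the many-inputs case, one alternative is to collect the results in a named list instead of creating one global variable per corpus. The sketch below is only an illustration: clean_text is a made-up base-R stand-in for the tm pipeline (so that it runs without the tm package), and the sample texts are invented.

```r
# Hypothetical stand-in for the tm cleaning steps, using base R only:
# lower-case, strip punctuation and digits, collapse extra whitespace
clean_text <- function(x) {
  x <- tolower(x)
  x <- gsub("[[:punct:]]", "", x)
  x <- gsub("[[:digit:]]", "", x)
  trimws(gsub("\\s+", " ", x))
}

# Invented sample data: one character vector per theme
texts <- list(
  retail = c("Great  store, 10/10!", "Prices were OK."),
  travel = c("Flight 42 was late...")
)

# lapply keeps the list names, so each result is reachable by theme
cleaned <- lapply(texts, clean_text)

cleaned$retail  # "great store" "prices were ok"
cleaned$travel  # "flight was late"
```

The same pattern should carry over to a function that returns a tm corpus rather than a character vector, with the list names playing the role of the corpus names.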