Is there a way to send data to train a model in Vowpal Wabbit without writing it to disk?
Here's what I'm trying to do. I have a relatively large dataset in csv (aroun
Vowpal Wabbit supports reading data from standard input (cat train.dat | vw), so you can open a pipe directly from R.
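As a rough illustration of the pipe idea, here is a minimal Python sketch (the question is about R, where `pipe()` plays the same role). The helper name `stream_to_process` is made up for this example. It writes example lines to a child process's standard input one at a time, so nothing is written to disk.

```python
import subprocess


def stream_to_process(cmd, examples):
    """Write each example line to the stdin of `cmd`; return its exit code.

    `cmd` is an argument list, e.g. a vw invocation (assumed to be on PATH).
    """
    proc = subprocess.Popen(cmd, stdin=subprocess.PIPE, text=True)
    try:
        for line in examples:
            # vw expects exactly one example per line
            proc.stdin.write(line.rstrip("\n") + "\n")
    finally:
        # Closing stdin signals end of input to the child process
        proc.stdin.close()
    return proc.wait()
```

With VW installed, a call such as `stream_to_process(["vw", "-f", "model.vw"], examples)` should train on the examples and save the result to `model.vw` (the filename is a placeholder).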
Daemon mode supports training. If you need incremental/continuous learning, you can use a trick: send a dummy example whose tag starts with the string "save", which makes VW save the current model. Optionally you can specify the model filename as well:
1 save_filename|
Yet another option is to use VW as a library; see an example.
Note that VW supports various feature engineering using feature namespaces.
What you may be looking for is running vw in daemon mode.
The standard way to do this is to run vw as a daemon:
vw -i some.model --daemon --quiet --port 26542 -p /dev/stdout
You may replace 26542 with the port of your choice.
Now you can connect over TCP to the server (which can be localhost, on port 26542), and every request you write to the TCP socket will be answered on the same socket.
You can both learn (send labeled examples, which will change the model in real time) and send queries and read back the responses.
You can do it either one query+prediction at a time or many at a time. All you need is a newline char at the end of each query, exactly as you would test from a file. Order is guaranteed to be preserved.
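A minimal client sketch for the one-query-at-a-time case, in Python. The function name `query_daemon` is hypothetical. It assumes the daemon was started with `-p /dev/stdout` as above, so that each newline-terminated example gets exactly one response line back on the same socket.

```python
import socket


def query_daemon(lines, host="localhost", port=26542):
    """Send example lines to a vw daemon; return one response line per example."""
    responses = []
    with socket.create_connection((host, port)) as sock:
        with sock.makefile("r", encoding="utf-8") as reader, \
             sock.makefile("w", encoding="utf-8") as writer:
            for line in lines:
                # Each request must end with a newline, as in a data file
                writer.write(line.rstrip("\n") + "\n")
                writer.flush()
                # Assumption: the daemon answers each example with one line
                responses.append(reader.readline().rstrip("\n"))
    return responses
```

Writing one line and then reading one line keeps the sketch simple. For many examples at a time you would probably want to send a batch first and then read the same number of lines, relying on the order being preserved.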
You can also intermix requests to learn from with requests that are intended only for prediction and should not update the in-memory model. The trick is to use a weight of zero for examples you don't want the model to learn from.
This example will update the model because it has a weight of 1:
label 1 'tag1| input_features...
And this one won't update the model because it has a weight of 0:
label 0 'tag2| input_features...
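If you build these lines programmatically, a small helper along these lines may be handy. It is only a sketch: `format_example` is a made-up name, and it assumes the `label weight 'tag| features` layout shown above.

```python
def format_example(label, features, tag=None, learn=True):
    """Build one vw input line; learn=False sets the weight to 0 (predict only)."""
    weight = 1 if learn else 0
    prefix = f"{label} {weight}"
    if tag:
        # A tag is prefixed with a single quote and sits right before the bar
        prefix += f" '{tag}|"
    else:
        # Keep a space before the bar so the weight is not read as a tag
        prefix += " |"
    return prefix + " " + " ".join(features)
```

For example, `format_example(1, ["a", "b"], tag="tag1")` gives `1 1 'tag1| a b`, and passing `learn=False` gives the same line with a weight of 0.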
The official Vowpal Wabbit wiki has a bit more on this: "How to run vowpal wabbit as a daemon". Note, however, that in its main example a model is pre-learned and loaded into memory.
I am also using R to transform data and output it to Vowpal Wabbit. There is an RVowpalWabbit package on CRAN which can be used to connect R with Vowpal Wabbit. However, it is only available on Linux.
Also, to speed things up, I use the fread function of the data.table package. Transformations on a data.table are also quicker than on a data.frame, but one needs to learn a different syntax.