Interpreting basic output from Vowpal Wabbit

Posted by 亡梦爱人 on 2019-12-10 13:55:13

Question


I have a couple of questions about the output from a simple run of VW. I have read around the internet and the wiki, but I am still unsure about a couple of basic things.

I ran the following on the Boston housing data:

vw -d housing.vm --progress 1

where the housing.vm file is set up as (partially):

and output is (partially):

Question 1:

1) Is it correct to think of the average loss column as the result of the following steps?

a) Predict zero, so the first average loss is the squared error of the first example (with a prediction of zero).

b) Build a model on example 1 and predict example 2. Average the two squared losses so far.

c) Build a model on examples 1-2 and predict example 3. Average the three squared losses so far.

d) ...

Continue until the end of the data is reached (assuming a single pass).

2) What is the current features column? It appears to be the number of non-zero features plus an intercept. The example suggests that a feature is not counted if its value is zero. Is that true? For instance, the second record has a value of zero for 'ZN'. Does VW really treat that numeric feature as missing?


Answer 1:


Your statements are basically correct. By default, VW does online learning, so in step c it takes the current model (weights) and updates it with the current example (rather than learning from all the previous examples again).
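The predict-then-update loop described above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: it uses a plain gradient step on squared loss with a fixed learning rate, whereas VW's default update rule is more elaborate, so the numbers will not match VW's output. The function name and the "constant" and "x" feature names are made up for this example.

```python
def progressive_average_loss(examples, learning_rate=0.1):
    """Yield the running average squared loss over a single pass.

    Each example is a (features, label) pair, where features is a dict
    mapping feature name to value. Every example is predicted BEFORE the
    model learns from it, so the loss is always measured on unseen data.
    """
    weights = {}       # all weights start at zero, so the first prediction is 0
    total_loss = 0.0
    for count, (features, label) in enumerate(examples, start=1):
        # 1. Predict with the current weights.
        prediction = sum(weights.get(name, 0.0) * value
                         for name, value in features.items())
        # 2. Record the squared loss and report the average so far.
        error = prediction - label
        total_loss += error ** 2
        yield total_loss / count
        # 3. Update the current weights using only this example
        #    (the factor 2 of the gradient is folded into learning_rate).
        for name, value in features.items():
            weights[name] = weights.get(name, 0.0) - learning_rate * error * value


# Hypothetical toy data: the same example seen twice.
toy = [({"constant": 1.0, "x": 1.0}, 2.0),
       ({"constant": 1.0, "x": 1.0}, 2.0)]
losses = list(progressive_average_loss(toy))
```

The first reported value is the squared error of a zero prediction, as in step a of the question. Later values average in the loss of each new example as predicted by the model learned so far.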

As you supposed, the current features column is the number of (non-zero) features for the current example. The intercept feature is included automatically, unless you specify --noconstant.
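As a rough illustration of that counting rule, the sketch below counts the features of one line in VW's text input format. It is not VW's parser: it assumes a single "|" followed by a space (no namespaces), treats a feature without an explicit value as 1, and the function name and sample lines are hypothetical.

```python
def count_current_features(line, add_constant=True):
    """Count non-zero features in a simple 'label | name:value ...' line."""
    _, _, feature_part = line.partition("|")
    count = 0
    for token in feature_part.split():
        _, _, value = token.partition(":")
        # A feature written without ':value' has an implicit value of 1.
        if value == "" or float(value) != 0.0:
            count += 1
    # The intercept is added automatically unless --noconstant is given.
    return count + (1 if add_constant else 0)


# Hypothetical lines, not taken from the asker's housing.vm file.
with_nonzero_zn = count_current_features("24 | CRIM:0.00632 ZN:18 INDUS:2.31")
with_zero_zn = count_current_features("21.6 | CRIM:0.02731 ZN:0 INDUS:7.07")
```

Under these assumptions the second line is counted as having one feature fewer than the first, because its ZN value is zero.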

There is no difference between a missing feature and a feature with a zero value. Both mean that the corresponding weight is not updated.
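The reason is easy to see for a plain gradient step on squared loss without regularization: the change to each weight is proportional to that feature's value, so a value of zero leaves the weight untouched. The sketch below shows this, again as a simplified stand-in rather than VW's actual update rule. The function name and the numbers are assumptions made for illustration.

```python
def sgd_step(weights, features, label, learning_rate=0.1):
    """Return new weights after one plain gradient step on squared loss."""
    prediction = sum(weights.get(name, 0.0) * value
                     for name, value in features.items())
    error = prediction - label
    names = set(weights) | set(features)
    # The update for each weight is scaled by that feature's value;
    # a missing feature is treated as having value 0.
    return {name: weights.get(name, 0.0)
                  - learning_rate * error * features.get(name, 0.0)
            for name in names}


start = {"CRIM": 0.5, "ZN": 0.5}
explicit_zero = sgd_step(start, {"CRIM": 1.0, "ZN": 0.0}, label=2.0)
missing = sgd_step(start, {"CRIM": 1.0}, label=2.0)
```

Both calls produce the same weights, and the weight for ZN keeps its starting value in each case.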



Source: https://stackoverflow.com/questions/25838461/interpreting-basic-output-from-vowpal-wabbit
