I am performing logistic regression using Vowpal Wabbit on a dataset with 25 features and 48 million instances. I have a question about the "current predict" values: should they be between 0 and 1?
average since example example current current current
loss last counter weight label predict features
0.693147 0.693147 1 1.0 -1.0000 0.0000 24
0.419189 0.145231 2 2.0 -1.0000 -1.8559 24
0.235457 0.051725 4 4.0 -1.0000 -2.7588 23
6.371911 12.508365 8 8.0 -1.0000 -3.7784 24
3.485084 0.598258 16 16.0 -1.0000 -2.2767 24
1.765249 0.045413 32 32.0 -1.0000 -2.8924 24
1.017911 0.270573 64 64.0 -1.0000 -3.0438 25
0.611419 0.204927 128 128.0 -1.0000 -3.1539 25
0.469127 0.326834 256 256.0 -1.0000 -1.6101 23
0.403473 0.337820 512 512.0 -1.0000 -2.8843 25
0.337348 0.271222 1024 1024.0 -1.0000 -2.5209 25
0.328909 0.320471 2048 2048.0 -1.0000 -2.0732 25
0.309401 0.289892 4096 4096.0 -1.0000 -2.7639 25
0.291447 0.273492 8192 8192.0 -1.0000 -2.5978 24
0.287428 0.283409 16384 16384.0 -1.0000 -3.1774 25
0.287249 0.287071 32768 32768.0 -1.0000 -2.7770 24
0.282737 0.278224 65536 65536.0 -1.0000 -1.9070 25
0.278517 0.274297 131072 131072.0 -1.0000 -3.3813 24
0.291475 0.304433 262144 262144.0 1.0000 -2.7975 23
0.324553 0.357630 524288 524288.0 -1.0000 -0.8995 24
0.373086 0.421619 1048576 1048576.0 -1.0000 -1.2076 24
0.422605 0.472125 2097152 2097152.0 1.0000 -1.4907 25
0.476046 0.529488 4194304 4194304.0 -1.0000 -1.8591 25
0.476627 0.477208 8388608 8388608.0 -1.0000 -2.0037 23
0.446556 0.416485 16777216 16777216.0 -1.0000 -0.9915 24
0.422831 0.399107 33554432 33554432.0 -1.0000 -1.9549 25
0.428316 0.433801 67108864 67108864.0 -1.0000 -0.6376 24
0.425511 0.422705 134217728 134217728.0 -1.0000 -0.4094 24
0.425185 0.424860 268435456 268435456.0 -1.0000 -1.1529 24
0.426747 0.428309 536870912 536870912.0 -1.0000 -2.7468 25
Predictions are in the range [-50, +50] (theoretically any real number, but Vowpal Wabbit truncates it to [-50, +50]).
To convert them to {-1, +1}, use --binary. Positive predictions are simply mapped to +1, negative to -1.
To convert them to [0, +1], use --link=logistic. This uses the logistic function 1/(1 + exp(-x)). You should also use --loss_function=logistic if you want to interpret the numbers as probabilities.
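As a rough check outside Vowpal Wabbit, the sketch below applies the logistic function to a raw prediction from the log above and recomputes the logistic loss log(1 + exp(-y * p)) for the first two rows. It assumes the run used logistic loss (the first-row loss of 0.693147 = ln 2 is consistent with that); the function names are made up for illustration and are not part of Vowpal Wabbit.

```python
import math

def logistic_loss(label, raw_prediction):
    """Logistic loss for a label in {-1, +1} and a raw (unlinked) prediction."""
    return math.log1p(math.exp(-label * raw_prediction))

def to_probability(raw_prediction):
    """Logistic function 1/(1 + exp(-x)): maps a raw prediction into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-raw_prediction))

# Rows 1 and 2 of the log: label -1, raw predictions 0.0000 and -1.8559.
loss_first = logistic_loss(-1, 0.0)       # ln 2, matching the 0.693147 in the log
loss_second = logistic_loss(-1, -1.8559)  # close to the 0.145231 in the log

# Last row of the log: raw prediction -2.7468, i.e. a probability of roughly 0.06.
prob_last = to_probability(-2.7468)

print(loss_first, loss_second, prob_last)
```

A negative raw prediction therefore just means a probability below 0.5 for the positive class, which fits a log where almost every displayed label is -1.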
To convert them to [-1, +1], use --link=glf1. This uses the formula 2/(1 + exp(-x)) - 1 (a generalized logistic function with limits of -1 and +1).
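To compare these mappings side by side, here is a minimal sketch that reimplements the formulas from this answer in plain Python. It is not Vowpal Wabbit's own code: the helper names are hypothetical, the treatment of an exactly-zero prediction in binary() is an assumption, and the value 1.5 is an invented positive example (the other inputs come from the log above).

```python
import math

def clip(x, limit=50.0):
    """Truncate a raw score to [-limit, +limit], as described for Vowpal Wabbit."""
    return max(-limit, min(limit, x))

def binary(x):
    """Map positive scores to +1 and the rest to -1 (zero handling is an assumption)."""
    return 1 if x > 0 else -1

def glf1(x):
    """Generalized logistic function 2/(1 + exp(-x)) - 1, with values in (-1, +1)."""
    return 2.0 / (1.0 + math.exp(-x)) - 1.0

# -2.7468 and -0.4094 appear in the log; 1.5 is a hypothetical positive score.
raw_predictions = [-2.7468, -0.4094, 1.5]

for raw in raw_predictions:
    score = clip(raw)
    print(f"raw={raw:+.4f}  binary={binary(score):+d}  glf1={glf1(score):+.4f}")
```

Note that 2/(1 + exp(-x)) - 1 is the same function as tanh(x/2), so glf1 is a rescaled logistic link rather than a different model.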
Source: https://stackoverflow.com/questions/26833841/vowpal-wabbit-logistic-regression