Question
I'm trying to use a user-defined kernel. I know that kernlab offers user-defined kernels (custom kernel functions) in R. I used the spam data set included in the kernlab package (number of variables = 57, number of examples = 4601).
I defined the kernel as follows:
kp=function(d,e){
as=v*d
bs=v*e
cs=as-bs
cs=as.matrix(cs)
exp(-(norm(cs,"F")^2)/2)
}
class(kp)="kernel"
It is a transformed Gaussian kernel, where v is a vector of continuously changing values, each the inverse of the standard deviation of the corresponding variable, for example:
v=(0.1666667,........0.1666667)
The training set is 60% of the spam data (preserving the proportions of the different classes).
If an example's type is spam, its label is set to 1 for training the SVM:
m=ksvm(xtrain,ytrain,type="C-svc",kernel=kp,C=10)
But this step does not work: the call never returns.
So my question is, why? Is it because the number of examples is too large? Is there any other R package that can train SVMs with a user-defined kernel?
Answer 1:
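First, your kernel looks like a classic RBF kernel, with v = 1/sigma (sigma being the standard deviation), so why do you use it? You can use the built-in RBF kernel and simply set its sigma parameter. Note that kernlab's rbfdot computes exp(-sigma * ||x - y||^2), so for a scalar v the matching value is sigma = v^2/2. In particular, instead of using the Frobenius norm on matrices you could use the classic Euclidean norm on the vectorized matrices.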
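A minimal sketch of the built-in route, assuming v is a numeric vector with one entry per column (the toy data and the values of v below are hypothetical). Scaling each column by v and then using rbfdot with sigma = 0.5 should reproduce the custom kernel. Automatic scaling in ksvm is switched off here so that the kernel sees exactly the rescaled columns:

```r
library(kernlab)

set.seed(1)
x <- matrix(rnorm(20 * 3), nrow = 20)   # toy data: 20 rows, 3 variables
y <- factor(rep(c(0, 1), each = 10))    # toy labels
v <- c(0.5, 1, 2)                       # hypothetical per-variable scaling

# Custom kernel from the question
kp <- function(d, e) {
  cs <- as.matrix(v * d - v * e)
  exp(-(norm(cs, "F")^2) / 2)
}
class(kp) <- "kernel"

# Scale each column by v once, then use the built-in RBF kernel
xs  <- sweep(x, 2, v, `*`)
rbf <- rbfdot(sigma = 0.5)

# Both kernels should agree on any pair of rows
custom_val  <- kp(x[1, ], x[2, ])
builtin_val <- as.numeric(rbf(xs[1, ], xs[2, ]))
all.equal(custom_val, builtin_val)

# Train with the built-in kernel on the rescaled data
m <- ksvm(xs, y, type = "C-svc", kernel = "rbfdot",
          kpar = list(sigma = 0.5), C = 10, scaled = FALSE)
```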
Second, this works just fine:
> xtrain = as.matrix( c(1,2,3,4) )
> ytrain = as.factor( c(0,0,1,1) )
> v= 0.01
> m=ksvm(xtrain,ytrain,type="C-svc",kernel=kp,C=10)
> m
Support Vector Machine object of class "ksvm"
SV type: C-svc (classification)
parameter : cost C = 10
Number of Support Vectors : 4
Objective Function Value : -39.952
Training error : 0
There are at least two reasons why you are still waiting for results:
- RBF kernels induce the hardest problems for an SVM to optimize (especially for large C)
- User-defined kernels are far less efficient than the built-in ones
As I am not sure whether ksvm actually optimizes the user-defined kernel computation (in fact I'm pretty sure it does not), you could try to build the kernel matrix (K[i,j] = K(x_i,x_j), where x_i is the i-th training vector) and provide ksvm with it. You can achieve this by:
K <- kernelMatrix(kp,xtrain)
m <- ksvm(K,ytrain,type="C-svc",kernel='matrix',C=10)
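Evaluating an R function once per pair of rows is likely the slow part of building K for a custom kernel. Since this particular kernel is a Gaussian on rescaled inputs, one possible shortcut is to build the same matrix from pairwise distances in base R. This is only a sketch: it assumes v has one entry per column, and the toy data and values of v are hypothetical.

```r
set.seed(1)
x <- matrix(rnorm(50 * 4), nrow = 50)   # toy data standing in for xtrain
v <- c(0.5, 1, 2, 0.25)                 # hypothetical per-variable scaling

# Kernel from the question, evaluated pair by pair (slow reference)
kp <- function(d, e) {
  cs <- as.matrix(v * d - v * e)
  exp(-(norm(cs, "F")^2) / 2)
}
n <- nrow(x)
K_loop <- matrix(0, n, n)
for (i in seq_len(n)) {
  for (j in seq_len(n)) {
    K_loop[i, j] <- kp(x[i, ], x[j, ])
  }
}

# Vectorized version: scale columns by v, then exponentiate
# the squared Euclidean distances between rows
xs <- sweep(x, 2, v, `*`)
D2 <- unname(as.matrix(dist(xs)))^2
K_fast <- exp(-D2 / 2)

# The two matrices should match up to floating-point error
all.equal(K_loop, K_fast)
```

With kernlab loaded, the result can be wrapped as as.kernelMatrix(K_fast) before it is passed to ksvm.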
Precomputing the kernel matrix can take quite a long time, but the optimization itself will then be much faster, so it is a good method if you want to test many different C values (which you certainly should do). Unfortunately this requires O(n^2) memory, so if you use more than 100 000 vectors you will need a really large amount of RAM (a dense 100 000 x 100 000 matrix of doubles takes about 80 GB).
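The reuse of one precomputed matrix across several C values can be sketched as follows. The toy data, the values of v and the grid of C values are hypothetical, and the training error is reported only to keep the example short; for real model selection a held-out set or the cross argument of ksvm would be more appropriate.

```r
library(kernlab)

set.seed(1)
xtrain <- rbind(matrix(rnorm(40, mean = 0), ncol = 2),
                matrix(rnorm(40, mean = 3), ncol = 2))   # toy data
ytrain <- factor(rep(c(0, 1), each = 20))                # toy labels
v <- c(0.5, 0.5)                                         # hypothetical scaling

# Custom kernel from the question
kp <- function(d, e) {
  cs <- as.matrix(v * d - v * e)
  exp(-(norm(cs, "F")^2) / 2)
}
class(kp) <- "kernel"

# The expensive step: computed once, reused for every C
K <- kernelMatrix(kp, xtrain)

# K already has class kernelMatrix, so ksvm treats it as precomputed
Cs <- c(0.1, 1, 10, 100)
train_err <- sapply(Cs, function(C) {
  m <- ksvm(K, ytrain, type = "C-svc", C = C)
  error(m)   # training error of the fitted model
})
data.frame(C = Cs, training_error = train_err)

# Rough size in GB of a dense n x n matrix of doubles (8 bytes each)
n <- 100000
gb_needed <- n^2 * 8 / 1e9
gb_needed
```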
Source: https://stackoverflow.com/questions/12085454/r-svm-performance-using-custom-kernel-user-defined-kernel-is-not-working-in-k