Multi-class classification in libsvm
According to the official libsvm documentation (Section 7):
LIBSVM implements the "one-against-one" approach for multi-class classification. If k is the number of classes, then k(k-1)/2 classifiers are constructed and each one trains data from two classes.
In classification we use a voting strategy: each binary classification is considered to be a voting where votes can be cast for all data points x - in the end a point is designated to be in a class with the maximum number of votes.
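As a rough illustration of the voting scheme quoted above, the following sketch tallies the votes for a single test point. The pairwise winners are made-up values standing in for the outputs of the k(k-1)/2 binary classifiers; this is not libsvm's own code, only the counting logic.

```matlab
%# hypothetical example: k = 3 classes, one test point
k = 3;
pairs = nchoosek(1:k, 2);          %# all class pairs: [1 2; 1 3; 2 3]
numClassifiers = size(pairs, 1);   %# equals k*(k-1)/2

%# made-up winner of each pairwise classifier for this point
winners = [2; 3; 2];               %# (1 vs 2)->2, (1 vs 3)->3, (2 vs 3)->2

%# count the votes received by each class
votes = accumarray(winners, 1, [k 1]);

%# the predicted class is the one with the most votes
[~, predicted] = max(votes);
```

Note that max returns the first maximum, so in this sketch a tie goes to the class with the lowest index.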
In the one-against-all approach, we build as many binary classifiers as there are classes, each trained to separate one class from the rest. To predict a new instance, we choose the classifier with the largest decision function value.
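The decision rule described above amounts to a row-wise argmax. A minimal sketch, assuming the decision values of the binary classifiers have already been collected in a matrix (the numbers below are invented for illustration):

```matlab
%# hypothetical decision values: rows = test points, columns = classes
%# (column j holds the output of the "class j vs. rest" classifier)
decValues = [ 1.2 -0.4 -0.9;
             -0.7  0.3 -0.1;
             -1.5 -0.2  0.8];

%# pick, for every point, the classifier with the largest value
[~, predicted] = max(decValues, [], 2);
```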
As I mentioned before, the idea is to train k SVM models, each one separating one class from the rest. Once we have those binary classifiers, we use the probability outputs (the -b 1 option) to predict new instances by picking the class with the highest probability.
Consider the following example:
%# Fisher Iris dataset
load fisheriris
[~,~,labels] = unique(species); %# labels: 1/2/3
data = zscore(meas); %# scale features
numInst = size(data,1);
numLabels = max(labels);
%# split training/testing
idx = randperm(numInst);
numTrain = 100; numTest = numInst - numTrain;
trainData = data(idx(1:numTrain),:); testData = data(idx(numTrain+1:end),:);
trainLabel = labels(idx(1:numTrain)); testLabel = labels(idx(numTrain+1:end));
Here is my implementation for the one-against-all approach for multi-class SVM:
%# train one-against-all models
model = cell(numLabels,1);
for k=1:numLabels
    model{k} = svmtrain(double(trainLabel==k), trainData, '-c 1 -g 0.2 -b 1');
end
%# get probability estimates of test instances using each model
prob = zeros(numTest,numLabels);
for k=1:numLabels
    [~,~,p] = svmpredict(double(testLabel==k), testData, model{k}, '-b 1');
    prob(:,k) = p(:,model{k}.Label==1); %# probability of class==k
end
%# predict the class with the highest probability
[~,pred] = max(prob,[],2);
acc = sum(pred == testLabel) ./ numel(testLabel) %# accuracy
C = confusionmat(testLabel, pred) %# confusion matrix
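Note that confusionmat (like zscore) comes from the Statistics Toolbox. If it is not available, the accuracy and confusion matrix can be computed with base functions only. A minimal sketch using made-up label vectors; with the code above you would pass testLabel and pred instead of the hypothetical trueLabel and predLabel:

```matlab
%# made-up labels for illustration only
trueLabel = [1; 1; 2; 2; 3; 3];
predLabel = [1; 2; 2; 2; 3; 1];
numClasses = 3;

%# fraction of correctly classified instances
acc = mean(predLabel == trueLabel);

%# rows = true class, columns = predicted class
C = accumarray([trueLabel predLabel], 1, [numClasses numClasses]);
```

This assumes the labels are the integers 1..numClasses, as produced by unique(species) in the example.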
images
Updated on February 02, 2020
Comments
-
images about 4 years
I'm working with libsvm and I must implement multi-class classification with one versus all.
How can I do it?
Does libsvm version 2011 use this?
I think that my question is not very clear. If libsvm doesn't automatically use one versus all, I will use one SVM for every class; otherwise, how can I define these parameters in the svmtrain function? I have read the libsvm README.
-
images about 12 years
Can you provide me an example for one against all with libsvm?
-
images about 12 years
I tried to use the code provided by lakesh in his question (Multi-Class SVM (one versus all)). Is it correct?
-
Amro about 12 years
@images: I've added a sample implementation
-
images about 12 years
@Amro: Thank you very much for your efforts.
-
Zahra E over 11 years
When I copy your code I get the following error at the line [~,~,labels] = unique(species); in MATLAB: Expression or statement is incorrect--possibly unbalanced (, {, or [. Could you help me please?
-
Amro over 11 years
@Ezati: the ~ syntax requires R2009b or later. Use a dummy variable instead if you are using an older version of MATLAB.
-
Zahra E over 11 years
Thanks a lot, I replaced it with [dummy,dummy,labels] = unique(species); and it worked fine.
-
Zahra E over 11 years
May I ask you another question: stackoverflow.com/q/14024740/1071703 ...if you have time of course :)
-
Amro over 11 years
@Ezati: I have posted an answer there.
-
Zahra E over 11 years
Again, thank you for your great response :)
-
MVTC over 9 years
Can you explain what these parameters stand for: '-c 1 -g 0.2'?
-
Amro over 9 years
@MVTC: you should probably read the libsvm guide first; c is the penalty parameter of the error term in C-SVC, g is the RBF kernel gamma parameter. One usually uses cross-validation to find the best values for these parameters; see here for an example: stackoverflow.com/a/9049225/97160
-
Arturo about 8 years
How would I do this for a dataset of 50k samples and dimensionality 4000? MATLAB seems to be taking too long. *I added the -t 0 option for a linear kernel
-
Royi about 7 years
What about one-class SVM? How do you define the labels vector in that case? Thank you.
-
Amin almost 7 years
@Amro: when I use your example, I get this error: Invalid MEX-file '...\libsvm-3.22\libsvm-3.22\matlab\svmtrain.mexw64': The specified module could not be found.
-
Amro almost 7 years
@Amin: It sounds like a build problem, see this post for instructions on how to compile libsvm for MATLAB: stackoverflow.com/a/15559516/97160. If you are still having problems, consider using Dependency Walker to troubleshoot: mathworks.com/matlabcentral/answers/…
-
Amin almost 7 years
@Amro: perfect.
-
Amro almost 7 years
@Amin: as was explained in the above post, one-vs-one is the approach implemented in libsvm; just call svmtrain directly with multi-class labels... Here is another answer of mine that compares the two: stackoverflow.com/a/14042056/97160
-
Amin almost 7 years
@Amro: for your reference: mathworks.com/matlabcentral/answers/…
-
Amro almost 7 years
@Amin: Don't expect full answers in comments, but you can't just blindly apply machine learning algorithms and expect good results. You should read up on SVM and its parameters (kernels, C, gamma, etc.), and how you would need to do a grid search to find good values using cross-validation. You should also look into preprocessing the data (normalizing the features at the least)... Good luck.
-
Christina over 3 years
@Amro: Could you please help me here? Thanks a lot: stackoverflow.com/questions/65449934/…