Multi-class classification in libsvm

According to the official libsvm documentation (Section 7):

LIBSVM implements the "one-against-one" approach for multi-class classification. If k is the number of classes, then k(k-1)/2 classifiers are constructed and each one trains data from two classes.

In classification we use a voting strategy: each binary classification is considered to be a voting where votes can be cast for all data points x - in the end a point is designated to be in a class with the maximum number of votes.

In the one-against-all approach, by contrast, we build as many binary classifiers as there are classes, each trained to separate one class from the rest. To predict a new instance, we choose the classifier with the largest decision function value.
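To make the voting strategy quoted above concrete, here is a minimal sketch in plain MATLAB (no libsvm calls). The `winners` matrix is made-up data standing in for the outputs of the k(k-1)/2 pairwise classifiers; it is only an illustration of the counting step, not libsvm's internal code.

```matlab
%# one-against-one voting sketch (hypothetical pairwise winners)
k = 3;                                %# number of classes
pairs = nchoosek(1:k, 2);             %# k(k-1)/2 class pairs: [1 2; 1 3; 2 3]
numPairs = size(pairs,1);

%# winners(i,j): class picked for instance i by the classifier trained on
%# pairs(j,:) (made-up values standing in for real binary SVM outputs)
winners = [1 1 2;                     %# instance 1: (1v2)->1, (1v3)->1, (2v3)->2
           2 3 3;                     %# instance 2: (1v2)->2, (1v3)->3, (2v3)->3
           2 1 2];                    %# instance 3: (1v2)->2, (1v3)->1, (2v3)->2

%# count the votes each class receives for every instance
votes = zeros(size(winners,1), k);
for c = 1:k
    votes(:,c) = sum(winners == c, 2);
end

%# the class with the most votes wins (max returns the first index on ties)
[~,pred] = max(votes, [], 2);
```

For these made-up winners, `pred` comes out as [1;3;2].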


As mentioned above, the idea is to train k SVM models, each one separating one class from the rest. Once we have those binary classifiers, we use the probability outputs (the -b 1 option) to predict new instances by picking the class with the highest probability.

Consider the following example:

%# Fisher Iris dataset
load fisheriris
[~,~,labels] = unique(species);   %# labels: 1/2/3
data = zscore(meas);              %# scale features
numInst = size(data,1);
numLabels = max(labels);

%# split training/testing
idx = randperm(numInst);
numTrain = 100; numTest = numInst - numTrain;
trainData = data(idx(1:numTrain),:);  testData = data(idx(numTrain+1:end),:);
trainLabel = labels(idx(1:numTrain)); testLabel = labels(idx(numTrain+1:end));

Here is my implementation for the one-against-all approach for multi-class SVM:

%# train one-against-all models
model = cell(numLabels,1);
for k=1:numLabels
    model{k} = svmtrain(double(trainLabel==k), trainData, '-c 1 -g 0.2 -b 1');
end

%# get probability estimates of test instances using each model
prob = zeros(numTest,numLabels);
for k=1:numLabels
    [~,~,p] = svmpredict(double(testLabel==k), testData, model{k}, '-b 1');
    prob(:,k) = p(:,model{k}.Label==1);    %# probability of class==k
end

%# predict the class with the highest probability
[~,pred] = max(prob,[],2);
acc = sum(pred == testLabel) ./ numel(testLabel)    %# accuracy
C = confusionmat(testLabel, pred)                   %# confusion matrix
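
The description of one-against-all earlier in this answer picks the classifier with the largest decision function value rather than the highest probability. A rough sketch of that variant follows. The function name `ovaPredictByDecision` is hypothetical, and the sign handling rests on an assumption: that libsvm reports binary decision values relative to the first entry of `model.Label`, which is how I understand the libsvm FAQ. Please verify this against your libsvm version before relying on it.

```matlab
function pred = ovaPredictByDecision(trainData, trainLabel, testData, numLabels)
%# hypothetical helper: one-against-all prediction using decision values
    numTest = size(testData,1);
    dec = zeros(numTest, numLabels);
    for k = 1:numLabels
        %# train class k vs. rest, without probability estimates
        model = svmtrain(double(trainLabel==k), trainData, '-c 1 -g 0.2');
        %# the true test labels are not needed here, so pass dummy zeros
        [~,~,d] = svmpredict(zeros(numTest,1), testData, model);
        %# assumption: the decision value is positive for model.Label(1);
        %# flip the sign when that label is 0 (the "rest" class)
        dec(:,k) = d * (2*(model.Label(1)==1) - 1);
    end
    %# pick the class whose classifier gives the largest decision value
    [~,pred] = max(dec, [], 2);
end
```

With the variables from the example above, it could be called as pred = ovaPredictByDecision(trainData, trainLabel, testData, numLabels). Because the dummy labels are all zeros, the accuracy that svmpredict prints is meaningless in this case; compare pred against testLabel instead.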
Author: images

Updated on February 02, 2020

Comments

  • images
    images about 4 years

    I'm working with libsvm and I must implement multi-class classification using one-versus-all.

    How can I do it?
    Does libsvm version 2011 use this?


    I think my question is not very clear. If libsvm doesn't automatically use one-versus-all, I will train one SVM for every class; otherwise, how can I define these parameters in the svmtrain function? I have read the libsvm README.

  • images
    images about 12 years
    Can you provide an example of one-against-all with libsvm?
  • images
    images about 12 years
    I tried to use the code provided by lakesh in his question (Multi-Class SVM (one versus all)). Is it correct?
  • Amro
    Amro about 12 years
    @images: I've added a sample implementation
  • images
    images about 12 years
    @Amro: Thank you very much for your efforts.
  • Zahra E
    Zahra E over 11 years
    When I copy your code I get the following error at the line [~,~,labels] = unique(species); in matlab: Expression or statement is incorrect--possibly unbalanced (, {, or [.. Could you help me please?
  • Amro
    Amro over 11 years
    @Ezati: the ~ syntax requires R2009b or later. Use a dummy variable instead if you are using an older version of MATLAB.
  • Zahra E
    Zahra E over 11 years
    Thanks a lot, I replaced it with [dummy,dummy,labels] = unique(species); and it worked fine
  • Zahra E
    Zahra E over 11 years
    May I ask you another question: stackoverflow.com/q/14024740/1071703 ...if you have time of course :)
  • Amro
    Amro over 11 years
    @Ezati: I have posted an answer there.
  • Zahra E
    Zahra E over 11 years
    Again thank you for your great response :)
  • MVTC
    MVTC over 9 years
    Can you explain what these parameters stand for: '-c 1 -g 0.2'?
  • Amro
    Amro over 9 years
    @MVTC: you should probably read the libsvm guide first; c is the penalty parameter of the error term in C-SVC, g is the RBF kernel gamma parameter. One usually uses cross-validation to find the best values for these parameters, see here for an example: stackoverflow.com/a/9049225/97160
  • Arturo
    Arturo about 8 years
    How would I do this for a dataset size of 50k samples and of dimensionality 4000? Matlab seems to be taking too long. *I added the -t 0 option for a linear kernel
  • Royi
    Royi about 7 years
    What about one class SVM? How do you define the Labels vector in that case? Thank You.
  • Amin
    Amin almost 7 years
    @Amro: when I use your example, I get this error: Invalid MEX-file '...\libsvm-3.22\libsvm-3.22\matlab\svmtrain.mexw64': The specified module could not be found.
  • Amro
    Amro almost 7 years
    @Amin: It sounds like a build problem, see this post for instructions on how to compile libsvm for MATLAB: stackoverflow.com/a/15559516/97160. If you are still having problems, consider using Dependency Walker to troubleshoot: mathworks.com/matlabcentral/answers/…
  • Amin
    Amin almost 7 years
    @Amro: perfect.
  • Amro
    Amro almost 7 years
    @Amin: as was explained in the above post, one-vs-one is the approach implemented in libsvm, just call svmtrain directly with multi-class labels... Here is another answer of mine that compares the two: stackoverflow.com/a/14042056/97160
  • Amin
    Amin almost 7 years
    @Amro: for your reference: mathworks.com/matlabcentral/answers/…
  • Amro
    Amro almost 7 years
    @Amin: Don't expect full answers in comments, but you can't just blindly apply machine learning algorithms, and expect good results. You should read up on SVM and its parameters (kernels, C, gamma, etc.), and how you would need to do a grid search to find good values using cross-validation. You should also look into preprocessing the data (normalizing the features at the least)... Good luck.
  • Christina
    Christina over 3 years
    @Amro Please can you help me here? thank you a lot stackoverflow.com/questions/65449934/…
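
Several comments above mention two things: that libsvm handles multi-class labels natively (one-against-one), and that C and gamma are usually chosen by a cross-validated grid search. A rough sketch combining both follows. The function name `gridSearchCV` is hypothetical, and the exponent ranges are the coarse grid I recall from the libsvm guide, so treat them as an assumption and adjust for your data. It relies on the libsvm MATLAB interface returning a scalar cross-validation accuracy when the -v option is given.

```matlab
function [bestC, bestG, bestAcc] = gridSearchCV(trainData, trainLabel)
%# hypothetical helper: coarse grid search over C and gamma (RBF kernel)
    bestAcc = -Inf; bestC = NaN; bestG = NaN;
    for log2c = -5:2:15
        for log2g = -15:2:3
            %# -v 5: 5-fold cross-validation, -q: quiet mode
            opts = sprintf('-c %g -g %g -v 5 -q', 2^log2c, 2^log2g);
            %# multi-class labels are passed directly; with -v the return
            %# value is the cross-validation accuracy, not a model
            acc = svmtrain(trainLabel, trainData, opts);
            if acc > bestAcc
                bestAcc = acc; bestC = 2^log2c; bestG = 2^log2g;
            end
        end
    end
end
```

Once the best pair is found, the final model would be trained without -v, for example model = svmtrain(trainLabel, trainData, sprintf('-c %g -g %g', bestC, bestG)), and then passed to svmpredict with the multi-class test labels.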