Implementing and plotting a perceptron in MATLAB

Solution 1

You should first understand the meaning of each of the inputs:

  • X is the input matrix of examples, of size M x N, where M is the dimension of the feature vector and N is the number of samples. Since the perceptron model for prediction is Y=w*X+b, you have to supply one extra dimension in X that is constant, usually set to 1, so that the b term is "built in" to X. In the example below, the last entry of X is 1 for all samples (see the sketch after this list).
  • Y is the correct classification for each sample from X (the classification you want the perceptron to learn), so it should be an N-dimensional row vector - one output for each input example. Since the perceptron is a binary classifier, Y should have only 2 distinct possible values. Looking at the code, you see that it checks the sign of the prediction, which tells you that the allowed values of Y are -1 and +1 (not 0 and 1, for example).
  • w is the weight vector you are trying to learn.
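
As a minimal sketch of the bias trick from the first bullet (Xraw is just a hypothetical raw feature matrix), the constant dimension is appended like this:

Xraw = [0 0; 0 1];                    % 2 features x 2 samples
X    = [Xraw; ones(1,size(Xraw,2))];  % append a row of ones, giving the 3 x 2 X used below
% the prediction is then w'*X, with the last entry of w playing the role of b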

So, try to call the function with:

X=[0 0; 0 1; 1 1];
Y=[1 -1];
w=[.5; .5; .5];

EDIT

Use the following code to call the perceptron algorithm and see the results graphically:

% input samples
X1=[rand(1,100);rand(1,100);ones(1,100)];   % class '-1'
X2=[rand(1,100);1+rand(1,100);ones(1,100)]; % class '+1'
X=[X1,X2];

% output class [-1,+1];
Y=[-ones(1,100),ones(1,100)];

% init weight vector
w=[.5 .5 .5]';

% call perceptron
wtag=perceptron(X,Y,w);
% predict
ytag=wtag'*X;


% plot prediction over original data
figure;hold on
plot(X1(1,:),X1(2,:),'b.')
plot(X2(1,:),X2(2,:),'r.')

plot(X(1,ytag<0),X(2,ytag<0),'bo')
plot(X(1,ytag>0),X(2,ytag>0),'ro')
legend('class -1','class +1','pred -1','pred +1')
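
If you also want to draw the separating line the perceptron has learned, here is a minimal sketch (assuming wtag from the call above, with wtag(2) nonzero); the boundary is the set of points where wtag(1)*x1 + wtag(2)*x2 + wtag(3) = 0:

% plot the learned decision boundary by solving for x2 as a function of x1
x1 = linspace(0,1,50);
x2 = -(wtag(1)*x1 + wtag(3)) / wtag(2);
plot(x1, x2, 'k-', 'linewidth', 2)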

Solution 2

If you are interested, here is a little perceptron demo written in a tutorial style:

function perceptronDemo
%PERCEPTRONDEMO
%
%   A simple demonstration of the perceptron algorithm for training
%   a linear classifier, made as readable as possible for tutorial
%   purposes. It is derived from the treatment of linear learning
%   machines presented in Chapter 2 of "An Introduction to Support
%   Vector Machines" by Nello Cristianini and John Shawe-Taylor.
%
%

    Data  = createTrainingData;
    Model = trainPerceptron( Data );

end
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
function Model = trainPerceptron( Data )
%TRAINPERCEPTRON

    DOWN   = 1;
    ACROSS = 2;

    assert( isequal( unique( Data.labels ), [-1; +1] ), ...
        'Labels must be -1 or +1' );

    % ---------------------------------------------------------------------
    % Normalise the data by calculating z-scores
    %
    %   This makes plotting easier, but is not needed by the algorithm.
    %

    sampleMean   = mean( Data.samples );
    sampleStdDev = std(  Data.samples );
    Data.samples = bsxfun( @minus,   Data.samples, sampleMean   );
    Data.samples = bsxfun( @rdivide, Data.samples, sampleStdDev );

    % ---------------------------------------------------------------------
    % Calculate the squared radius of the smallest ball that encloses the
    % data and is centred on the origin. This is used to provide an
    % appropriate range and step size when updating the threshold (bias)
    % parameter.
    %

    sampleSize = size( Data.samples, DOWN );
    maxNorm    = realmin;
    for iObservation = 1:sampleSize
        observationNorm = norm( Data.samples(iObservation,:) );
        if observationNorm > maxNorm
            maxNorm = observationNorm;
        end
    end
    enclosingBallRadius        = maxNorm;
    enclosingBallRadiusSquared = enclosingBallRadius .^ 2;

    % ---------------------------------------------------------------------
    % Define the starting weight vector and bias. These should be zeros,
    % as the algorithm omits a learning rate, and it is suggested in
    % Cristianini & Shawe-Taylor that learning rate may only be omitted
    % safely when the starting weight vector and bias are zero.
    %

    Model.weights = [0.0 0.0];
    Model.bias    = 0.0;

    % ---------------------------------------------------------------------
    % Run the perceptron training algorithm
    %
    %   To prevent program running forever when nonseparable data are
    %   provided, limit the number of steps in the outer loop.
    %

    maxNumSteps = 1000;

    for iStep = 1:maxNumSteps

        isAnyObsMisclassified = false;

        for iObservation = 1:sampleSize

            inputObservation = Data.samples( iObservation, : );
            desiredLabel     = Data.labels(  iObservation    ); % +1 or -1

            perceptronOutput = sum( Model.weights .* inputObservation, ACROSS ) + Model.bias;
            margin           = desiredLabel * perceptronOutput;

            isCorrectLabel   = margin > 0;

            % -------------------------------------------------------------
            % If the model misclassifies the observation, update the
            % weights and the bias.
            %

            if ~isCorrectLabel

                isAnyObsMisclassified = true;

                weightCorrection = desiredLabel  * inputObservation;
                Model.weights    = Model.weights + weightCorrection;

                biasCorrection   = desiredLabel .* enclosingBallRadiusSquared;
                Model.bias       = Model.bias   + biasCorrection;

                displayPerceptronState( Data, Model );

            end % if this observation misclassified.

        end % loop over observations

        if ~isAnyObsMisclassified
            disp( 'Done!' );
            break;
        end

    end % outer loop

end % TRAINPERCEPTRON
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
function Data = createTrainingData
%CREATETRAININGDATA
%
%   Return a structure containing training data suitable for linear
%   classification.
%

    sampleAsize   = 1024;
    sampleBsize   = 1024;

    sampleAmean   = [ 5.5 5.0 ];
    sampleAstdDev = [ 0.5 1.0 ];

    sampleBmean   = [ 2.5 3.0 ];
    sampleBstdDev = [ 0.3 0.7 ];

    Data.samples  = [ normallyDistributedSample( sampleAsize, sampleAmean, sampleAstdDev ); ...
                      normallyDistributedSample( sampleBsize, sampleBmean, sampleBstdDev ) ];

    Data.labels   = [  ones(sampleAsize,1); ...
                      -ones(sampleBsize,1) ];

    % ---------------------------------------------------------------------
    % Randomly permute samples & class labels.
    %
    %   This is not really necessary, but done to illustrate that the order
    %   in which observations are evaluated does not matter.
    %

    randomOrder   = randperm( sampleAsize + sampleBsize );
    Data.samples  = Data.samples( randomOrder, : );
    Data.labels   = Data.labels(  randomOrder, : );

end % CREATETRAININGDATA
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
function samples = normallyDistributedSample( sampleSize, sampleMean, sampleStdDev )
%NORMALLYDISTRIBUTEDSAMPLE
%
%   Draw a sample from a normal distribution with specified mean and
%   standard deviation.
%

    assert(    isequal( size( sampleMean ), size( sampleStdDev ) ) ...
            && 1 == size( sampleMean, 1 ),                         ...
        'Sample mean and standard deviation must be row vectors of equal length.' );

    numFeatures = numel( sampleMean );
    samples     = randn( sampleSize, numFeatures );
    samples     = bsxfun( @times, samples, sampleStdDev );
    samples     = bsxfun( @plus,  samples, sampleMean   );

end % NORMALLYDISTRIBUTEDSAMPLE
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
function displayPerceptronState( Data, Model )
%DISPLAYPERCEPTRONSTATE

    hFig = figure( 1 );
    clf;
    set( hFig,                        ...
        'NumberTitle', 'off',         ...
        'Name',         mfilename,    ...
        'MenuBar',      'none',       ...
        'Color',        [1.0 1.0 1.0] );

    displayXmin = -4;
    displayXmax =  4;
    displayYmin = -4;
    displayYmax =  4;

    hAx = subplot( 1, 1, 1 );
    axis('equal');
    set( hAx,                                  ...
        'Box',      'on',                      ...
        'NextPlot', 'add',                     ...
        'xgrid',    'on',                      ...
        'ygrid',    'on',                      ...
        'xlim',     [displayXmin displayXmax], ... % Bounds suitable for Z-scored data
        'ylim',     [displayYmin displayYmax]  );
    xlabel( 'x_1' );
    ylabel( 'x_2' );

    % ---------------------------------------------------------------------
    % Plot data points from the two classes
    %

    isPositiveClass = Data.labels >  0;
    isNegativeClass = Data.labels <= 0;

    plot( hAx, Data.samples(isPositiveClass,1), Data.samples(isPositiveClass,2), 'b+' );
    plot( hAx, Data.samples(isNegativeClass,1), Data.samples(isNegativeClass,2), 'rx' );

    % ---------------------------------------------------------------------
    % Display parameters for separating hyperplane in title
    %

    xWeight   = Model.weights(1);
    yWeight   = Model.weights(2);
    bias      = Model.bias;

    szTitle   = sprintf( 'Linear classifier parameters: %0.2f x_1 + %0.2f x_2 + %0.2f = 0', xWeight, yWeight, bias );
    title( szTitle );

    % ---------------------------------------------------------------------
    % Plot separating hyperplane
    %

    y1 = ( (xWeight*displayXmin) + bias ) ./ -yWeight;
    y2 = ( (xWeight*displayXmax) + bias ) ./ -yWeight;

    plot( hAx, [displayXmin; displayXmax], [y1, y2], 'k-', 'linewidth', 2 );

    pause(0.1);

end % DISPLAYPERCEPTRONSTATE
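
To run the demo, save the listing above as perceptronDemo.m somewhere on your MATLAB path and call it from the command window:

perceptronDemo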

Solution 3

try this (note that, per Solution 1, Y must be -1/+1 and X needs a constant row for the bias term):

X = [1 2 1 2; 1 1 1 1];  % one feature per sample, plus a constant bias row
Y = [1 -1 1 -1];         % labels must be -1/+1, not 0/1
perceptron(X, Y, [0.5; 0.5]);

Comments

  • edgarmtze over 1 year

    I'm reviewing some perceptron MATLAB code from Toronto.

    The code is

    function [w] = perceptron(X,Y,w_init)
    
    w = w_init;
    for iteration = 1 : 100  %<- in practice, use some stopping criterion!
      for ii = 1 : size(X,2)         %cycle through training set
        if sign(w'*X(:,ii)) ~= Y(ii) %wrong decision?
          w = w + X(:,ii) * Y(ii);   %then add (or subtract) this point to w
        end
      end
      sum(sign(w'*X)~=Y)/size(X,2)   %show misclassification rate
    end
    

    So I was reading about how to apply this function to a data matrix X and a target Y, but I do not know how to use it. I understand that it returns a vector of weights, so it can classify.

    Could you please give an example and explain it?

    I've tried

    X=[0 0; 0 1; 1 1]
    Y=[1 0; 2 1]
    w=[1 1 1]
    Result = perceptron( X, Y, w )
    
    ??? Error using ==> mtimes
    Inner matrix dimensions must agree.
    
    Error in ==> perceptron at 15
                if sign(w'*X(:,ii)) ~= Y(ii) 
    
        Result = perceptron( X, Y, w' )
    
    ??? Error using ==> ne
    Matrix dimensions must agree.
    
    Error in ==> perceptron at 19
            sum(sign(w'*X)~=Y) / size(X,2);     
    

    Thanks

    Thank you for the answers. I have one more question: if I change Y to [0, 1], what happens to the algorithm?

    So no input data will work with Y = [0, 1] with this perceptron code, right?

    EDIT

    One more question: if I want to plot the line that divides the 2 classes, I know we can get that line by solving a linear equation system involving the weights, but how? What could I do? I'm trying something like:

    % the initial weights
    w_init = [ 1 1 1]';  
    % the weights returned from perceptron    
    wtag   = perceptron(X,Y,w_init,15);
    
    % concatenate both
    Line = [wtag,w_init] 
    
    % solve the linear system, am I correct doing this?
    rref(Line')
    
    % plot???
    
  • Itamar Katz about 13 years
    Your example won't work, since the algorithm assumes output values of [-1,+1], not [0,1]. The w vector won't be updated at all.
  • Itamar Katz about 13 years
    Also, the input should be at least of dimension 2; otherwise you are explicitly assuming that b=0 in y=a*x+b.
  • edgarmtze about 13 years
    Thank you very much, I really understand your example, but I have one more question: what would you do if class 1 has more examples than class 0? In the example you provided there are the same number of examples for both classes, X1 and X2.
  • edgarmtze about 13 years
    Is this correct? I cannot test it right now: X1=[rand(1,100);rand(1,100);ones(1,100)]; % class '-1' X2=[rand(1,300);1+rand(1,300);ones(1,300)]; % class '+1' X=[X1,X2]; % output class [-1,+1]; Y=[-ones(1,100),ones(1,300)]; % init weight vector w=[.5 .5 .5]'; wtag=perceptron(X,Y,w);
  • Itamar Katz about 13 years
    The number of examples is irrelevant; I just chose an equal number for each class for convenience. Nothing has to change. The size of w should be the dimension of the samples (including the constant term), since every prediction is based on the dot product w*x, so w and x must have the same size.
  • Itamar Katz about 13 years
    Yes, this seems right. Note that it's only a synthetic example; in a real-world problem your data is mixed, and you are not concerned with the size of each class - you only need to ensure you have enough examples from each class to cover the sample space.
  • Predictor about 13 years
    You are correct: the balance of the classes should not matter. The perceptron has issues with linear separability and stability. The prior probabilities of the classes matter in the calculation of other linear tools, like linear discriminant analysis.
  • edgarmtze about 13 years
    One question: if you change Y to [0, 1], what happens?
  • Itamar Katz about 13 years
    If Y is [0, 1], the update rule for w won't work: sign(w'*X(:,ii)) is compared against Y(ii), and for a sample with label 0 the correction X(:,ii)*Y(ii) is zero, so w never gets updated for that class (see the remapping sketch after these comments).
  • edgarmtze about 13 years
    So no input data will work with Y = [0, 1] with this perceptron code, right?
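
If your labels happen to be 0/1, a minimal workaround (Y01 below is hypothetical) is to remap them to -1/+1 before calling perceptron:

Y01 = [1 0 1 0];   % hypothetical 0/1 labels
Y   = 2*Y01 - 1;   % remap to -1/+1, which the update rule expects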