Implementing and plotting a perceptron in MATLAB
Solution 1
You should first understand the meaning of each of the inputs:
X is the input matrix of examples, of size M x N, where M is the dimension of the feature vector and N is the number of samples. Since the perceptron model for prediction is Y=w'*X+b, you have to supply one extra dimension in X which is constant, usually set to 1, so that the b term is "built-in" into X. In the example below for X, I set the last entry of X to be 1 in all samples.
Y is the correct classification for each sample from X (the classification you want the perceptron to learn), so it should be an N-dimensional row vector, one output for each input example. Since the perceptron is a binary classifier, it should have only 2 distinct possible values. Looking at the code, you see that it checks the sign of the prediction, which tells you that the allowed values of Y should be -1,+1 (and not 0,1 for example).
w is the weight vector you are trying to learn.
So, try to call the function with:
X=[0 0; 0 1; 1 1];
Y=[1 -1];
w=[.5; .5; .5];
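The bias trick and the shape requirements described above can be sketched as follows. The names Xraw and Xaug and the sample values are made up for illustration; only the shapes matter.

```matlab
% Hypothetical raw data: 2 features (rows) x 4 samples (columns).
Xraw = [0 0 1 1;
        0 1 0 1];

% Append a constant row of ones so the bias is learned as the last weight.
Xaug = [Xraw; ones(1, size(Xraw, 2))];   % now 3 x 4

% One label per column, restricted to -1 / +1.
Y = [-1 -1 -1 1];

% The weight vector is a column with one entry per row of Xaug.
w = [0.5; 0.5; 0.5];

% Shape checks that mirror what perceptron(X, Y, w) relies on.
assert(size(w, 1) == size(Xaug, 1));            % w' * Xaug is defined
assert(isequal(size(Y), [1, size(Xaug, 2)]));   % one label per sample
assert(all(ismember(Y, [-1 1])));               % labels comparable to sign()

% Raw outputs, one per sample; their sign is the predicted class.
scores = w' * Xaug;
```

If any of the three assertions fails, the call to perceptron will either raise a dimension error or silently compare against labels that sign() can never produce.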
EDIT
Use the following code to call the perceptron algorithm and see the results graphically:
% input samples
X1=[rand(1,100);rand(1,100);ones(1,100)]; % class '-1'
X2=[rand(1,100);1+rand(1,100);ones(1,100)]; % class '+1'
X=[X1,X2];
% output class [-1,+1];
Y=[-ones(1,100),ones(1,100)];
% init weight vector
w=[.5 .5 .5]';
% call perceptron
wtag=perceptron(X,Y,w);
% predict
ytag=wtag'*X;
% plot prediction over original data
figure;hold on
plot(X1(1,:),X1(2,:),'b.')
plot(X2(1,:),X2(2,:),'r.')
plot(X(1,ytag<0),X(2,ytag<0),'bo')
plot(X(1,ytag>0),X(2,ytag>0),'ro')
legend('class -1','class +1','pred -1','pred +1')
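Besides the plot, the quality of the learned weights can be checked numerically. The following is a minimal sketch on a tiny hand-made data set; the weight vector here is chosen by hand for illustration and is not the output of an actual training run.

```matlab
% Toy data in the same layout as above: rows are features plus a constant 1.
X = [0 0 1 1;
     0 2 0 2;
     1 1 1 1];
Y = [-1 1 -1 1];

% Hypothetical learned weights: the boundary is the line x2 = 1.
wtag = [0; 1; -1];

% Predicted class is the sign of the raw output.
pred = sign(wtag' * X);

% Fraction of misclassified samples (0 means every sample is correct).
errRate = mean(pred ~= Y);
fprintf('misclassification rate: %.2f\n', errRate);
```

The same two lines (pred and errRate) should work unchanged with the X, Y and wtag from the example above.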
Solution 2
If you are interested, here is a little perceptron demo written in quite a tutorial manner:
function perceptronDemo
%PERCEPTRONDEMO
%
% A simple demonstration of the perceptron algorithm for training
% a linear classifier, made as readable as possible for tutorial
% purposes. It is derived from the treatment of linear learning
% machines presented in Chapter 2 of "An Introduction to Support
% Vector Machines" by Nello Cristianini and John Shawe-Taylor.
%
%
Data = createTrainingData;
Model = trainPerceptron( Data );
end
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
function Model = trainPerceptron( Data )
%TRAINPERCEPTRON
DOWN = 1;
ACROSS = 2;
assert( isequal( unique( Data.labels ), [-1; +1] ), ...
'Labels must be -1 or +1' );
% ---------------------------------------------------------------------
% Normalise the data by calculating z-scores
%
% This makes plotting easier, but is not needed by the algorithm.
%
sampleMean = mean( Data.samples );
sampleStdDev = std( Data.samples );
Data.samples = bsxfun( @minus, Data.samples, sampleMean );
Data.samples = bsxfun( @rdivide, Data.samples, sampleStdDev );
% ---------------------------------------------------------------------
% Calculate the squared radius of the smallest ball that encloses the
% data and is centred on the origin. This is used to provide an
% appropriate range and step size when updating the threshold (bias)
% parameter.
%
sampleSize = size( Data.samples, DOWN );
maxNorm = realmin;
for iObservation = 1:sampleSize
observationNorm = norm( Data.samples(iObservation,:) );
if observationNorm > maxNorm
maxNorm = observationNorm;
end
end
enclosingBallRadius = maxNorm;
enclosingBallRadiusSquared = enclosingBallRadius .^ 2;
% ---------------------------------------------------------------------
% Define the starting weight vector and bias. These should be zeros,
% as the algorithm omits a learning rate, and it is suggested in
% Cristianini & Shawe-Taylor that learning rate may only be omitted
% safely when the starting weight vector and bias are zero.
%
Model.weights = [0.0 0.0];
Model.bias = 0.0;
% ---------------------------------------------------------------------
% Run the perceptron training algorithm
%
% To prevent program running forever when nonseparable data are
% provided, limit the number of steps in the outer loop.
%
maxNumSteps = 1000;
for iStep = 1:maxNumSteps
    isAnyObsMisclassified = false;
    for iObservation = 1:sampleSize
        inputObservation = Data.samples( iObservation, : );
        desiredLabel = Data.labels( iObservation ); % +1 or -1
        perceptronOutput = sum( Model.weights .* inputObservation, ACROSS ) + Model.bias;
        margin = desiredLabel * perceptronOutput;
        isCorrectLabel = margin > 0;
        % -------------------------------------------------------------
        % If the model misclassifies the observation, update the
        % weights and the bias.
        %
        if ~isCorrectLabel
            isAnyObsMisclassified = true;
            weightCorrection = desiredLabel * inputObservation;
            Model.weights = Model.weights + weightCorrection;
            biasCorrection = desiredLabel .* enclosingBallRadiusSquared;
            Model.bias = Model.bias + biasCorrection;
            displayPerceptronState( Data, Model );
        end % if this observation misclassified.
    end % loop over observations
    if ~isAnyObsMisclassified
        disp( 'Done!' );
        break;
    end
end % outer loop
end % TRAINPERCEPTRON
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
function Data = createTrainingData
%CREATETRAININGDATA
%
% Return a structure containing training data suitable for linear
% classification.
%
sampleAsize = 1024;
sampleBsize = 1024;
sampleAmean = [ 5.5 5.0 ];
sampleAstdDev = [ 0.5 1.0 ];
sampleBmean = [ 2.5 3.0 ];
sampleBstdDev = [ 0.3 0.7 ];
Data.samples = [ normallyDistributedSample( sampleAsize, sampleAmean, sampleAstdDev ); ...
normallyDistributedSample( sampleBsize, sampleBmean, sampleBstdDev ) ];
Data.labels = [ ones(sampleAsize,1); ...
-ones(sampleBsize,1) ];
% ---------------------------------------------------------------------
% Randomly permute samples & class labels.
%
% This is not really necessary, but done to illustrate that the order
% in which observations are evaluated does not matter.
%
randomOrder = randperm( sampleAsize + sampleBsize );
Data.samples = Data.samples( randomOrder, : );
Data.labels = Data.labels( randomOrder, : );
end % CREATETRAININGDATA
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
function samples = normallyDistributedSample( sampleSize, sampleMean, sampleStdDev )
%NORMALLYDISTRIBUTEDSAMPLE
%
% Draw a sample from a normal distribution with specified mean and
% standard deviation.
%
assert( isequal( size( sampleMean ), size( sampleStdDev ) ) ...
&& 1 == size( sampleMean, 1 ), ...
'Sample mean and standard deviation must be row vectors of equal length.' );
numFeatures = numel( sampleMean );
samples = randn( sampleSize, numFeatures );
samples = bsxfun( @times, samples, sampleStdDev );
samples = bsxfun( @plus, samples, sampleMean );
end % NORMALLYDISTRIBUTEDSAMPLE
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
function displayPerceptronState( Data, Model )
%DISPLAYPERCEPTRONSTATE
hFig = figure( 1 );
clf;
set( hFig, ...
'NumberTitle', 'off', ...
'Name', mfilename, ...
'MenuBar', 'none', ...
'Color', [1.0 1.0 1.0] );
displayXmin = -4;
displayXmax = 4;
displayYmin = -4;
displayYmax = 4;
hAx = subplot( 1, 1, 1 );
axis('equal');
set( hAx, ...
'Box', 'on', ...
'NextPlot', 'add', ...
'xgrid', 'on', ...
'ygrid', 'on', ...
'xlim', [displayXmin displayXmax], ... % Bounds suitable for Z-scored data
'ylim', [displayYmin displayYmax] );
xlabel( 'x_1' );
ylabel( 'x_2' );
% ---------------------------------------------------------------------
% Plot data points from the two classes
%
isPositiveClass = Data.labels > 0;
isNegativeClass = Data.labels <= 0;
plot( hAx, Data.samples(isPositiveClass,1), Data.samples(isPositiveClass,2), 'b+' );
plot( hAx, Data.samples(isNegativeClass,1), Data.samples(isNegativeClass,2), 'rx' );
% ---------------------------------------------------------------------
% Display parameters for separating hyperplane in title
%
xWeight = Model.weights(1);
yWeight = Model.weights(2);
bias = Model.bias;
szTitle = sprintf( 'Linear classifier parameters: %0.2f x_1 + %0.2f x_2 + %0.2f = 0', xWeight, yWeight, bias );
title( szTitle );
% ---------------------------------------------------------------------
% Plot separating hyperplane
%
y1 = ( (xWeight*displayXmin) + bias ) ./ -yWeight;
y2 = ( (xWeight*displayXmax) + bias ) ./ -yWeight;
plot( hAx, [displayXmin; displayXmax], [y1, y2], 'k-', 'linewidth', 2 );
pause(0.1);
end % DISPLAYPERCEPTRONSTATE
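As a side note, the loop in trainPerceptron that finds the enclosing-ball radius could probably be written without a loop. This is only a sketch with made-up sample values, not a change to the demo itself:

```matlab
% Hypothetical z-scored samples: one observation per row.
samples = [ 3  4;
           -1  0;
            0 -2];

% Row-wise Euclidean norms without an explicit loop.
rowNorms = sqrt(sum(samples .^ 2, 2));

% Radius of the smallest origin-centred ball enclosing all rows.
enclosingBallRadius = max(rowNorms);
enclosingBallRadiusSquared = enclosingBallRadius ^ 2;
```

The loop version in the demo is arguably easier to follow for tutorial purposes, which is presumably why it was written that way.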
Solution 3
try this:
perceptron([1 2 1 2], [1 0 1 0], 0.5);
edgarmtze
Updated on July 28, 2022
Comments
-
edgarmtze over 1 year
I'm reviewing the Toronto perceptron MATLAB code.
The code is:
function [w] = perceptron(X,Y,w_init)
w = w_init;
for iteration = 1 : 100  %<- in practice, use some stopping criterion!
  for ii = 1 : size(X,2)         %cycle through training set
    if sign(w'*X(:,ii)) ~= Y(ii) %wrong decision?
      w = w + X(:,ii) * Y(ii);   %then add (or subtract) this point to w
    end
  end
  sum(sign(w'*X)~=Y)/size(X,2)   %show misclassification rate
end
I was reading about how to apply this function to a data matrix X and a target Y, but I do not know how to use it. I understand that it returns a vector of weights, so it can classify.
Could you please give an example and explain it?
I've tried:
X=[0 0; 0 1; 1 1]
Y=[1 0; 2 1]
w=[1 1 1]
Result = perceptron( X, Y, w )
??? Error using ==> mtimes
Inner matrix dimensions must agree.
Error in ==> perceptron at 15
if sign(w'*X(:,ii)) ~= Y(ii)
Result = perceptron( X, Y, w' )
??? Error using ==> ne
Matrix dimensions must agree.
Error in ==> perceptron at 19
sum(sign(w'*X)~=Y) / size(X,2);
Thanks
Thank you for the answers. I have one more question: if I change Y to [0, 1], what happens to the algorithm?
So, no input data will work with Y = [0,1] with this perceptron code, right?
-----------------------------EDIT------------------------
One more question: if I want to plot the line that divides the 2 classes, I know the line can be obtained from the weights by solving a linear equation, but how would I do that? I'm trying something like:
% the initial weights
w_init = [ 1 1 1]';
% the weights returned from perceptron
wtag = perceptron(X,Y,w_init,15);
% concatenate both
Line = [wtag,w_init]
% solve the linear system, am I correct doing this?
rref(Line')
% plot???
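For the plotting question, no linear system should be needed. With samples of the form [x1; x2; 1], the boundary is where w(1)*x1 + w(2)*x2 + w(3) = 0, which can be solved for x2 directly (the same idea is used in displayPerceptronState in Solution 2). A minimal sketch, assuming hypothetical weights and a non-zero w(2):

```matlab
% Hypothetical weights as returned by perceptron(); last entry is the bias.
wtag = [1; 2; -3];

% Two x1 values spanning the plotted range are enough to draw a line.
x1 = [0 1];

% Solve w(1)*x1 + w(2)*x2 + w(3) = 0 for x2 (requires wtag(2) ~= 0).
x2 = -(wtag(1) * x1 + wtag(3)) / wtag(2);

% Draw the decision boundary; call this after plotting the data points.
figure; hold on
plot(x1, x2, 'k-', 'LineWidth', 2);
xlabel('x_1'); ylabel('x_2');
```

If wtag(2) happens to be zero the boundary is a vertical line at x1 = -wtag(3)/wtag(1), and would have to be drawn separately.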
-
Itamar Katz about 13 years
Your example won't work, since the algorithm assumes output values of [-1,+1], not [0,1]. The w vector won't be updated at all.
-
Itamar Katz about 13 years
Also, the input should be at least of dimension 2, otherwise you are explicitly assuming that b=0 in y=a*x+b
-
edgarmtze about 13 years
Thank you very much, I really understand your example, but I have one more question: what would you do if class 1 has more examples than class 0? In the example you provided there is the same number of examples for both classes, X1 and X2.
-
edgarmtze about 13 years
Is this correct? I cannot test it right now:
X1=[rand(1,100);rand(1,100);ones(1,100)]; % class '-1'
X2=[rand(1,300);1+rand(1,300);ones(1,300)]; % class '+1'
X=[X1,X2];
% output class [-1,+1];
Y=[-ones(1,100),ones(1,300)];
% init weight vector
w=[.5 .5 .5]';
wtag=perceptron(X,Y,w);
-
Itamar Katz about 13 years
The number of examples is irrelevant, I just chose an equal number for each class because of convenience. Nothing has to change. The size of w should be the dimension of the samples (including the constant term), since any prediction is based on the value of the dot product w*x, so w and x should have the same size.
-
Itamar Katz about 13 years
Yes, this seems right. In addition, it's only a synthetic example. In a real-world example your data is mixed, and you are not concerned with the size of each class - only ensure you have enough examples from each class to cover the sample space.
-
Predictor about 13 years
You are correct: the balance of the classes should not matter. The perceptron has issues with linear separability and stability. The prior probabilities of the classes matter in the calculation of other linear tools, like linear discriminant analysis.
-
edgarmtze about 13 years
One question: if you change Y to [0, 1], what could happen?
-
Itamar Katz about 13 years
If Y is [0,1] then sign(Y) is always the same (at least in Matlab), and the update rule for w won't work, since sign(Y) determines the class for each sample.
-
edgarmtze about 13 years
So, no input data will work with Y = [0,1] with this perceptron code, right?
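If the labels are stored as 0/1, one way around this is to recode them before calling the function rather than changing the algorithm. A minimal sketch, where Y01 is a made-up name for the original 0/1 labels:

```matlab
% Labels coded as 0/1, as in the question.
Y01 = [1 0 1 0];

% Map 0 -> -1 and 1 -> +1 so the labels can be compared with sign().
Y = 2 * Y01 - 1;
```

The recoded Y can then be passed to perceptron(X, Y, w) as in the answers above.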