How can I hot one encode in Matlab?

Solution 1

For speed and memory savings, you can use bsxfun combined with eq to accomplish the same thing. While your eye solution may work, its memory usage grows quadratically with max(X), since eye(max(X)) allocates a max(X)-by-max(X) matrix.

Y = bsxfun(@eq, X(:), 1:max(X));

Or as an anonymous function if you prefer:

hotone = @(X)bsxfun(@eq, X(:), 1:max(X));

Or, if you're on Octave (or MATLAB R2016b or later), you can take advantage of automatic broadcasting and simply do the following, as suggested by @Tasos (X is assumed to be a column vector here).

Y = X == 1:max(X);
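
As a quick sanity check, here is a minimal sketch that applies the bsxfun form to the example label vector from the question. The variable name Yd is only illustrative; the conversion to double is optional and shown because the comparison returns a logical matrix.

```matlab
% Worked example using the label vector from the question.
X = [2; 1; 3; 3; 2];

% bsxfun form (works on older MATLAB releases and on Octave).
Y = bsxfun(@eq, X(:), 1:max(X));   % 5x3 logical matrix

% The comparison returns logicals; convert if a numeric matrix is needed.
Yd = double(Y);
disp(Yd)
%      0     1     0
%      1     0     0
%      0     0     1
%      0     0     1
%      0     1     0
```

Each row holds a single 1 in the column given by the corresponding label.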

Benchmark

Here is a quick benchmark showing the performance of the various answers as the number of elements in X and the number of unique values in X vary.

function benchit()

    nUnique = round(linspace(10, 1000, 10));
    nElements = round(linspace(10, 1000, 12));

    times1 = zeros(numel(nUnique), numel(nElements));
    times2 = zeros(numel(nUnique), numel(nElements));
    times3 = zeros(numel(nUnique), numel(nElements));
    times4 = zeros(numel(nUnique), numel(nElements));
    times5 = zeros(numel(nUnique), numel(nElements));

    for m = 1:numel(nUnique)
        for n = 1:numel(nElements)
            X = randi(nUnique(m), nElements(n), 1);
            times1(m,n) = timeit(@()bsxfunApproach(X));

            X = randi(nUnique(m), nElements(n), 1);
            times2(m,n) = timeit(@()eyeApproach(X));

            X = randi(nUnique(m), nElements(n), 1);
            times3(m,n) = timeit(@()sub2indApproach(X));

            X = randi(nUnique(m), nElements(n), 1);
            times4(m,n) = timeit(@()sparseApproach(X));

            X = randi(nUnique(m), nElements(n), 1);
            times5(m,n) = timeit(@()sparseFullApproach(X));
        end
    end

    colors = get(0, 'defaultaxescolororder');

    figure;

    surf(nElements, nUnique, times1 * 1000, 'FaceColor', colors(1,:), 'FaceAlpha', 0.5);
    hold on
    surf(nElements, nUnique, times2 * 1000, 'FaceColor', colors(2,:), 'FaceAlpha', 0.5);
    surf(nElements, nUnique, times3 * 1000, 'FaceColor', colors(3,:), 'FaceAlpha', 0.5);
    surf(nElements, nUnique, times4 * 1000, 'FaceColor', colors(4,:), 'FaceAlpha', 0.5);
    surf(nElements, nUnique, times5 * 1000, 'FaceColor', colors(5,:), 'FaceAlpha', 0.5);

    view([46.1000   34.8000])

    grid on
    xlabel('Elements')
    ylabel('Unique Values')
    zlabel('Execution Time (ms)')

    legend({'bsxfun', 'eye', 'sub2ind', 'sparse', 'full(sparse)'}, 'Location', 'Northwest')
end

function Y = bsxfunApproach(X)
    Y = bsxfun(@eq, X(:), 1:max(X));
end

function Y = eyeApproach(X)
    tmp = eye(max(X));
    Y = tmp(X, :);
end

function Y = sub2indApproach(X)
    LinearIndices = sub2ind([length(X),max(X)], [1:length(X)]', X);
    Y = zeros(length(X), max(X));
    Y(LinearIndices) = 1;
end

function Y = sparseApproach(X)
    Y = sparse(1:numel(X), X,1);
end

function Y = sparseFullApproach(X)
    Y = full(sparse(1:numel(X), X,1));
end

Results

If you need a non-sparse output, bsxfun performs best; if you can work with a sparse matrix (without converting it to a full matrix), that is the fastest and most memory-efficient option.

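[Figure: surface plot of execution time (ms) against number of elements and number of unique values for the bsxfun, eye, sub2ind, sparse, and full(sparse) approaches]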

Solution 2

You can use the identity matrix and index into it using the input/labels vector. For example, if the labels vector X is some random integer vector

X = randi(3,5,1)

X =

   2
   1
   2
   3
   3

then, the following will hot one encode X

eye(max(X))(X,:)

which can be conveniently defined as a function using

hotone = @(v) eye(max(v))(v,:)

EDIT:

Although the solution above works in Octave, you have to modify it for Matlab as follows, because Matlab does not allow indexing directly into the result of a function call:

I = eye(max(X));
I(X,:)
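
If a one-line anonymous function is still wanted in Matlab, one possible workaround (not part of the original answer) is to do the indexing through subsref instead of chained parentheses. This is only a sketch; the name hotone is reused from the Octave version above.

```matlab
% Anonymous-function variant that avoids chained indexing.
% subsref(A, substruct('()', {rows, ':'})) is equivalent to A(rows, :).
hotone = @(v) subsref(eye(max(v)), substruct('()', {v, ':'}));

Y = hotone([2; 1; 3]);
disp(Y)
%      0     1     0
%      1     0     0
%      0     0     1
```

The two-step form is easier to read; this variant is mainly useful when the encoder has to be passed around as a function handle.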

Solution 3

I think this is fast, especially as the matrix dimensions grow:

Y = sparse(1:numel(X), X,1);

or

Y = full(sparse(1:numel(X), X,1));
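
One thing to keep in mind with output that is sized by max(X): if the highest label happens not to occur in a given vector, the result has fewer columns than there are classes. A minimal sketch, assuming the number of classes is known in advance (nClasses is a hypothetical name, not from the answers), passes the dimensions to sparse explicitly:

```matlab
X = [2; 1; 2];        % label 3 does not occur in this vector
nClasses = 3;         % assumed known number of classes (hypothetical name)

% sparse(i, j, v, m, n) builds an m-by-n matrix with value v at (i, j).
Y = sparse(1:numel(X), X, 1, numel(X), nClasses);

disp(size(Y))         % 3 3, even though max(X) is 2
disp(full(Y))
%      0     1     0
%      1     0     0
%      0     1     0
```

The same idea applies to the other approaches by replacing max(X) with the known class count.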

Solution 4

Just posting the sub2ind solution too to satisfy your curiosity :)
But I like your solution better :p

>> X = [2,1,2,3,3]'
>> LinearIndices = sub2ind([length(X),3], [1:length(X)]', X);
>> tmp = zeros(length(X), 3); 
>> tmp(LinearIndices) = 1
tmp =

     0     1     0
     1     0     0
     0     1     0
     0     0     1
     0     0     1
Author by

osipov

Updated on June 04, 2022

Comments

  • osipov
    osipov almost 2 years

    Often you are given a vector of integer values representing your labels (aka classes), for example

    [2; 1; 3; 3; 2]
    

    and you would like to hot one encode this vector, such that each value is represented by a 1 in the column indicated by the value in each row of the labels vector, for example

    [0 1 0;
     1 0 0;
     0 0 1;
     0 0 1;
     0 1 0]
    
  • Tasos Papastylianou
    Tasos Papastylianou over 7 years
    Just pointing out chained operations don't work in matlab, so you'll have to split your one-liner. Nice solution though.
  • osipov
    osipov over 7 years
    @TasosPapastylianou thanks...I originally wrote it in Octave. Just edited the answer with unchained operations.
  • Tasos Papastylianou
    Tasos Papastylianou over 7 years
    why is that? he's just doing a simple indexing operation
  • Tasos Papastylianou
    Tasos Papastylianou over 7 years
    oh, ok, you're talking about the size of the eye matrix.
  • osipov
    osipov over 7 years
    @Suever memory usage is definitely important to keep in mind. In case of machine learning problems the cardinality of the labels set is usually manageable so this approach for "hot one encoding" is an easy way to get started. A more optimal solution would be to do some bit twiddling outside of Matlab/Octave.
  • osipov
    osipov over 7 years
    @suever i think you meant that the memory usage grows quadratically since eye(max(X)) takes max(X)^2 memory. Exponential memory usage would be c^max(X)
  • Suever
    Suever over 7 years
    @osipov Yes, you're right. Updated. I've also added a benchmark showing the relative performance of the techniques.
  • Suever
    Suever over 7 years
    @TasosPapastylianou yes, I was referring to eye being unnecessarily huge.
  • Suever
    Suever over 7 years
    @osipov Well in MATLAB it's best to design things that scale well from the beginning as things like this can get out of hand rather quickly (see the benchmark), there's really no sense in not using an optimal solution if you have one at your disposal. Also as the benchmark shows, it's both memory usage and execution time.
  • Tasos Papastylianou
    Tasos Papastylianou over 7 years
    hm, interesting. in Octave the bottom two are a lot closer than in matlab. I wonder why. Is octave more efficient with sub2ind, or less efficient with bsxfun? :p
  • Tasos Papastylianou
    Tasos Papastylianou over 7 years
    also, on Octave, you can broadcast directly (i.e. X == 1:max(X)) which seems even faster than bsxfun.
  • Tasos Papastylianou
    Tasos Papastylianou over 7 years
    isn't matlab supposed to support broadcasting sometime soon too btw? Was it in 2016b?
  • rahnema1
    rahnema1 over 7 years
    @Suever please add my answer to the benchmark
  • Suever
    Suever over 7 years
    @rahnema1 Your code does not run on the data.
  • rahnema1
    rahnema1 over 7 years
    @Suever answer edited!
  • Suever
    Suever over 7 years
    @rahnema1 Added to the benchmark