How can I one-hot encode in MATLAB?

matlab neural-network octave deep-learning

Solution 1

For speed and memory savings, you can use bsxfun combined with eq to accomplish the same thing. Your eye solution works, but it first builds a max(X)-by-max(X) identity matrix, so its memory usage grows quadratically with the number of unique values in X.

Y = bsxfun(@eq, X(:), 1:max(X));

Or as an anonymous function if you prefer:

hotone = @(X)bsxfun(@eq, X(:), 1:max(X));

Or if you're on Octave (or MATLAB R2016b or later), you can take advantage of automatic broadcasting and simply do the following, as suggested by @Tasos:

Y = X(:) == 1:max(X);
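
As a quick sanity check, here is a minimal sketch that runs both forms on the label vector from the question and confirms they agree. It assumes the labels are positive integers starting at 1. Note that the comparison returns a logical matrix, so a conversion is needed if a numeric type is expected downstream.

```matlab
% Labels from the question; assumed to be positive integers.
X = [2; 1; 3; 3; 2];

% Explicit singleton expansion (also works on older MATLAB releases).
Y1 = bsxfun(@eq, X(:), 1:max(X));

% Implicit expansion (Octave, or MATLAB R2016b and later).
Y2 = X(:) == 1:max(X);

% Both results are logical; convert if a numeric matrix is needed.
Yd = double(Y1);

disp(isequal(Y1, Y2))   % prints 1
disp(class(Y1))         % prints logical
```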

Benchmark

Here is a quick benchmark showing the performance of the various answers for a varying number of elements in X and a varying number of unique values in X.

function benchit()

    nUnique = round(linspace(10, 1000, 10));
    nElements = round(linspace(10, 1000, 12));

    times1 = zeros(numel(nUnique), numel(nElements));
    times2 = zeros(numel(nUnique), numel(nElements));
    times3 = zeros(numel(nUnique), numel(nElements));
    times4 = zeros(numel(nUnique), numel(nElements));
    times5 = zeros(numel(nUnique), numel(nElements));

    for m = 1:numel(nUnique)
        for n = 1:numel(nElements)
            X = randi(nUnique(m), nElements(n), 1);
            times1(m,n) = timeit(@()bsxfunApproach(X));

            X = randi(nUnique(m), nElements(n), 1);
            times2(m,n) = timeit(@()eyeApproach(X));

            X = randi(nUnique(m), nElements(n), 1);
            times3(m,n) = timeit(@()sub2indApproach(X));

            X = randi(nUnique(m), nElements(n), 1);
            times4(m,n) = timeit(@()sparseApproach(X));

            X = randi(nUnique(m), nElements(n), 1);
            times5(m,n) = timeit(@()sparseFullApproach(X));
        end
    end

    colors = get(0, 'defaultaxescolororder');

    figure;

    surf(nElements, nUnique, times1 * 1000, 'FaceColor', colors(1,:), 'FaceAlpha', 0.5);
    hold on
    surf(nElements, nUnique, times2 * 1000, 'FaceColor', colors(2,:), 'FaceAlpha', 0.5);
    surf(nElements, nUnique, times3 * 1000, 'FaceColor', colors(3,:), 'FaceAlpha', 0.5);
    surf(nElements, nUnique, times4 * 1000, 'FaceColor', colors(4,:), 'FaceAlpha', 0.5);
    surf(nElements, nUnique, times5 * 1000, 'FaceColor', colors(5,:), 'FaceAlpha', 0.5);

    view([46.1000   34.8000])

    grid on
    xlabel('Elements')
    ylabel('Unique Values')
    zlabel('Execution Time (ms)')

    legend({'bsxfun', 'eye', 'sub2ind', 'sparse', 'full(sparse)'}, 'Location', 'Northwest')
end

function Y = bsxfunApproach(X)
    Y = bsxfun(@eq, X(:), 1:max(X));
end

function Y = eyeApproach(X)
    tmp = eye(max(X));
    Y = tmp(X, :);
end

function Y = sub2indApproach(X)
    LinearIndices = sub2ind([length(X),max(X)], [1:length(X)]', X);
    Y = zeros(length(X), max(X));
    Y(LinearIndices) = 1;
end

function Y = sparseApproach(X)
    Y = sparse(1:numel(X), X,1);
end

function Y = sparseFullApproach(X)
    Y = full(sparse(1:numel(X), X,1));
end

Results

If you need a non-sparse output, bsxfun performs best. If you can work with a sparse matrix (without converting it to a full matrix), the sparse approach is the fastest and most memory-efficient option.
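
To see where the memory savings come from, here is a rough sketch comparing what the sparse and full versions have to store. The sizes (500 labels drawn from 1..1000) are arbitrary values picked for illustration, not the ones behind the benchmark.

```matlab
% 500 labels drawn from 1..1000 (arbitrary sizes, for illustration only).
X = randi(1000, 500, 1);

Ys = sparse(1:numel(X), X, 1);   % stores only the nonzero entries
Yf = full(Ys);                   % dense copy with numel(X)*max(X) entries

fprintf('nonzeros stored in sparse: %d\n', nnz(Ys));
fprintf('elements stored in full:   %d\n', numel(Yf));
```

The sparse matrix holds one nonzero per label, while the full matrix holds one element per (label, class) pair.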

Solution 2

You can use the identity matrix and index into it using the labels vector. For example, if the labels vector X is some random integer vector

X = randi(3,5,1)

X =

   2
   1
   2
   3
   3

then the following will one-hot encode X:

eye(max(X))(X,:)

which can be conveniently defined as a function using

hotone = @(v) eye(max(v))(v,:)

EDIT:

Although the solution above works in Octave, you have to modify it for MATLAB as follows:

I = eye(max(X));
I(X,:)
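
If you still want a one-line anonymous function in MATLAB, one possible workaround is to call subsref directly, since it is the functional form of the I(X,:) indexing. This is only a sketch of that idea; it should behave the same in Octave.

```matlab
% subsref(A, S) with S.type = '()' is the functional form of A(subs{:}),
% so the indexing can live inside an anonymous function.
hotone = @(v) subsref(eye(max(v)), struct('type', '()', 'subs', {{v, ':'}}));

X = [2; 1; 2; 3; 3];
Y = hotone(X);
disp(Y)
```

It is harder to read than the two-line version, so the split form above is probably the better default.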

Solution 3

I think this is fast, especially as the matrix dimensions grow:

Y = sparse(1:numel(X), X, 1);          % sparse result

Y = full(sparse(1:numel(X), X, 1));    % or, if you need a full matrix
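
One thing to watch for: with the three-argument form, the number of columns is taken from the largest label present in X. If the highest class happens to be missing from a given vector, you can pass the size explicitly with the five-argument form of sparse. In the sketch below, numClasses is a hypothetical variable holding the total number of classes, which is assumed to be known in advance.

```matlab
X = [2; 1; 2];       % label 3 happens not to occur in this vector
numClasses = 3;      % hypothetical: total number of classes, known in advance

% sparse(i, j, v, m, n) fixes the output size to m-by-n.
Y = sparse(1:numel(X), X, 1, numel(X), numClasses);

disp(size(Y))        % 3-by-3, even though max(X) is 2
```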

Solution 4

Just posting the sub2ind solution too to satisfy your curiosity :)
But I like your solution better :p

>> X = [2,1,2,3,3]'
>> LinearIndices = sub2ind([length(X),3], [1:length(X)]', X);
>> tmp = zeros(length(X), 3); 
>> tmp(LinearIndices) = 1
tmp =

     0     1     0
     1     0     0
     0     1     0
     0     0     1
     0     0     1
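
For anyone wondering what sub2ind computes here: MATLAB and Octave store matrices in column-major order, so the linear index of element (row, col) in a matrix with n rows is row + (col - 1) * n. A small sketch checking that against sub2ind for the same X:

```matlab
X = [2; 1; 2; 3; 3];
n = numel(X);
rows = (1:n)';

viaSub2ind = sub2ind([n, max(X)], rows, X);
byHand = rows + (X - 1) * n;   % column-major: index = row + (col - 1) * nRows

disp(isequal(viaSub2ind, byHand))   % prints 1
```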

Author by

osipov

Updated on June 04, 2022

Comments

osipov almost 2 years
Often you are given a vector of integer values representing your labels (aka classes), for example
```
[2; 1; 3; 3; 2]
```
and you would like to one-hot encode this vector, so that each row contains a single 1 in the column given by the corresponding value of the labels vector, for example
```
[0 1 0;
 1 0 0;
 0 0 1;
 0 0 1;
 0 1 0]
```
Tasos Papastylianou over 7 years

Just pointing out chained operations don't work in matlab, so you'll have to split your one-liner. Nice solution though.
osipov over 7 years

@TasosPapastylianou thanks...I originally wrote it in Octave. Just edited the answer with unchained operations.
Tasos Papastylianou over 7 years

why is that? he's just doing a simple indexing operation
Tasos Papastylianou over 7 years

oh, ok, you're talking about the size of the eye matrix.
osipov over 7 years

@Suever memory usage is definitely important to keep in mind. In case of machine learning problems the cardinality of the labels set is usually manageable so this approach for "hot one encoding" is an easy way to get started. A more optimal solution would be to do some bit twiddling outside of Matlab/Octave.
osipov over 7 years

@suever i think you meant that the memory usage grows quadratically since eye(max(X)) takes max(X)^2 memory. Exponential memory usage would be c^max(X)
Suever over 7 years

@osipov Yes, you're right. Updated. I've also added a benchmark showing the relative performance of the techniques.
Suever over 7 years

@TasosPapastylianou yes, I was referring to eye being unnecessarily huge.
Suever over 7 years

@osipov Well in MATLAB it's best to design things that scale well from the beginning as things like this can get out of hand rather quickly (see the benchmark), there's really no sense in not using an optimal solution if you have one at your disposal. Also as the benchmark shows, it's both memory usage and execution time.
Tasos Papastylianou over 7 years

hm, interesting. in Octave the bottom two are a lot closer than in matlab. I wonder why. Is octave more efficient with sub2ind, or less efficient with bsxfun? :p
Tasos Papastylianou over 7 years

also, on Octave, you can broadcast directly (i.e. X == 1:max(X)) which seems even faster than bsxfun.
Tasos Papastylianou over 7 years

isn't matlab supposed to support broadcasting sometime soon too btw? Was it in 2016b?
rahnema1 over 7 years

@Suever please add my answer to the benchmark
Suever over 7 years

@rahnema1 Your code does not run on the data.
rahnema1 over 7 years

@Suever answer edited!
Suever over 7 years

@rahnema1 Added to the benchmark