Matlab array of struct : Fast assignment


Solution 1

You can try the Matlab function deal, but I found that the input needs a little tweaking first (see the question "In Matlab, for a multiple input function, how to use a single input as multiple inputs?"). There may be something simpler.

n=100000;
edges(n)=struct('weight',1.0);
m=mat2cell(rand(n,1),ones(n,1),1);
[edges(:).weight]=deal(m{:});

I also found that this is not nearly as fast as the for loop on my computer (about 0.35 s for deal versus about 0.05 s for the loop), presumably because of the call to mat2cell. The gap narrows if you use this more than once, but the for loop stays ahead.
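A rough way to reproduce this comparison on your own machine is sketched below. The variable names (vals, edgesA, edgesB, tDeal, tLoop) are made up for illustration, and the timings depend heavily on the MATLAB version and hardware, so treat the output as indicative only.

```matlab
n = 100000;
vals = rand(n, 1);

% Approach A: deal, fed from a cell array built with mat2cell
edgesA = repmat(struct('weight', 1.0), n, 1);
tic;
m = mat2cell(vals, ones(n, 1), 1);
[edgesA(:).weight] = deal(m{:});
tDeal = toc;

% Approach B: plain for loop over the elements
edgesB = repmat(struct('weight', 1.0), n, 1);
tic;
for i = 1:n
    edgesB(i).weight = vals(i);
end
tLoop = toc;

fprintf('deal: %.3f s, loop: %.3f s\n', tDeal, tLoop);
```

Both approaches should leave the same values in the weight field; only the elapsed time differs.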

Solution 2

This is much faster than deal or a loop (at least on my system):

N=10000;
edge(N) = struct('weight',1.0); % initialize the array
values = rand(1,N);  % set the values as a vector

W = mat2cell(values, 1,ones(1,N)); % convert values to a cell
[edge(:).weight] = W{:};

Using curly braces on the right gives a comma-separated list of all the values in W (i.e. N outputs), and using square brackets on the left assigns those N outputs to the N values in edge(:).weight.
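To see the mechanism on a small scale, here is a three-element sketch. The names demo and W3 are hypothetical and chosen so they do not clash with the variables above.

```matlab
W3 = {10, 20, 30};                          % cell array holding three values
demo = repmat(struct('weight', 0), 1, 3);   % 1x3 struct array

% W3{:} expands to three separate values; [demo.weight] on the left
% expands to three separate assignment targets.
[demo.weight] = W3{:};

disp([demo.weight]);   % gathers the three weights back into one row vector
```

The number of elements in the cell array has to match the number of struct elements on the left, otherwise MATLAB raises an error.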

Solution 3

You could simply write:

edges = struct('weight', num2cell(rand(1000000,1)));
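This works because struct, when given a cell array as a field value, creates a struct array of the same size as the cell array, with one cell per element. A small sketch (the names vals5, edges5 and s are illustrative only):

```matlab
vals5 = rand(5, 1);
edges5 = struct('weight', num2cell(vals5));   % 5x1 struct array, one value each

disp(size(edges5));

% To store a cell array itself in a field of a single struct,
% wrap it in an extra pair of braces.
s = struct('tags', {{'a', 'b'}});             % 1x1 struct, tags is a 1x2 cell
```

The struct array takes its shape from the cell array, so num2cell of a column vector gives an n-by-1 struct array and num2cell of a row vector gives a 1-by-n one.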

Solution 4

Is there something requiring you to particularly use a struct in this way?

Consider replacing your array of structs with simply a separate array for each member of the struct.

weights = rand(1, 1000);

If you have a struct member which is an array, you can make an extra dimension:

matrices = rand(3, 3, 1000);

If you just want to keep things neat, you could put these arrays into a struct:

edges.weights = weights;
edges.matrices = matrices;
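With this layout, the data for edge k is spread over the k-th entry of each array. A short sketch of typical access patterns, using a hypothetical variable edgesSoA so it does not interfere with edges:

```matlab
edgesSoA.weights  = rand(1, 1000);
edgesSoA.matrices = rand(3, 3, 1000);

k  = 7;
wk = edgesSoA.weights(k);           % weight of edge k
Mk = edgesSoA.matrices(:, :, k);    % 3x3 matrix of edge k

% Whole-array operations need no loop and no cell conversion.
heavy = edgesSoA.weights > 0.5;               % logical mask over all edges
meanHeavy = mean(edgesSoA.weights(heavy));    % mean weight of the selected edges
```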

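But if you need to keep an array of structs, note that a plain [edges.weight] = rand(1, 1000) will not do it, because rand returns a single output rather than 1000 separate ones. Convert the values to a cell array first:

edges = struct('weight', num2cell(rand(1, 1000)));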

Solution 5

The reason that the structs in your example don't get initialized properly is that the syntax you're using only addresses the very last element in the struct array. For a nonexistent array, the rest of them get implicitly filled in with structs that have the default value [] in all their fields.

To make this behavior clear, try doing a short array with clear edges; edges(3) = struct('weight',1.0) and looking at each of edges(1), edges(2), and edges(3). The edges(3) element has 1.0 in its weight like you want; the others have [].
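A minimal sketch of the difference between indexing only the last element and indexing a whole range (lastOnly and allThree are made-up names):

```matlab
clear lastOnly allThree

lastOnly(3)   = struct('weight', 1.0);   % sets element 3; elements 1 and 2 get []
allThree(1:3) = struct('weight', 1.0);   % sets all three elements

disp(isempty(lastOnly(1).weight));   % true: the weight was filled in with []
disp([allThree.weight]);             % all three weights are 1
```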

The syntax for efficiently initializing an array of structs is one of these.

% Using repmat and full assignment
edges = repmat(struct('weight', 1.0), [1 1000]);

% Using indexing
% NOTE: Only correct if variable is uninitialized!!!
edges(1:1000) = struct('weight', 1.0);  % QUESTIONABLE

Note the 1:1000 instead of just 1000 when indexing into the uninitialized edges array.

There's a problem with the edges(1:1000) form: if edges is already initialized, this syntax will just update the values of selected elements. If edges has more than 1000 elements, the others will be left unchanged, and your code will be buggy. Or if edges is a different type, you could get an error or weird behavior depending on its existing datatype. To be safe, you need to do clear edges before initializing using the indexing syntax. So it's better to just do full assignment with the repmat form.
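The pitfall can be sketched as follows, assuming a leftover variable (here called edgesOld, a hypothetical name) that already holds 1500 elements:

```matlab
% A stale struct array left over from earlier code
edgesOld = repmat(struct('weight', -1), [1 1500]);

% Indexed "initialization" only overwrites elements 1..1000
edgesOld(1:1000) = struct('weight', 1.0);
disp(numel(edgesOld));            % still 1500 elements
disp(edgesOld(1500).weight);      % still the stale value -1

% Full assignment replaces whatever the variable held before
edgesNew = repmat(struct('weight', 1.0), [1 1000]);
disp(numel(edgesNew));
```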

BUT: Regardless of how you initialize it, an array-of-structs like this is always going to be inherently slow to work with for larger data sets. You can't do real "vectorized" operations on it because your primitive arrays are all broken up into separate mxArrays inside each struct element. That includes the field assignment in your question; it is not possible to truly vectorize that. Instead, you should switch to a struct-of-arrays like Brian L's answer suggests.
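As an illustration of the extra work an array-of-structs needs, the sketch below adds 50 to every weight in both layouts. The names aos and soa are hypothetical, and the example only shows the shape of the code, not measured timings.

```matlab
n = 1000;
aos = struct('weight', num2cell(rand(1, n)));   % array of structs
soa.weight = rand(1, n);                        % struct of arrays

% Array of structs: gather into a vector, compute, scatter back via a cell
w = [aos.weight] + 50;
c = num2cell(w);
[aos.weight] = c{:};

% Struct of arrays: one vectorized statement on a contiguous array
soa.weight = soa.weight + 50;
```

The gather and scatter steps each touch every struct element individually, which is where the overhead comes from.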

Author by

sumodds

Updated on June 05, 2022

Comments

  • sumodds
    sumodds almost 2 years

    Is there any way to "vector" assign an array of structs?

    Currently I can do

    edges(1000000) = struct('weight',1.0); % This does not really assign the value to every element; I checked on 2009A.
    for i=1:1000000; edges(i).weight=1.0; end;

    But that is slow. I want to do something more like

    edges(:).weight=[rand(1000000,1)]; % with or without the square brackets

    Any ideas or suggestions for vectorizing this assignment so that it will be faster?

    Thanks in advance.

  • sumodds
    sumodds over 12 years
    Both of them do the same thing. But I think I need an array of structs (meaning an array of objects) and not a struct of arrays (a single big struct holding large arrays). What is the difference between the two in MATLAB, if any? That is, how does memory allocation differ, and what are the implications?
  • sumodds
    sumodds over 12 years
    These are my times. On Octave: 0.17 s for 100K and 1.57 s for 1 million with this method, and it takes forever if I use a for loop (about 230 s for 100K). MATLAB 2009B (different machine/OS): 5 s/49 s using the above and 0.22 s/2.2 s using a for loop.
  • Andrew Janke
    Andrew Janke about 10 years
    The difference is that in Matlab, an array of structs ("struct-organized") is grossly inefficient because each struct stores each of its fields in a separate array, so you can't do vectorized operations on them. A struct of arrays ("planar-organized") like Brian's stores each of its fields in a primitive array that is contiguous in memory and that vectorized (fast) Matlab functions can work on. It is a much better structure for Matlab, and more idiomatic.
  • eacousineau
    eacousineau about 10 years
    Nice! Syntactically and pragmatically elegant! It'd be nice if Matlab syntax allowed expanding arrays into an argument sequence, something like '{values}{:}'. I tried making a function to take a cell value list, but apparently it does not like assigning to varargout in the exact same way that deal() does, haha.
  • eacousineau
    eacousineau about 10 years
    Whoops, found I was using mat2cell() instead of num2cell(). Here's the function: cellexpand().
  • eacousineau
    eacousineau about 10 years
    You can use anonymous handles as well: cellexpand = @(x) x{:}; numexpand = @(x) cellexpand(num2cell(x));. An example: [a, b] = numexpand([1, 2]);. More specific example: [edge.weight] = numexpand([edge.weight] + 50);