Matlab array of struct : Fast assignment
Solution 1
You can try using the Matlab function deal, but I found that it requires tweaking the input a little (see this question: In Matlab, for a multiple input function, how to use a single input as multiple inputs?). Maybe there is something simpler.
n=100000;
edges(n)=struct('weight',1.0);
m=mat2cell(rand(n,1),ones(n,1),1);
[edges(:).weight]=deal(m{:});
Also, I found that this is not nearly as fast as the for loop on my computer (~0.35s for deal versus ~0.05s for the loop), presumably because of the call to mat2cell. The difference in speed shrinks if you use this more than once, but it stays in favor of the for loop.
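To reproduce this comparison on your own machine, a rough timing sketch along the following lines may help. The variable names (a, b, c, vals, tLoop, tDeal) are only illustrative, num2cell stands in for the mat2cell call above, and the timings will vary with Matlab version and hardware:

```matlab
n = 100000;
vals = rand(n, 1);

clear a b
a(n) = struct('weight', 1.0);      % preallocate for the loop version
tic
for i = 1:n
    a(i).weight = vals(i);
end
tLoop = toc;

b(n) = struct('weight', 1.0);      % preallocate for the deal version
tic
c = num2cell(vals);                % the cell conversion is part of the cost
[b.weight] = deal(c{:});
tDeal = toc;

fprintf('loop: %.3f s, deal: %.3f s\n', tLoop, tDeal);
```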
Solution 2
This is much faster than deal or a loop (at least on my system):
N=10000;
edge(N) = struct('weight',1.0); % initialize the array
values = rand(1,N); % set the values as a vector
W = mat2cell(values, 1,ones(1,N)); % convert values to a cell
[edge(:).weight] = W{:};
Using curly braces on the right gives a comma-separated list of all the values in W (i.e. N outputs), and using square brackets on the left assigns those N outputs to the N fields in edge(:).weight.
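A small sketch of that mechanism, with N reduced to 4 so the result is easy to inspect. The names small, vals, and back are only illustrative, and num2cell is used here on the assumption that it yields the same 1-by-N cell of scalars as the mat2cell call above:

```matlab
N = 4;
clear small
small(N) = struct('weight', 1.0);   % preallocate; only small(N) is set to 1.0
vals = [10 20 30 40];
W = num2cell(vals);                 % 1-by-N cell, one scalar per cell
[small.weight] = W{:};              % W{:} expands to N values, one per element
back = [small.weight];              % gather the field back into a numeric row
```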
Solution 3
You could simply write:
edges = struct('weight', num2cell(rand(1000000,1)));
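This works because struct, when given a cell array as a field value, returns a struct array with the same dimensions as that cell array and puts one cell's content in each element. A shorter sketch that makes the shape visible (the names s, vals, sz, and first are illustrative):

```matlab
vals = rand(5, 1);
s = struct('weight', num2cell(vals));   % 5-by-1 struct array, one value each
sz = size(s);                           % same size as the cell array
first = s(1).weight;                    % equals vals(1)
```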
Solution 4
Is there something requiring you to particularly use a struct in this way?
Consider replacing your array of structs with simply a separate array for each member of the struct.
weights = rand(1, 1000);
If you have a struct member which is an array, you can make an extra dimension:
matrices = rand(3, 3, 1000);
If you just want to keep things neat, you could put these arrays into a struct:
edges.weights = weights;
edges.matrices = matrices;
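But if you need to keep an array of structs, note that a plain numeric array cannot be assigned across the elements directly. Convert it to a cell array and expand it into a comma-separated list (assuming edges is already a 1-by-1000 struct array with a weight field):
w = num2cell(rand(1, 1000));
[edges.weight] = w{:};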
Solution 5
The reason that the structs in your example don't get initialized properly is that the syntax you're using only addresses the very last element in the struct array. For a nonexistent array, the rest of them get implicitly filled in with structs that have the default value [] in all their fields.
To make this behavior clear, try doing a short array with clear edges; edges(3) = struct('weight',1.0) and looking at each of edges(1), edges(2), and edges(3). The edges(3) element has 1.0 in its weight like you want; the others have [].
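A minimal sketch contrasting the two indexing forms on a fresh variable: assigning to the last index alone leaves the earlier elements with empty fields, while assigning to a range fills every element. The names demo, filled, emptyMask, and filledWeights are illustrative only:

```matlab
clear demo
demo(3) = struct('weight', 1.0);     % only the last element is given a value
emptyMask = arrayfun(@(e) isempty(e.weight), demo);  % which weights are []

clear filled
filled(1:3) = struct('weight', 1.0); % scalar expansion fills every element
filledWeights = [filled.weight];     % all three weights are present
```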
The syntax for efficiently initializing an array of structs is one of these.
% Using repmat and full assignment
edges = repmat(struct('weight', 1.0), [1 1000]);
% Using indexing
% NOTE: Only correct if variable is uninitialized!!!
edges(1:1000) = struct('weight', 1.0); % QUESTIONABLE
Note the 1:1000 instead of just 1000 when indexing into the uninitialized edges array.
There's a problem with the edges(1:1000) form: if edges is already initialized, this syntax will just update the values of the selected elements. If edges has more than 1000 elements, the others will be left unchanged, and your code will be buggy. Or if edges is a different type, you could get an error or weird behavior depending on its existing datatype. To be safe, you need to do clear edges before initializing using the indexing syntax. So it's better to just do full assignment with the repmat form.
BUT: Regardless of how you initialize it, an array-of-structs like this is always going to be inherently slow to work with for larger data sets. You can't do real "vectorized" operations on it because your primitive arrays are all broken up into separate mxArrays inside each struct element. That includes the field assignment in your question; it is not possible to vectorize that. Instead, you should switch to a struct-of-arrays like Brian L's answer suggests.
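To illustrate the difference, here is a hedged sketch that adds 50 to every weight in both layouts. The names aos, soa, w, and tmp are illustrative. The array-of-structs version has to gather the field into a temporary array and scatter it back, while the struct-of-arrays version is a single vectorized operation:

```matlab
n = 1000;
w = rand(1, n);

% Array of structs: one small array per element
aos = struct('weight', num2cell(w));
tmp = num2cell([aos.weight] + 50);   % gather, compute, convert back to cells
[aos.weight] = tmp{:};               % scatter the results into the elements

% Struct of arrays: one contiguous array per field
clear soa
soa.weight = w;
soa.weight = soa.weight + 50;        % a single vectorized operation
```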
sumodds
Updated on June 05, 2022
Comments
-
sumodds almost 2 years
Is there any way to "vector" assign an array of structs?
Currently I can do
edges(1000000) = struct('weight',1.0); % This really does not assign the value, I checked on 2009A.
for i=1:1000000; edges(i).weight=1.0; end;
But that is slow. I want to do something more like
edges(:).weight=[rand(1000000,1)]; % with or without the square brackets.
Any ideas or suggestions to vectorize this assignment so that it will be faster?
Thanks in advance.
-
sumodds over 12 years
Both of them do the same. But I think I need it to be an array of structs (meaning an array of objects) and not a struct of arrays (a single big struct holding a large array). What is the difference between the two in MATLAB, if any, with respect to memory allocation? And if there is one, what are its implications?
-
sumodds over 12 years
These are my times. On Octave: .17s for 100K and 1.57s for 1mil with this method, and it takes forever if I use a for loop, like 230s for 100K. MATLAB 2009B (different machine/OS): 5s/49s using the above and .22s/2.2s using a for loop.
-
Andrew Janke about 10 years
The difference is that in Matlab, an array of structs ("struct-organized") is grossly inefficient because each struct stores each of its fields in a separate array, so you can't do vectorized operations on them. A struct of arrays ("planar-organized") like Brian's stores each of its fields in a primitive array that is contiguous in memory and that vectorized (fast) Matlab functions can operate on. It is a much better structure for Matlab, and more idiomatic.
-
eacousineau about 10 years
Nice! Syntactically and pragmatically elegant! It'd be nice if Matlab syntax allowed expanding arrays into an argument sequence, something like '{values}{:}'. I tried making a function to take a cell value list, but apparently it does not like assigning to varargout in the exact same way that deal() does haha.
-
eacousineau about 10 years
You can use anonymous handles as well:
cellexpand = @(x) x{:}; numexpand = @(x) cellexpand(num2cell(x));
An example:
[a, b] = numexpand([1, 2]);
More specific example:
[edge.weight] = numexpand([edge.weight] + 50);