Is MATLAB OOP slow or am I doing something wrong?


Solution 1

I've been working with OO MATLAB for a while, and ended up looking at similar performance issues.

The short answer is: yes, MATLAB's OOP is kind of slow. There is substantial method-call overhead, higher than in mainstream OO languages, and there's not much you can do about it. Part of the reason may be that idiomatic MATLAB uses "vectorized" code to reduce the number of method calls, so per-call overhead has not been a high optimization priority.

I benchmarked the performance by writing do-nothing "nop" functions as the various types of functions and methods. Here are some typical results.

>> call_nops
Computer: PCWIN   Release: 2009b
Calling each function/method 100000 times
nop() function:                 0.02261 sec   0.23 usec per call
nop1-5() functions:             0.02182 sec   0.22 usec per call
nop() subfunction:              0.02244 sec   0.22 usec per call
@()[] anonymous function:       0.08461 sec   0.85 usec per call
nop(obj) method:                0.24664 sec   2.47 usec per call
nop1-5(obj) methods:            0.23469 sec   2.35 usec per call
nop() private function:         0.02197 sec   0.22 usec per call
classdef nop(obj):              0.90547 sec   9.05 usec per call
classdef obj.nop():             1.75522 sec  17.55 usec per call
classdef private_nop(obj):      0.84738 sec   8.47 usec per call
classdef nop(obj) (m-file):     0.90560 sec   9.06 usec per call
classdef class.staticnop():     1.16361 sec  11.64 usec per call
Java nop():                     2.43035 sec  24.30 usec per call
Java static_nop():              0.87682 sec   8.77 usec per call
Java nop() from Java:           0.00014 sec   0.00 usec per call
MEX mexnop():                   0.11409 sec   1.14 usec per call
C nop():                        0.00001 sec   0.00 usec per call

Similar results on R2008a through R2009b. This is on Windows XP x64 running 32-bit MATLAB.
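
For context, each line above comes from timing a tight M-code loop around the call, roughly like this (a simplified sketch; `nop` and `obj` stand for the do-nothing functions and objects described above, and the real harness is linked at the end of this answer):

```matlab
% Simplified sketch of the timing harness (illustrative, not the real code).
% nop / obj are the do-nothing function and object under test.
nIters = 100000;

for i = 1:nIters, nop(obj); end   % warm-up pass so JIT effects settle

t0 = tic;
for i = 1:nIters
    nop(obj);                     % the call being measured
end
elapsed = toc(t0);
fprintf('nop(obj) method: %10.5f sec  %6.2f usec per call\n', ...
    elapsed, elapsed / nIters * 1e6);
```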

The "Java nop()" is a do-nothing Java method called from within an M-code loop, and includes the MATLAB-to-Java dispatch overhead with each call. "Java nop() from Java" is the same thing called in a Java for() loop and doesn't incur that boundary penalty. Take the Java and C timings with a grain of salt; a clever compiler could optimize the calls away completely.

The package scoping mechanism is new, introduced at about the same time as the classdef classes. Its behavior may be related.

A few tentative conclusions:

  • Methods are slower than functions.
  • New style (classdef) methods are slower than old style methods.
  • The new obj.nop() syntax is slower than the nop(obj) syntax, even for the same method on a classdef object. Same for Java objects (not shown). If you want to go fast, call nop(obj).
  • Method call overhead is higher (about 2x) in 64-bit MATLAB on Windows. (Not shown.)
  • MATLAB method dispatch is slower than some other languages.
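
To make the third point concrete, these two calls invoke the same method but dispatch differently (sketch; `obj` is any classdef object with a `nop` method):

```matlab
obj.nop();   % dot syntax: noticeably slower in these releases
nop(obj);    % function-call syntax: same method, faster dispatch
```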

Saying why this is so would just be speculation on my part; the MATLAB engine's OO internals aren't public. It's not an interpreted-vs-compiled issue per se (MATLAB has a JIT), but MATLAB's looser typing and syntax may mean more work at run time. (E.g. you can't tell from syntax alone whether "f(x)" is a function call or an index into an array; it depends on the state of the workspace at run time.) It may also be because MATLAB's class definitions are tied to filesystem state in a way that many other languages' are not.

So, what to do?

An idiomatic MATLAB approach is to "vectorize" your code by structuring your class definitions so that an object instance wraps an array; that is, each of its fields holds parallel arrays (called "planar" organization in the MATLAB documentation). Rather than having an array of objects, each with fields holding scalar values, define objects which are themselves arrays, have the methods take arrays as inputs, and make vectorized calls on the fields and inputs. This reduces the number of method calls made, hopefully enough that dispatch overhead is no longer a bottleneck.

Mimicking a C++ or Java class in MATLAB probably won't be optimal. Java/C++ classes are typically built such that objects are the smallest building blocks, as specific as you can (that is, lots of different classes), and you compose them in arrays, collection objects, etc, and iterate over them with loops. To make fast MATLAB classes, turn that approach inside out. Have larger classes whose fields are arrays, and call vectorized methods on those arrays.
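
As an illustrative sketch of the planar style (this class is my example, not from the original benchmarks):

```matlab
classdef PointPlanar
    % Planar organization: one object holds arrays of coordinates,
    % instead of an array of scalar Point objects.
    properties
        X   % x coordinates, 1-by-N
        Y   % y coordinates, 1-by-N
    end
    methods
        function obj = PointPlanar(x, y)
            obj.X = x;
            obj.Y = y;
        end
        function d = distTo(obj, other)
            % One vectorized call computes all N distances at once,
            % replacing N per-object method calls.
            d = sqrt((obj.X - other.X).^2 + (obj.Y - other.Y).^2);
        end
    end
end
```

A single call like distTo(p, q) then does the work of N scalar method calls while paying the dispatch overhead only once.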

The point is to arrange your code to play to the strengths of the language - array handling, vectorized math - and avoid the weak spots.

EDIT: Since the original post, R2010b and R2011a have come out. The overall picture is the same, with MCOS calls getting a bit faster, and Java and old-style method calls getting slower.

EDIT: I used to have some notes here on "path sensitivity" with an additional table of function call timings, where function times were affected by how the Matlab path was configured, but that appears to have been an aberration of my particular network setup at the time. The chart above reflects the times typical of the preponderance of my tests over time.

Update: R2011b

EDIT (2/13/2012): R2011b is out, and the performance picture has changed enough to update this.

Arch: PCWIN   Release: 2011b 
Machine: R2011b, Windows XP, 8x Core i7-2600 @ 3.40GHz, 3 GB RAM, NVIDIA NVS 300
Doing each operation 100000 times
style                           total       µsec per call
nop() function:                 0.01578      0.16
nop(), 10x loop unroll:         0.01477      0.15
nop(), 100x loop unroll:        0.01518      0.15
nop() subfunction:              0.01559      0.16
@()[] anonymous function:       0.06400      0.64
nop(obj) method:                0.28482      2.85
nop() private function:         0.01505      0.15
classdef nop(obj):              0.43323      4.33
classdef obj.nop():             0.81087      8.11
classdef private_nop(obj):      0.32272      3.23
classdef class.staticnop():     0.88959      8.90
classdef constant:              1.51890     15.19
classdef property:              0.12992      1.30
classdef property with getter:  1.39912     13.99
+pkg.nop() function:            0.87345      8.73
+pkg.nop() from inside +pkg:    0.80501      8.05
Java obj.nop():                 1.86378     18.64
Java nop(obj):                  0.22645      2.26
Java feval('nop',obj):          0.52544      5.25
Java Klass.static_nop():        0.35357      3.54
Java obj.nop() from Java:       0.00010      0.00
MEX mexnop():                   0.08709      0.87
C nop():                        0.00001      0.00
j() (builtin):                  0.00251      0.03

I think the upshot of this is that:

  • MCOS/classdef methods are faster. Cost is now about on par with old style classes, as long as you use the foo(obj) syntax. So method speed is no longer a reason to stick with old style classes in most cases. (Kudos, MathWorks!)
  • Putting functions in namespaces makes them slow. (Not new in R2011b, just new in my test.)
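
For reference, the namespace case means a do-nothing function saved as +pkg/nop.m and called with package qualification (sketch):

```matlab
% Contents of +pkg/nop.m (the +pkg folder's parent is on the path):
%
%   function nop()
%   end
%
% Called both ways:
pkg.nop();   % package-qualified call: much slower in the table above
nop();       % the same trivial function as a plain path function
```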

Update: R2014a

I've reconstructed the benchmarking code and run it on R2014a.

Matlab R2014a on PCWIN64  
Matlab 8.3.0.532 (R2014a) / Java 1.7.0_11 on PCWIN64 Windows 7 6.1 (eilonwy-win7) 
Machine: Core i7-3615QM CPU @ 2.30GHz, 4 GB RAM (VMware Virtual Platform)
nIters = 100000 

Operation                        Time (µsec)  
nop() function:                         0.14 
nop() subfunction:                      0.14 
@()[] anonymous function:               0.69 
nop(obj) method:                        3.28 
nop() private fcn on @class:            0.14 
classdef nop(obj):                      5.30 
classdef obj.nop():                    10.78 
classdef private_nop(obj):              4.88 
classdef class.static_nop():           11.81 
classdef constant:                      4.18 
classdef property:                      1.18 
classdef property with getter:         19.26 
+pkg.nop() function:                    4.03 
+pkg.nop() from inside +pkg:            4.16 
feval('nop'):                           2.31 
feval(@nop):                            0.22 
eval('nop'):                           59.46 
Java obj.nop():                        26.07 
Java nop(obj):                          3.72 
Java feval('nop',obj):                  9.25 
Java Klass.staticNop():                10.54 
Java obj.nop() from Java:               0.01 
MEX mexnop():                           0.91 
builtin j():                            0.02 
struct s.foo field access:              0.14 
isempty(persistent):                    0.00 

Update: R2015b: Objects got faster!

Here's R2015b results, kindly provided by @Shaked. This is a big change: OOP is significantly faster, and now the obj.method() syntax is as fast as method(obj), and much faster than legacy OOP objects.

Matlab R2015b on PCWIN64  
Matlab 8.6.0.267246 (R2015b) / Java 1.7.0_60 on PCWIN64 Windows 8 6.2 (nanit-shaked) 
Machine: Core i7-4720HQ CPU @ 2.60GHz, 16 GB RAM (20378)
nIters = 100000 

Operation                        Time (µsec)  
nop() function:                         0.04 
nop() subfunction:                      0.08 
@()[] anonymous function:               1.83 
nop(obj) method:                        3.15 
nop() private fcn on @class:            0.04 
classdef nop(obj):                      0.28 
classdef obj.nop():                     0.31 
classdef private_nop(obj):              0.34 
classdef class.static_nop():            0.05 
classdef constant:                      0.25 
classdef property:                      0.25 
classdef property with getter:          0.64 
+pkg.nop() function:                    0.04 
+pkg.nop() from inside +pkg:            0.04 
feval('nop'):                           8.26 
feval(@nop):                            0.63 
eval('nop'):                           21.22 
Java obj.nop():                        14.15 
Java nop(obj):                          2.50 
Java feval('nop',obj):                 10.30 
Java Klass.staticNop():                24.48 
Java obj.nop() from Java:               0.01 
MEX mexnop():                           0.33 
builtin j():                            0.15 
struct s.foo field access:              0.25 
isempty(persistent):                    0.13 

Update: R2018a

Here's R2018a results. It's not the huge jump that we saw when the new execution engine was introduced in R2015b, but it's still an appreciable year over year improvement. Notably, anonymous function handles got way faster.

Matlab R2018a on MACI64  
Matlab 9.4.0.813654 (R2018a) / Java 1.8.0_144 on MACI64 Mac OS X 10.13.5 (eilonwy) 
Machine: Core i7-3615QM CPU @ 2.30GHz, 16 GB RAM 
nIters = 100000 

Operation                        Time (µsec)  
nop() function:                         0.03 
nop() subfunction:                      0.04 
@()[] anonymous function:               0.16 
classdef nop(obj):                      0.16 
classdef obj.nop():                     0.17 
classdef private_nop(obj):              0.16 
classdef class.static_nop():            0.03 
classdef constant:                      0.16 
classdef property:                      0.13 
classdef property with getter:          0.39 
+pkg.nop() function:                    0.02 
+pkg.nop() from inside +pkg:            0.02 
feval('nop'):                          15.62 
feval(@nop):                            0.43 
eval('nop'):                           32.08 
Java obj.nop():                        28.77 
Java nop(obj):                          8.02 
Java feval('nop',obj):                 21.85 
Java Klass.staticNop():                45.49 
Java obj.nop() from Java:               0.03 
MEX mexnop():                           3.54 
builtin j():                            0.10 
struct s.foo field access:              0.16 
isempty(persistent):                    0.07 

Update: R2018b and R2019a: No change

No significant changes. I'm not bothering to include the test results.

Update: R2021a: Even faster objects!

Looks like classdef objects have gotten significantly faster again. But structs have gotten slower.

Matlab R2021a on MACI64  
Matlab 9.10.0.1669831 (R2021a) Update 2 / Java 1.8.0_202 on MACI64 Mac OS X 10.14.6 (eilonwy) 
Machine: Core i7-3615QM CPU @ 2.30GHz, 4 cores, 16 GB RAM 
nIters = 100000 

Operation                        Time (μsec)  
nop() function:                         0.03 
nop() subfunction:                      0.04 
@()[] anonymous function:               0.14 
nop(obj) method:                        6.65 
nop() private fcn on @class:            0.02 
classdef nop(obj):                      0.03 
classdef obj.nop():                     0.04 
classdef private_nop(obj):              0.03 
classdef class.static_nop():            0.03 
classdef constant:                      0.16 
classdef property:                      0.12 
classdef property with getter:          0.17 
+pkg.nop() function:                    0.02 
+pkg.nop() from inside +pkg:            0.02 
feval('nop'):                          14.45 
feval(@nop):                            0.59 
eval('nop'):                           23.59 
Java obj.nop():                        30.01 
Java nop(obj):                          6.80 
Java feval('nop',obj):                 18.17 
Java Klass.staticNop():                16.77 
Java obj.nop() from Java:               0.02 
MEX mexnop():                           2.51 
builtin j():                            0.21 
struct s.foo field access:              0.29 
isempty(persistent):                    0.26 

Source Code for Benchmarks

I've put the source code for these benchmarks up on GitHub, released under the MIT License. https://github.com/apjanke/matlab-bench

Solution 2

The handle class has additional overhead from tracking all references to itself for cleanup purposes.

Try the same experiment without using the handle class and see what your results are.
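
That is, define the same class twice, once as a handle subclass and once as a plain value class, and time identical loops. A sketch (class names are illustrative, and each classdef must live in its own file; they are shown together here for brevity):

```matlab
% PointHandle.m -- handle class: reference semantics, cleanup tracking
classdef PointHandle < handle
    properties
        X
        Y
    end
end

% PointValue.m -- value class: plain copy-on-write semantics, no handle bookkeeping
classdef PointValue
    properties
        X
        Y
    end
end
```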

Solution 3

OO performance depends significantly on the MATLAB version used. I cannot comment on all versions, but I know from experience that R2012a is much improved over the 2010 releases. I have no benchmarks, so no numbers to present. My code, written exclusively with handle classes under R2012a, will not run at all under earlier versions.

Solution 4

Actually there is no problem with your code; the problem is with MATLAB. The cost is pure method-call overhead. I tested this with a simple Point class, defined once as a handle class and once as a value class:

    classdef Pointh < handle
        properties
            X
            Y
        end
        methods
            function p = Pointh(x, y)
                p.X = x;
                p.Y = y;
            end
            function d = dist(p, p1)
                d = (p.X - p1.X)^2 + (p.Y - p1.Y)^2;
            end
        end
    end

here is the test

% handle points
ph = Pointh(1,2);
ph1 = Pointh(2,3);

% value points: Pointv is the value-class twin of Pointh
% (identical definition, just without "< handle")
p = Pointv(1,2);
p1 = Pointv(2,3);

% vector points
pa1 = [1 2];
pa2 = [2 3];

% struct points
Ps.X = 1;
Ps.Y = 2;
ps1.X = 2;
ps1.Y = 3;

N = 1000000;

tic
for i =1:N
    ph.dist(ph1);
end
t1 = toc

tic
for i =1:N
    p.dist(p1);
end
t2 = toc

tic
for i =1:N
    norm(pa1-pa2)^2;
end
t3 = toc

tic
for i =1:N
    (Ps.X-ps1.X)^2+(Ps.Y-ps1.Y)^2;
end
t4 = toc

The results:

t1 = 12.0212   % handle
t2 = 12.0042   % value
t3 =  0.5489   % vector
t4 =  0.0707   % struct

Therefore, for efficient performance avoid using OOP; structs are a good choice for grouping variables instead.
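
Following that advice one step further: with plain arrays or struct fields you can compute all N distances in a single vectorized expression, paying the overhead once instead of N times (a sketch built on the test above):

```matlab
% Vectorized variant of the distance test: N point pairs stored as
% parallel arrays in struct fields, all distances computed in one shot.
N = 1000000;
P1.X = rand(1, N);  P1.Y = rand(1, N);
P2.X = rand(1, N);  P2.Y = rand(1, N);

tic
d = (P1.X - P2.X).^2 + (P1.Y - P2.Y).^2;   % all N squared distances at once
t5 = toc
```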

Author: stijn

Updated on September 18, 2021

Comments

  • stijn
    stijn over 2 years

    I'm experimenting with MATLAB OOP; as a start I mimicked my C++ Logger classes and put all my string helper functions in a String class, thinking it would be great to be able to do things like a + b, a == b, a.find( b ) instead of strcat( a, b ), strcmp( a, b ), retrieving the first element of strfind( a, b ), etc.

    The problem: slowdown

    I put the above things to use and immediately noticed a drastic slowdown. Am I doing it wrong (which is certainly possible as I have rather limited MATLAB experience), or does MATLAB's OOP just introduce a lot of overhead?

    My test case

    Here's the simple test I did for string, basically just appending a string and removing the appended part again:

    Note: Don't actually write a String class like this in real code! Matlab has a native string array type now, and you should use that instead.

    classdef String < handle
      ....
      properties
        stringobj = '';
      end
      methods
        function o = plus( o, b )
          o.stringobj = [ o.stringobj b ];
        end
        function n = Length( o )
          n = length( o.stringobj );
        end
        function o = SetLength( o, n )
          o.stringobj = o.stringobj( 1 : n );
        end
      end
    end
    
    function atest( a, b ) %plain functions
      n = length( a );
      a = [ a b ];
      a = a( 1 : n );
    
    function btest( a, b ) %OOP
      n = a.Length();
      a = a + b;
      a.SetLength( n );
    
    function RunProfilerLoop( nLoop, fun, varargin )
      profile on;
      for i = 1 : nLoop
        fun( varargin{ : } );
      end
      profile off;
      profile report;
    
    a = 'test';
    aString = String( 'test' );
    RunProfilerLoop( 1000, @(x,y)atest(x,y), a, 'appendme' );
    RunProfilerLoop( 1000, @(x,y)btest(x,y), aString, 'appendme' );
    

    The results

    Total time in seconds, for 1000 iterations:

    btest 0.550 (with String.SetLength 0.138, String.plus 0.065, String.Length 0.057)

    atest 0.015

    Results for the logger system are similar: 0.1 seconds for 1000 calls to fprintf( 1, 'test\n' ), versus 7 (!) seconds for 1000 calls to my system when using the String class internally. (OK, it has a lot more logic in it, but to compare with C++: the overhead of my system, which uses std::string( "blah" ) and std::cout on the output side, versus plain std::cout << "blah" is on the order of 1 millisecond.)

    Is it just overhead when looking up class/package functions?

    Since MATLAB is interpreted, it has to look up the definition of a function/object at run time. So I wondered whether much more overhead is involved in looking up class or package functions vs. functions that are on the path. I tried to test this, and it just gets stranger. To rule out the influence of classes/objects, I compared calling a function on the path vs. a function in a package:

    function n = atest( x, y )
      n = ctest( x, y ); % ctest is in matlab path
    
    function n = btest( x, y )
      n = util.ctest( x, y ); % ctest is in +util directory, parent directory is in path
    

    Results, gathered same way as above:

    atest 0.004 sec, 0.001 sec in ctest

    btest 0.060 sec, 0.014 sec in util.ctest

    So, is all this overhead just coming from MATLAB spending time looking up definitions for its OOP implementation, whereas this overhead is not there for functions that are directly in the path?

    • Mikhail Poda
      Mikhail Poda over 14 years
      Thank you for this question! Performance of Matlab heap (OOP/closures) has troubled me for years, see stackoverflow.com/questions/1446281/matlabs-garbage-collector. I am really curious what MatlabDoug/Loren/MikeKatz will respond to your post.
    • stijn
      stijn over 14 years
      ^ that was an interesting read.
    • MatlabDoug
      MatlabDoug over 14 years
      @Mikhail I do almost nothing with OOP in MATLAB, so I have nothing to add.
    • Mikhail Poda
      Mikhail Poda over 14 years
      @MatlabDoug: maybe your colleague Mike Karr can comment OP?
    • Amro
      Amro almost 12 years
      Readers should also check this recent blog post (by Dave Foti) discussing OOP performance in latest R2012a version: Considering Performance in Object-Oriented MATLAB Code
    • Jose Ospina
      Jose Ospina almost 11 years
      A simple example of the sensitivity to code structure, in which the calls to methods of subelements are hoisted out of the loop: for i = 1:this.get_n_quantities() if(strcmp(id,this.get_quantity_rlz(i).get_id())) ix = i; end end takes 2.2 sec, while nq = this.get_n_quantities(); a = this.get_quantity_realizations(); for i = 1:nq c = a{i}; if(strcmp(id,c.get_id())) ix = i; end end takes 0.01: two orders of magnitude.
    • stijn
      stijn about 7 years
      7 years later and the first downvote. Somebody must have been cranky today..
    • Andrew Janke
      Andrew Janke over 4 years
      Oh, BTW, @stijn: I was just now re-reading this and noticed that you asked this because you were doing a logger implementation. I did a Matlab logging framework/layer too; come by and check it out if you're interested: github.com/apjanke/SLF4M. I think it's pretty decent; I've been using variants of this in production for almost 15 years now.
  • stijn
    stijn over 14 years
    exactly the same experiment with String, but now as a value class (on another machine though); atest: 0.009, btest: 0.356. That is basically the same difference as with the handle class, so I do not think tracking references is the key answer. It also does not explain the overhead of functions in packages vs. plain functions.
  • MikeEL
    MikeEL over 14 years
    What version of matlab are you using?
  • RjOllos
    RjOllos over 13 years
    I've run some similar comparisons between handle and value classes and have not noticed a performance difference between the two.
  • MikeEL
    MikeEL over 13 years
    I no longer notice a difference either.
  • Dang Khoa
    Dang Khoa about 12 years
    @AndrewJanke Do you think you could run the benchmark again with R2012a? This is really interesting.
  • Andrew Janke
    Andrew Janke over 11 years
    Sorry Dang, I don't have a Matlab license at the moment so can't do an updated benchmark run.
  • Jonas
    Jonas over 11 years
    @AndrewJanke: If you send me the test suite, I can do an update with R2012b.
  • Andrew Janke
    Andrew Janke over 11 years
    Sorry, don't have the source either; left it at my old job. It's not hard to reconstruct, though: it's just a bunch of calls to trivial functions of various styles, with a warm-up pass and a timed pass of 100000 calls each.
  • Andrew Janke
    Andrew Janke about 10 years
    Hi folks. If you're still interested in the source code, I've reconstructed it and open-sourced it on GitHub. github.com/apjanke/matlab-bench
  • Dang Khoa
    Dang Khoa about 10 years
    I just clicked this question again to refresh my memory on it, and I'm glad to see it's been updated! Thank you :)
  • Leila
    Leila about 9 years
    @AndrewJanke Do you know about static methods? In order to have fewer source files and improve the readability of my code I have defined static methods that are called often. Am I better off using functions?
  • Andrew Janke
    Andrew Janke about 9 years
    @Seeda: Static methods are listed as "classdef class.static_nop()" in these results. They are quite slow compared to functions. If they're not called frequently, that doesn't matter.
  • bastibe
    bastibe over 8 years
    The new execution engine in 2015b has changed this picture dramatically! Method calls are now very similar to function calls, while Java interop and feval have gotten slower.
  • Andrew Janke
    Andrew Janke over 8 years
    That is good news! Mostly. I'll pick up R2015b in the next couple weeks and update these results. Or if you want to run the benchmark code yourself and post the results as a gist, I'll incorporate them.
  • Shaked
    Shaked over 8 years
  • Andrew Janke
    Andrew Janke over 8 years
    Wow! If those results hold up, I might need to revise this whole answer. Added. Thanks!
  • Andrew Janke
    Andrew Janke about 5 years
    Makes sense: in Matlab, all arrays, not just handle objects, are reference-counted, because they use copy-on-write and shared underlying raw data.
  • Jan
    Jan over 4 years
    I created a PR which includes tests similar to the Java class tests, but in .NET 4.5 and .NET Standard 2.0. The results are interesting and comparable to the nop(obj) vs obj.nop() issue seen in earlier Matlab versions: .NET 4.5 obj.nop(): 35.57 vs .NET 4.5 nop(obj): 8.72.
  • Andrew Janke
    Andrew Janke over 4 years
    @Jan: Good idea. Merged. Thanks!
  • Peter Cordes
    Peter Cordes over 4 years
    Take the Java and C timings with a grain of salt; a clever compiler could optimize the calls away completely. True, but function inlining is what makes C++-style programming with lots of trivial wrapper functions usable. The overhead truly does optimize away entirely with ahead-of-time compilation to machine code. (And Java can also inline, although more complexity means more work during startup for the JIT compiler)
  • Andrew Janke
    Andrew Janke over 2 years
    Added updates for R2021a: much faster classdef objects, somewhat slower structs.
  • dleal
    dleal over 2 years
    I really enjoy these updates, please keep them coming!
  • Andrew Janke
    Andrew Janke over 2 years
    Thanks! I enjoy doing them! But they're maybe about done: I'm a macOS 10.14 user because I use Aperture, and MathWorks has dropped support for macOS 10.14 for newer Matlab releases. I wouldn't want to post results from a VM because that'd skew the numbers.