C++ std::vector vs array in the real world

Solution 1

A: Almost always [use a vector instead of an array]. Vectors are efficient and flexible. They do require a little more memory than arrays, but this tradeoff is almost always worth the benefits.

That's an over-simplification. It's fairly common to use arrays, and they can be attractive when:

  • the elements are specified at compile time, e.g. const char project[] = "Super Server";, const Colours colours[] = { Green, Yellow };

    • since C++11 it is equally concise to initialise std::vectors with values

  • the number of elements is inherently fixed, e.g. const char* const bool_to_str[] = { "false", "true" };, Piece chess_board[8][8];

  • first-use performance is critical: with arrays of constants the compiler can often write a memory snapshot of the fully pre-initialised objects into the executable image, which is then page-faulted directly into place ready for use, so it's typically much faster than run-time heap allocation (new[]) followed by serialised construction of objects

    • compiler-generated tables of const data can always be safely read by multiple threads, whereas data constructed at run-time must complete construction before other code triggered by constructors for non-function-local static variables attempts to use that data: you end up needing some manner of Singleton (possibly threadsafe which will be even slower)

    • In C++03, vectors created with an initial size would construct one prototypical element object then copy construct every element from it. That meant that even for types where construction was deliberately left as a no-operation, there was still a cost to copy each element, replicating whatever garbage values were left in memory. Clearly an array of uninitialised elements is faster.

  • One of the powerful features of C++ is that often you can write a class (or struct) that exactly models the memory layout required by a specific protocol, then aim a class-pointer at the memory you need to work with to conveniently interpret or assign values. For better or worse, many such protocols often embed small fixed sized arrays.

  • There's a decades-old hack for putting an array of 1 element (or even 0 if your compiler allows it as an extension) at the end of a struct/class, aiming a pointer to the struct type at some larger data area, and accessing array elements off the end of the struct based on prior knowledge of the memory availability and content (if reading before writing) - see What's the need of array with zero elements?

  • classes/structures containing arrays can still be POD types

  • arrays facilitate access in shared memory from multiple processes (by default vector's internal pointers to the actual dynamically allocated data won't be in shared memory or meaningful across processes, and it was famously difficult to force C++03 vectors to use shared memory like this even when specifying a custom allocator template parameter).

  • embedding arrays can localise memory access requirements, improving cache hits and therefore performance
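
A couple of the points above can be sketched in code. This is only an illustration, assuming C++11: bool_to_str comes from the list, while PacketHeader and parse_header are made-up names for a hypothetical wire format. It also copies the bytes with memcpy rather than aiming a struct pointer at the buffer as described above, which sidesteps alignment and aliasing questions.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <type_traits>

// Fixed, compile-time table (from the list above): no heap allocation and
// no run-time construction is needed before first use.
const char* const bool_to_str[] = { "false", "true" };

// Hypothetical protocol header that embeds a small fixed-size array.
struct PacketHeader {
    std::uint8_t version;
    std::uint8_t flags;
    std::uint8_t mac[6];   // stored inline, inside the struct itself
};

// A struct containing arrays can still be a simple, memcpy-able type.
static_assert(std::is_trivially_copyable<PacketHeader>::value,
              "PacketHeader should be trivially copyable");
static_assert(std::is_standard_layout<PacketHeader>::value,
              "PacketHeader should be standard-layout");
static_assert(sizeof(PacketHeader) == 8,
              "eight one-byte fields, no padding expected");

// Interpret the first sizeof(PacketHeader) bytes of a buffer as a header.
PacketHeader parse_header(const unsigned char* bytes) {
    PacketHeader header;
    std::memcpy(&header, bytes, sizeof header);
    return header;
}
```

A std::vector member in PacketHeader would hold a pointer to heap storage instead of the bytes themselves, so the struct could no longer mirror the wire layout.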

That said, if it's not an active pain to use a vector (in code concision, readability or performance) then you're better off doing so: they have size(), checked random access via at(), iterators, resizing (which often becomes necessary as an application "matures") etc. It's also often easier to change from vector to some other Standard container should there be a need, and safer/easier to apply Standard algorithms (x.end() is better than x + sizeof x / sizeof x[0] any day).
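
To make the x.end() versus x + sizeof x / sizeof x[0] comparison concrete, here is a small sketch. The function names (sum_array, sum_vector, at_throws) are invented for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <stdexcept>
#include <vector>

// Built-in array: the element count has to be computed by hand, and the
// sizeof trick silently breaks once x has decayed to a pointer.
int sum_array() {
    int x[] = { 1, 2, 3, 4 };
    return std::accumulate(x, x + sizeof x / sizeof x[0], 0);
}

// vector: begin()/end() always know where the data stops.
int sum_vector(const std::vector<int>& v) {
    return std::accumulate(v.begin(), v.end(), 0);
}

// at() checks the index and throws std::out_of_range instead of reading
// past the end. Returns true if the access threw.
bool at_throws(const std::vector<int>& v, std::size_t index) {
    try {
        (void)v.at(index);
        return false;
    } catch (const std::out_of_range&) {
        return true;
    }
}
```

Note that operator[] on a vector is unchecked, just like array indexing; only at() does the bounds check.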

UPDATE: C++11 introduced a std::array<>, which avoids some of the costs of vectors - internally using a fixed-sized array to avoid an extra heap allocation/deallocation - while offering some of the benefits and API features: http://en.cppreference.com/w/cpp/container/array.
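
As a rough sketch of what that looks like in practice (sorted_copy is a hypothetical helper, not something from the linked page):

```cpp
#include <algorithm>
#include <array>
#include <cassert>

// std::array keeps its elements inline (no heap allocation) but still offers
// size(), begin()/end(), at() and so on. Unlike a built-in array it can be
// passed to and returned from functions by value.
std::array<int, 4> sorted_copy(std::array<int, 4> values) {
    std::sort(values.begin(), values.end());
    return values;
}
```

Usage would be along the lines of std::array<int, 4> a = {{ 3, 1, 4, 1 }}; followed by sorted_copy(a). The size is part of the type, so it still has to be known at compile time.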

Solution 2

One of the best reasons to use a vector as opposed to an array is the RAII idiom. Basically, in order for C++ code to be exception-safe, any dynamically allocated memory or other resources should be encapsulated within objects. These objects should have destructors that free these resources.

When an exception propagates up the call stack, the ONLY things that are guaranteed to be called are the destructors of objects on the stack. If you dynamically allocate memory outside of an object, and an exception is thrown somewhere before it is deleted, you have a memory leak.

It's also a nice way to avoid having to remember to use delete.
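
Here is a minimal sketch of the difference. Tracked, may_throw and leaked_after are names made up for the illustration; Tracked just counts live objects so the clean-up can be observed.

```cpp
#include <cassert>
#include <stdexcept>
#include <vector>

// Hypothetical element type that counts how many instances are alive.
struct Tracked {
    static int alive;
    Tracked() { ++alive; }
    Tracked(const Tracked&) { ++alive; }
    ~Tracked() { --alive; }
};
int Tracked::alive = 0;

void may_throw() { throw std::runtime_error("something failed"); }

// Raw new[]: without the try/catch the throw would skip delete[] and leak.
void with_raw_array() {
    Tracked* items = new Tracked[3];
    try {
        may_throw();
    } catch (...) {
        delete[] items;   // manual clean-up on the error path
        throw;
    }
    delete[] items;       // and again on the normal path
}

// vector: its destructor runs during stack unwinding, no boilerplate needed.
void with_vector() {
    std::vector<Tracked> items(3);
    may_throw();
}

// Runs f, catches its exception, and reports how many Tracked objects
// are still alive afterwards compared with before the call.
template <typename F>
int leaked_after(F f) {
    const int before = Tracked::alive;
    try { f(); } catch (const std::runtime_error&) {}
    return Tracked::alive - before;
}
```

Both versions end up leak-free, but the raw array only gets there because of the hand-written try/catch; drop it and the three objects are never destroyed.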

You should also check out the <algorithm> header, which provides a lot of common algorithms that work with vector and other STL containers.
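
A short sketch of the standard algorithms in use, assuming C++11 for std::begin/std::end. The helper names are invented for the example.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <vector>

// Sort a copy of the vector and return it.
std::vector<int> sorted(std::vector<int> v) {
    std::sort(v.begin(), v.end());
    return v;
}

// Linear search with std::find.
bool contains(const std::vector<int>& v, int value) {
    return std::find(v.begin(), v.end(), value) != v.end();
}

// The same algorithms accept built-in arrays: std::begin/std::end provide
// the iterator pair that a vector gets from its member functions.
int max_of_array() {
    int raw[] = { 7, 2, 9, 4 };
    return *std::max_element(std::begin(raw), std::end(raw));
}
```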

I have on a few occasions written code with vector that, in retrospect, probably would have been better with a native array. But in all of these cases, either a boost::multi_array or a blitz::Array would have been better than either of them.

Solution 3

A std::vector is just a resizable array. It's not much more than that. It's not something you would learn in a Data Structures class, because it isn't an intelligent data structure.

In the real world, I see a lot of arrays. But I also see a lot of legacy codebases that use "C with Classes"-style C++ programming. That doesn't mean that you should program that way.

Solution 4

I am going to pop my opinion in here for coding large sized array/vectors used in science and engineering.

The pointer based arrays in this case can be quite a bit faster, especially for standard types. But the pointers add the danger of possible memory leaks, and these memory leaks can lead to a longer debug cycle. Additionally, if you want to make the pointer based array dynamic you have to code this by hand.

On the other hand, vectors can be slower for standard types. They are also both dynamic and memory safe, as long as you are not storing dynamically allocated pointers in the STL vector.

In science and engineering the choice depends on the project: how important is speed vs. debug time? For example, LAMMPS, a simulation package, uses raw pointers that are handled through its memory management class. Speed is the priority for that software. For the software I am building, I have to balance speed with memory footprint and debug time. I really don't want to spend a lot of time debugging, so I am using the STL vector.

I wanted to add some more information to this answer that I discovered from extensive testing of large scale arrays and lots of reading the web. So, another problem with STL vector and large sized arrays (one million +) occurs in how memory gets allocated for these arrays. STL vector uses the std::allocator class for handling memory, which by default simply obtains storage through operator new. The extra memory comes from the vector's growth strategy: when it runs out of room it allocates a larger block (typically 1.5 to 2 times the old capacity) and moves the elements over. Under small scale loading this strategy is extremely efficient in terms of speed and memory use. As the size of the vector gets into the millions, it becomes a memory hog, because the vector tends to hold more space than it is currently using.

For large scale vectors you are either better off writing your own vector class or using pointers (raw or some sort of memory management system from boost or the C++ library). There are advantages and disadvantages to both approaches. The choice really depends on the exact problem you are tackling (too many variables to add in here). If you do happen to write your own vector class, make sure to give the vector an easy way to clear its memory. With the C++03 STL vector you need to use a swap operation to do something that really should have been built into the class in the first place (C++11 added shrink_to_fit(), although it is only a non-binding request).
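
The "holds more space than it uses" effect and the swap operation mentioned above can be sketched as follows. The function names are hypothetical, and the exact capacities depend on the standard library implementation, so none are shown.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Grow one element at a time: the vector reallocates in ever larger steps,
// so the final capacity is usually larger than the number of elements.
std::size_t capacity_after_push_back(std::size_t n) {
    std::vector<double> v;
    for (std::size_t i = 0; i < n; ++i) v.push_back(0.0);
    return v.capacity();
}

// If the size is known in advance, reserve() asks for the storage once,
// which avoids the repeated reallocation and the growth overshoot.
std::size_t capacity_after_reserve(std::size_t n) {
    std::vector<double> v;
    v.reserve(n);
    for (std::size_t i = 0; i < n; ++i) v.push_back(0.0);
    return v.capacity();
}

// The swap idiom: clear() alone keeps the capacity, whereas swapping with
// an empty temporary hands the big buffer to the temporary, which frees it.
std::size_t capacity_after_swap(std::size_t n) {
    std::vector<double> v(n);
    std::vector<double>().swap(v);
    return v.capacity();
}
```

For very large data sets, comparing size() with capacity() at a few points in the program is a cheap way to see how much memory is being held in reserve.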

Solution 5

Rule of thumb: if you don't know the number of elements in advance, or if the number of elements is expected to be large (say, more than 10), use vector. Otherwise, you could also use an array. For example, I write a lot of geometry-processing code and I define a line as an ARRAY of 2 coordinates. A line is defined by two points, and it will ALWAYS be defined by exactly two points. Using a vector instead of an array would be overkill in many ways, also performance-wise.

Another thing: when I say "array" I really DO MEAN array: a variable declared using an array syntax, such as int evenOddCount[2]; If you consider choosing between a vector and a dynamically-allocated block of memory, such as int *evenOddCount = new int[2];, the answer is clear: USE VECTOR!
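
As an illustration of the geometry case, here is a sketch with made-up Point and Line types. It is not the actual code referred to above.

```cpp
#include <cassert>
#include <cmath>

struct Point { double x; double y; };

// A line always has exactly two end points, so a built-in array of 2
// stores them inline: no heap allocation, no size bookkeeping.
struct Line {
    Point ends[2];
};

// Euclidean distance between the two end points.
double length(const Line& line) {
    const double dx = line.ends[1].x - line.ends[0].x;
    const double dy = line.ends[1].y - line.ends[0].y;
    return std::sqrt(dx * dx + dy * dy);
}
```

With a std::vector<Point> member instead, every Line would carry a separate heap allocation plus the vector's own bookkeeping, which adds up quickly when there are millions of lines.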

Author by

GRardB

Updated on July 09, 2022

Comments

  • GRardB
    GRardB almost 2 years

    I'm new to C++. I'm reading "Beginning C++ Through Game Programming" by Michael Dawson. However, I'm not new to programming in general. I just finished a chapter that dealt with vectors, so I've got a question about their use in the real world (I'm a computer science student, so I don't have much real-world experience yet).

    The author has a Q/A at the end of each chapter, and one of them was:

    Q: When should I use a vector instead of an array?

    A: Almost always. Vectors are efficient and flexible. They do require a little more memory than arrays, but this tradeoff is almost always worth the benefits.

    What do you guys think? I remember learning about vectors in a Java book, but we didn't cover them at all in my Intro to Comp. Sci. class, nor my Data Structures class at college. I've also never seen them used in any programming assignments (Java and C). This makes me feel like they're not used very much, although I know that school code and real-world code can be extremely different.

    I don't need to be told about the differences between the two data structures; I'm very aware of them. All I want to know is if the author is giving good advice in his Q/A, or if he's simply trying to save beginner programmers from destroying themselves with complexities of managing fixed-size data structures. Also, regardless of what you think of the author's advice, what do you see in the real-world more often?

    • Dair
      Dair about 13 years
      Well, you probably have never used vectors in C because, as far as I am aware C doesn't have generic programming, the STL, or vectors of its own, meaning you can only dynamically allocate arrays...
    • Xepo
      Xepo about 13 years
      I work for HP on a 2.5 million line code base. We strive to use vectors any time we need a resizeable array. I've never seen the STL used in Academia, and I'm not sure why, but trust me, they're definitely used in real-world programming.
    • Martin York
      Martin York about 13 years
      I think you are reading a book for a reason. Take the author's advice. Read std::vector-is-so-much-slower-than-plain-arrays
    • Glenn Teitelbaum
      Glenn Teitelbaum almost 9 years
      The addition of std::array in C++11 adds quite a bit to making arrays more useful for fixed sized use cases
  • Matthieu M.
    Matthieu M. about 13 years
    int a[5]; is perfectly exception safe. Are you sure that the OP is talking about dynamically allocated arrays?
  • Matthieu M.
    Matthieu M. about 13 years
    I ticked on the flexible, it is usual in C code to have fixed length buffer allocated at the beginning of the method. I do think std::vector (or std::string) are superior alternative though :)
  • M.M
    M.M over 8 years
    Searching and sorting can operate on C-style arrays equally as well as they can on vectors
  • Assimilater
    Assimilater almost 8 years
    This answer is also underrated (the bit about large numbers is a very good point!!!)
  • James LT
    James LT over 5 years
    Great advice from science and engineering background!