Why aren't variable-length arrays part of the C++ standard?

断了今生、忘了曾经 提交于 2019-11-25 21:34:02

问题


I haven\'t used C very much in the last few years. When I read this question today I came across some C syntax which I wasn\'t familiar with.

Apparently in C99 the following syntax is valid:

void foo(int n) {
    int values[n]; //Declare a variable length array
}

This seems like a pretty useful feature. Was there ever a discussion about adding it to the C++ standard, and if so, why it was omitted?

Some potential reasons:

  • Hairy for compiler vendors to implement
  • Incompatible with some other part of the standard
  • Functionality can be emulated with other C++ constructs

The C++ standard states that array size must be a constant expression (8.3.4.1).

Yes, of course I realize that in the toy example one could use std::vector<int> values(m);, but this allocates memory from the heap and not the stack. And if I want a multidimensional array like:

void foo(int x, int y, int z) {
    int values[x][y][z]; // Declare a variable length array
}

the vector version becomes pretty clumsy:

void foo(int x, int y, int z) {
    vector< vector< vector<int> > > values( /* Really painful expression here. */);
}

The slices, rows and columns will also potentially be spread all over memory.

Looking at the discussion at comp.std.c++ it\'s clear that this question is pretty controversial with some very heavyweight names on both sides of the argument. It\'s certainly not obvious that a std::vector is always a better solution.


回答1:


There recently was a discussion about this kicked off in usenet: Why no VLAs in C++0x.

I agree with those people that seem to agree that having to create a potential large array on the stack, which usually has only little space available, isn't good. The argument is, if you know the size beforehand, you can use a static array. And if you don't know the size beforehand, you will write unsafe code.

C99 VLAs could provide a small benefit of being able to create small arrays without wasting space or calling constructors for unused elements, but they will introduce rather large changes to the type system (you need to be able to specify types depending on runtime values - this does not yet exist in current C++, except for new operator type-specifiers, but they are treated specially, so that the runtime-ness doesn't escape the scope of the new operator).

You can use std::vector, but it is not quite the same, as it uses dynamic memory, and making it use one's own stack-allocator isn't exactly easy (alignment is an issue, too). It also doesn't solve the same problem, because a vector is a resizable container, whereas VLAs are fixed-size. The C++ Dynamic Array proposal is intended to introduce a library based solution, as alternative to a language based VLA. However, it's not going to be part of C++0x, as far as I know.




回答2:


(Background: I have some experience implementing C and C++ compilers.)

Variable-length arrays in C99 were basically a misstep. In order to support VLAs, C99 had to make the following concessions to common sense:

  • sizeof x is no longer always a compile-time constant; the compiler must sometimes generate code to evaluate a sizeof-expression at runtime.

  • Allowing two-dimensional VLAs (int A[x][y]) required a new syntax for declaring functions that take 2D VLAs as parameters: void foo(int n, int A[][*]).

  • Less importantly in the C++ world, but extremely important for C's target audience of embedded-systems programmers, declaring a VLA means chomping an arbitrarily large chunk of your stack. This is a guaranteed stack-overflow and crash. (Anytime you declare int A[n], you're implicitly asserting that you have 2GB of stack to spare. After all, if you know "n is definitely less than 1000 here", then you would just declare int A[1000]. Substituting the 32-bit integer n for 1000 is an admission that you have no idea what the behavior of your program ought to be.)

Okay, so let's move to talking about C++ now. In C++, we have the same strong distinction between "type system" and "value system" that C89 does… but we've really started to rely on it in ways that C has not. For example:

template<typename T> struct S { ... };
int A[n];
S<decltype(A)> s;  // equivalently, S<int[n]> s;

If n weren't a compile-time constant (i.e., if A were of variably modified type), then what on earth would be the type of S? Would S's type also be determined only at runtime?

What about this:

template<typename T> bool myfunc(T& t1, T& t2) { ... };
int A1[n1], A2[n2];
myfunc(A1, A2);

The compiler must generate code for some instantiation of myfunc. What should that code look like? How can we statically generate that code, if we don't know the type of A1 at compile time?

Worse, what if it turns out at runtime that n1 != n2, so that !std::is_same<decltype(A1), decltype(A2)>()? In that case, the call to myfunc shouldn't even compile, because template type deduction should fail! How could we possibly emulate that behavior at runtime?

Basically, C++ is moving in the direction of pushing more and more decisions into compile-time: template code generation, constexpr function evaluation, and so on. Meanwhile, C99 was busy pushing traditionally compile-time decisions (e.g. sizeof) into the runtime. With this in mind, does it really even make sense to expend any effort trying to integrate C99-style VLAs into C++?

As every other answerer has already pointed out, C++ provides lots of heap-allocation mechanisms (std::unique_ptr<int[]> A = new int[n]; or std::vector<int> A(n); being the obvious ones) when you really want to convey the idea "I have no idea how much RAM I might need." And C++ provides a nifty exception-handling model for dealing with the inevitable situation that the amount of RAM you need is greater than the amount of RAM you have. But hopefully this answer gives you a good idea of why C99-style VLAs were not a good fit for C++ — and not really even a good fit for C99. ;)


For more on the topic, see N3810 "Alternatives for Array Extensions", Bjarne Stroustrup's October 2013 paper on VLAs. Bjarne's POV is very different from mine; N3810 focuses more on finding a good C++ish syntax for the things, and on discouraging the use of raw arrays in C++, whereas I focused more on the implications for metaprogramming and the typesystem. I don't know if he considers the metaprogramming/typesystem implications solved, solvable, or merely uninteresting.


A good blog post that hits many of these same points is "Legitimate Use of Variable Length Arrays" (Chris Wellons, 2019-10-27).




回答3:


You could always use alloca() to allocate memory on the stack at runtime, if you wished:

void foo (int n)
{
    int *values = (int *)alloca(sizeof(int) * n);
}

Being allocated on the stack implies that it will automatically be freed when the stack unwinds.

Quick note: As mentioned in the Mac OS X man page for alloca(3), "The alloca() function is machine and compiler dependent; its use is dis-couraged." Just so you know.




回答4:


In my own work, I've realized that every time I've wanted something like variable-length automatic arrays or alloca(), I didn't really care that the memory was physically located on the cpu stack, just that it came from some stack allocator that didn't incur slow trips to the general heap. So I have a per-thread object that owns some memory from which it can push/pop variable sized buffers. On some platforms I allow this to grow via mmu. Other platforms have a fixed size (usually accompanied by a fixed size cpu stack as well because no mmu). One platform I work with (a handheld game console) has precious little cpu stack anyway because it resides in scarce, fast memory.

I'm not saying that pushing variable-sized buffers onto the cpu stack is never needed. Honestly I was surprised back when I discovered this wasn't standard, as it certainly seems like the concept fits into the language well enough. For me though, the requirements "variable size" and "must be physically located on the cpu stack" have never come up together. It's been about speed, so I made my own sort of "parallel stack for data buffers".




回答5:


Seems it will be available in C++14:

https://en.wikipedia.org/wiki/C%2B%2B14#Runtime-sized_one_dimensional_arrays

Update: It did not make it into C++14.




回答6:


There are situations where allocating heap memory is very expensive compared to the operations performed. An example is matrix math. If you work with smallish matrices say 5 to 10 elements and do a lot of arithmetics the malloc overhead will be really significant. At the same time making the size a compile time constant does seem very wasteful and inflexible.

I think that C++ is so unsafe in itself that the argument to "try to not add more unsafe features" is not very strong. On the other hand, as C++ is arguably the most runtime efficient programming language features which makes it more so are always useful: People who write performance critical programs will to a large extent use C++, and they need as much performance as possible. Moving stuff from heap to stack is one such possibility. Reducing the number of heap blocks is another. Allowing VLAs as object members would one way to achieve this. I'm working on such a suggestion. It is a bit complicated to implement, admittedly, but it seems quite doable.




回答7:


This was considered for inclusion in C++/1x, but was dropped (this is a correction to what I said earlier).

It would be less useful in C++ anyway since we already have std::vector to fill this role.




回答8:


Use std::vector for this. For example:

std::vector<int> values;
values.resize(n);

The memory will be allocated on the heap, but this holds only a small performance drawback. Furthermore, it is wise not to allocate large datablocks on the stack, as it is rather limited in size.




回答9:


C99 allows VLA. And it puts some restrictions on how to declare VLA. For details, refer to 6.7.5.2 of the standard. C++ disallows VLA. But g++ allows it.




回答10:


Arrays like this are part of C99, but not part of standard C++. as others have said, a vector is always a much better solution, which is probably why variable sized arrays are not in the C++ standatrd (or in the proposed C++0x standard).

BTW, for questions on "why" the C++ standard is the way it is, the moderated Usenet newsgroup comp.std.c++ is the place to go to.




回答11:


If you know the value at compile time you can do the following:

template <int X>
void foo(void)
{
   int values[X];

}

Edit: You can create an a vector that uses a stack allocator (alloca), since the allocator is a template parameter.




回答12:


I have a solution that actually worked for me. I did not want to allocate memory because of fragmentation on a routine that needed to run many times. The answer is extremely dangerous, so use it at your own risk, but it takes advantage of assembly to reserve space on the stack. My example below uses a character array (obviously other sized variable would require more memory).

void varTest(int iSz)
{
    char *varArray;
    __asm {
        sub esp, iSz       // Create space on the stack for the variable array here
        mov varArray, esp  // save the end of it to our pointer
    }

    // Use the array called varArray here...  

    __asm {
        add esp, iSz       // Variable array is no longer accessible after this point
    } 
}

The dangers here are many but I'll explain a few: 1. Changing the variable size half way through would kill the stack position 2. Overstepping the array bounds would destroy other variables and possible code 3. This does not work in a 64 bit build... need different assembly for that one (but a macro might solve that problem). 4. Compiler specific (may have trouble moving between compilers). I haven't tried so I really don't know.



来源:https://stackoverflow.com/questions/1887097/why-arent-variable-length-arrays-part-of-the-c-standard

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!