I'm optimizing a constructor that is called in one of our app's innermost loops. The class in question is about 100 bytes wide, consists of a bunch of ints,
Here's how I would do it. Don't declare any constructor; instead, declare a fixed Frobozz that contains default values:
const Frobozz DefaultFrobozz =
{
    0, 1, -1, 0,         // int na,nb,nc,nd;
    false, true, false,  // bool ba,bb,bc;
    'a', 'b', 'c',       // char ca,cb,cc;
    -1, 1.0              // float fa,fb;
};
Then in OversimplifiedExample:
Frobozz params (DefaultFrobozz) ;
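For reference, here is a minimal self-contained sketch of the whole pattern. It assumes a reduced Frobozz containing only the fields named in the initializer comments above (the real class is larger), and the body of OversimplifiedExample is a hypothetical stand-in for whatever the inner loop actually does. The static_assert needs C++11; it is only there to document that the struct has to stay trivially copyable for the compiler to be free to copy it as a block.

```cpp
#include <type_traits>

// Hypothetical reduced Frobozz: only the fields named in the initializer
// comments. No user-declared constructors, so it remains an aggregate and
// can be brace-initialized.
struct Frobozz {
    int   na, nb, nc, nd;
    bool  ba, bb, bc;
    char  ca, cb, cc;
    float fa, fb;
};

// A trivially copyable type may be copied as raw bytes, which is what lets
// the compiler turn the copy into a block move.
static_assert(std::is_trivially_copyable<Frobozz>::value,
              "Frobozz must stay trivially copyable");

// Single constant instance holding every default value.
const Frobozz DefaultFrobozz = {
    0, 1, -1, 0,          // na, nb, nc, nd
    false, true, false,   // ba, bb, bc
    'a', 'b', 'c',        // ca, cb, cc
    -1.0f, 1.0f           // fa, fb
};

// Hypothetical stand-in for the function called in the inner loop.
int OversimplifiedExample(int x) {
    Frobozz params(DefaultFrobozz);  // copy all defaults in one go
    params.nc = x;                   // then override only what differs
    return params.na + params.nb + params.nc;
}
```

The point of the design is that the defaults live in one place and the per-call cost is a single struct copy, rather than a dozen individual member stores in a constructor.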
With gcc -O3 (version 4.5.2), the initialisation of params reduces to:
leal -72(%ebp), %edi
movl $_DefaultFrobozz, %esi
movl $16, %ecx
rep movsl
which is about as good as it gets in a 32-bit environment.
Warning: I tried this with the 64-bit g++ version 4.7.0 20110827 (experimental), and it generated an explicit sequence of 64-bit copies instead of a block move. The processor doesn't allow rep movsq, but I would expect rep movsl to be faster than a sequence of 64-bit loads and stores. Perhaps not. (But the -Os switch -- optimise for space -- does use a rep movsl instruction.) Anyway, try this and let us know what happens.
Edited to add: I was wrong about the processor not allowing rep movsq. Intel's documentation says "The MOVS, MOVSB, MOVSW, and MOVSD instructions can be preceded by the REP prefix", but it seems that this is just a documentation glitch. In any case, if I make Frobozz big enough, then the 64-bit compiler generates rep movsq instructions; so it probably knows what it's doing.