I have a legacy firmware application that requires new functionality. The size of the application was already near the limited flash capacity of the device, and the few new functions and variables pushed it over the edge.
The above answers claim "Turning on compiler optimization [reduced the code size]". Given all the documentation and experience I have had with embedded TI DSP programming, I know for a fact that turning on optimization will INCREASE your code size (on TI DSP chips)!
Let me explain:
The TI TMS320C6416 DSP has 9 compiler flags that will affect your code size.
For my compiler, the documentation states that turning on optimization level three enables software pipelining.
What is software pipelining?
That is where the compiler emits assembly that makes for loops execute significantly faster (up to a few times faster), but at the cost of greater code size. I suggest reading about software pipelining on Wikipedia (look for loop unrolling, prologue, and epilogue).
So check your documentation to make sure the optimization isn't making your code larger.
Another suggestion is to look for compiler flags that relate to code size. If you have code-size compiler flags, make sure to crank them up to the highest setting. Usually compiling for code size means your code will execute slower... but you may have to do that.
If you still need more space even with size optimizations turned on, compare the generated assembly against the unoptimized build. Then rewrite the code where the biggest differences occur, so that the compiler produces the same savings from careful C rewrites even with optimization turned off.
For instance, you may have several 'if' statements that make similar comparisons:
if(A && B && (C || D)){}
if(A && !B && (C || D)){}
if(!A && B && (C || D)){}
Then creating a new variable and making the shared comparison in advance will save the compiler from duplicating code:
E = (C || D);
if(A && B && E){}
if(A && !B && E){}
if(!A && B && E){}
This is one of the optimizations the compiler does for you automatically if you turn it on. There are many, many others, and you might consider reading a bit of compiler theory if you want to learn how to do this by hand in the C code.
Pay attention to macros. They can produce a lot of code from just one macro expansion. If you find such macros - try to rewrite them so that their size is minimized and functionality is moved to functions.
Pay attention to duplicate code - both copy-pasted and logically duplicate. Try to separate duplicate code into functions.
Check whether the compiler supports inlining and whether it can be turned off.
You can do a lot of things, but these two helped me a lot in the past, so I want to suggest them:
1. Don't use general-purpose standard C library functions like sprintf, etc. They are very general, and writing your own minimal replacement frees a lot of space.
2. If you have a local declaration of a char array and you know the maximum length, give the length explicitly instead of taking it from an input argument, e.g.
if you have a function like this:
void foo(char *str, uint8_t length) {
    char local_string[length];   /* variable-length array */
    ....
}
you are better off finding the maximum length you actually use and changing it to:
void foo(char *str, uint8_t length) {
    char local_string[MAXIMUM_LENGTH];
    ....
}
Generally: make use of your linker map or tools to figure out what your largest/most numerous symbols are, and then possibly take a look at them using a disassembler. You'd be surprised at what you find this way.
With a bit of perl or the like, you can make short work of a .xMAP file or the output of "objdump" or "nm", and re-sort it in various ways for pertinent info.
Specific to small instruction sets: Watch for literal pool usage. While changing from e.g. the ARM (32 bits per instruction) instruction set to the THUMB (16 bits per instruction) instruction set can be useful on some ARM processors, it reduces the size of the "immediate" field.
Suddenly something that would be a direct load from a global or static becomes very indirect; it must first load the address of the global/static into a register, then load from that, rather than just encoding the address directly in the instruction. So you get a few extra instructions and an extra entry in the literal pool for something that normally would have been one instruction.
A strategy to fight this is to group globals and statics together into structures; this way you only store one literal (the address of your global structure) and compute offsets from that, rather than storing many different literals when you're accessing multiple statics/globals.
We converted our "singleton" classes from managing their own instance pointers to just being members of a large "struct GlobalTable", and it made a noticeable difference in code size (a few percent) as well as performance in some cases.
Otherwise: keep an eye out for static structures and arrays of non-trivially-constructed data. Each one of these typically generates huge amounts of .sinit code ("invisible functions", if you will) that are run before main() to populate these arrays properly. If you can use only trivial data types in your statics, you'll be far better off.
This is again something that can be easily identified by using a tool over the results of "nm" or "objdump" or the like. If you have a ton of .sinit stuff, you'll want to investigate!
Oh, and -- if your compiler/linker supports it, don't be afraid to selectively enable optimization or smaller instruction sets for just certain files or functions!
Compiler optimisation that triggers a bug? That's strange. Get a map of your program and see whether you should target data or code. Look for duplicated code, and for code with similar goals. One example of this is the busybox code, which aims for a small memory footprint.
It favors size over readability, so it sometimes gets quite ugly, with gotos and so on.