What are some refactoring methods to reduce size of compiled code?


Solution 1

  • Use generation functions instead of data tables where possible
  • Disable inline functions
  • Turn frequently used macros into functions (see the short sketch after this list)
  • Reduce resolution for variables larger than the native machine size (e.g., on an 8-bit micro, try to get rid of 16- and 32-bit variables, which double and quadruple some code sequences)
  • If the micro has a smaller instruction set (ARM Thumb), enable it in the compiler
  • If the memory is segmented (i.e., paged or nonlinear) then
    • Rearrange code so that fewer global calls (larger call instructions) need to be used
    • Rearrange code and variable usage to eliminate global memory calls
    • Re-evaluate global memory usage - if it can be placed on the stack then so much the better
  • Make sure you're compiling with debug turned off - on some processors it makes a big difference
  • Compress data that can't be generated on the fly - then decompress into RAM at startup for fast access
  • Delve into the compiler options - it may be that every call is automatically global, but you might be able to safely disable that on a file-by-file basis to reduce size (sometimes significantly)
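
For example, here is the kind of macro-to-function rewrite meant above. This is only a minimal sketch with made-up names (CLAMP_U8 / clamp_u8); the saving comes from emitting the body once and paying a short call at each use site instead of a full copy of the expression:

/* Macro version: every use site expands to a full copy of this expression. */
#define CLAMP_U8(x) ((x) < 0 ? 0 : ((x) > 255 ? 255 : (x)))

/* Function version: the body is emitted once; each use becomes a small call. */
static unsigned char clamp_u8(int x)
{
    if (x < 0)   return 0;
    if (x > 255) return 255;
    return (unsigned char)x;
}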

If you still need more space, then compile with optimizations turned on and compare the generated assembly against the unoptimized code. Then rewrite the code where the biggest changes took place so that the compiler generates the same optimizations, based on those tricky C rewrites, even with optimization turned off.

For instance, you may have several 'if' statements that make similar comparisons:

if(A && B && (C || D)){}
if(A && !B && (C || D)){}
if(!A && B && (C || D)){}

Creating a new variable and making the shared comparison in advance will save the compiler from duplicating code:

E = (C || D);

if(A && B && E){}
if(A && !B && E){}
if(!A && B && E){}

This is one of the optimizations the compiler does for you automatically when optimization is turned on. There are many, many others, and you might consider reading a bit of compiler theory if you want to learn how to do this by hand in the C code.

Solution 2

Generally: make use of your linker map or tools to figure out what your largest/most numerous symbols are, and then possibly take a look at them using a disassembler. You'd be surprised at what you find this way.

With a bit of Perl or the like, you can make short work of a .xMAP file or the output of "objdump" or "nm", and re-sort it in various ways for pertinent info.


Specific to small instruction sets: Watch for literal pool usage. While changing from e.g. the ARM (32 bits per instruction) instruction set to the THUMB (16 bits per instruction) instruction set can be useful on some ARM processors, it reduces the size of the "immediate" field.

Suddenly something that would be a direct load from a global or static becomes very indirect; it must first load the address of the global/static into a register, then load from that, rather than just encoding the address directly in the instruction. So you get a few extra instructions and an extra entry in the literal pool for something that normally would have been one instruction.

A strategy to fight this is to group globals and statics together into structures; this way you only store one literal (the address of your global structure) and compute offsets from that, rather than storing many different literals when you're accessing multiple statics/globals.

We converted our "singleton" classes from managing their own instance pointers to just being members in a large "struct GlobalTable", and it made a noticeable difference in code size (a few percent) as well as performance in some cases.
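
A minimal before/after sketch in C of that grouping idea, using made-up globals (rx_count, tx_count, link_state) and the GlobalTable name from above - the exact win depends on your compiler, but the shape of the change looks like this:

/* Before: each global tends to need its own literal-pool entry (its
   address) plus an address load at every access site. */
unsigned int rx_count;
unsigned int tx_count;
unsigned char link_state;

/* After: one base address in the literal pool; members are reached
   with small offsets from that base. */
struct GlobalTable {
    unsigned int rx_count;
    unsigned int tx_count;
    unsigned char link_state;
};
struct GlobalTable g;  /* accesses become g.rx_count, g.tx_count, ... */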


Otherwise: keep an eye out for static structures and arrays of non-trivially-constructed data. Each one of these typically generates a huge amount of .sinit code ("invisible functions", if you will) that runs before main() to populate these arrays properly. If you can use only trivial data types in your statics, you'll be far better off.

This is again something that can be easily identified by using a tool over the results of "nm" or "objdump" or the like. If you have a ton of .sinit stuff, you'll want to investigate!


Oh, and -- if your compiler/linker supports it, don't be afraid to selectively enable optimization or smaller instruction sets for just certain files or functions!

Solution 3

Refactoring out duplicate code should have the biggest impact on your program's memory footprint.


Comments

  • Judge Maygarden
    Judge Maygarden almost 2 years

    I have a legacy firmware application that requires new functionality. The size of the application was already near the limited flash capacity of the device and the few new functions and variables pushed it over the edge. Turning on compiler optimization does the trick, but the customer is wary of doing so because they have caused failures in the past. So, what are some common things to look for when refactoring C code to produce smaller output?

  • Edi
    Edi about 15 years
    The only caveat would be that any of these processes in time-critical sections will warrant additional testing.
  • MbaiMburu
    MbaiMburu about 15 years
    If you disable inline functions and turn macros into functions, aren't you increasing the runtime memory use (more function calls = new stack frames)? I'm not sure about this stuff though.
  • user1066101
    user1066101 about 15 years
    Using register variables (if available) for your inline and macro replacement functions can reduce your stackframe size down to just the return address.
  • Judge Maygarden
    Judge Maygarden about 15 years
    RAM is not currently as tight as flash code space in my specific case, but I appreciate generic answers as well.
  • Adam Davis
    Adam Davis about 15 years
    Still worth keeping in mind, as devinb points out some of these will increase RAM usage, and in the case of generation functions you're trading flash for ram (generate once at beginning, leave in memory for fast access)
  • Judge Maygarden
    Judge Maygarden about 15 years
    In regards to variable size, a lot of the floating point math being changed to fixed-point would probably help a lot, right?
  • Adam Davis
    Adam Davis about 15 years
    monjardin - Only if you got rid of all the floating point so the linker wouldn't link any floating point code. Once you add one floating point variable with a multiply, though, the FP multiply routine is included, and using it later doesn't incur as much of a hit as the first one.
  • Mike Dunlavey
    Mike Dunlavey about 15 years
    @Adam: Nice work. I would only add that performance concerns are often misplaced, so one shouldn't worry about it unless it becomes a problem.
  • Judge Maygarden
    Judge Maygarden about 15 years
    Looking at the compiler flags that option generates, disabling inlining and loop unrolling are among them. Those are actions already noted in current answers.
  • Trevor Boyd Smith
    Trevor Boyd Smith about 15 years
    Sounds like your problem is already solved. For my own curiosity, in your compiler did turning on optimization result in disabling inlining, loop unrolling, etc.?
  • Judge Maygarden
    Judge Maygarden about 15 years
    Yes, it disabled inline and loop unrolling when set to optimize for size. When set to optimize balanced or for speed they were enabled.
  • Conor OG
    Conor OG about 15 years
    +1 The linker map file is the place to start. It will show you where the space is being used.
  • thegreendroid
    thegreendroid over 11 years
    @AdamDavis Excellent answer! Can you please suggest a good book to learn about compiler theory?