A Quick Guide To The Keil Global Register Optimisation Facility

The register optimisation function of the C166 and C51 compilers is something that is (we think) unique to Keil. With the former CPU in particular, it can make worthwhile savings in code size and run time. Many C166 users never find this feature and are losing out as a result. Put simply, the aim of this optimisation is to allow as many local variables as possible to be assigned to registers, so that the very fast (80ns @ 25MHz) MOV Rw,Rw type instructions can be used.

Why Are Special Steps Required?

For any C function, the compiler will try to put as many locals into registers as possible. Starting from "main()", if it calls another function "f0()", the compiler has to assume that the latter will use all the registers for its own locals. Thus the compiler will not use registers at all for main()'s data. If f0() uses only one out of the 12 possible registers, then 11 will lie idle. The locals of main() will end up on the USER STACK, where access is slow.

In a more complicated situation where f0() in turn calls another function "f1()", the compiler will push the registers required for locals in the new function onto the SYSTEM STACK, restoring them at the end. In a worst case situation, only the leaf functions (i.e. those at the end of a function calling branch) will use registers for variables properly. The remaining functions will have to use the SYSTEM and USER STACKS for local data. As these leaf functions are often very simple, the system as a whole will be rather less efficient than it could be.
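The arithmetic behind the "11 will lie idle" case can be put into a few lines of C. This is only a toy model, not anything the compiler exposes: the function name regs_free_for_caller() and the idea of passing the callee's register usage as a bit set are assumptions made for illustration, and the figure of 12 allocatable registers is taken from the text above.

```c
#include <assert.h>

#define ALLOCATABLE_REGS 12   /* "12 possible registers", per the text */

/* Count the bits set in a 16-bit register bit set. */
static int count_bits(unsigned mask)
{
    int n = 0;
    for (mask &= 0xFFFFu; mask != 0; mask >>= 1)
        n += (int)(mask & 1u);
    return n;
}

/* Hypothetical model: how many registers may a caller use for its own
   locals?  With no knowledge of the callee (mask_known == 0) the caller
   must assume every register is taken, so the answer is none. */
int regs_free_for_caller(int mask_known, unsigned callee_mask)
{
    int used;

    if (!mask_known)
        return 0;

    used = count_bits(callee_mask);
    assert(used <= ALLOCATABLE_REGS);
    return ALLOCATABLE_REGS - used;
}
```

In this model, a callee that uses a single register leaves the caller nothing when its usage is unknown, and 11 registers once it is known.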

How The Optimisation Works

Global register optimisation makes sure that all functions in a branch get the chance to make best use of the available registers. To do this, the compiler needs to know exactly what each function in the system requires in the way of registers for local data. Unfortunately, this information is not known until the compiler has compiled all modules and even then, it is really only the linker which has a "global" view of the program.

Hence the global register optimisation is an iterative process, as follows:

(i) The linker scans each module, records the register requirements of each function and stores this information as a code or "MASK" in a new file with the extension ".REG". The REGISTER MASK is a 16-bit number which can be attached to a C function prototype to indicate what registers this function requires.

(ii) The compiler runs once again but this time examines the .REG file produced by the linker to see what registers are required by any function that is being called from the function currently being compiled.

(iii) The linker runs again and updates the REGISTER MASKS in the .REG file to reflect the total register requirement of a complete or partial function branch.

(iv) The compiler runs again, based on this new information in the .REG file, trying to get more local data into register.

(v) The compile-link-compile cycle will run up to 9 times, with the linker reporting the PASS number to the user via uVISION each time.
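The steps above amount to repeating a pass until the masks stop changing. The following is a minimal sketch of that idea in plain C, not the actual linker algorithm: the struct func_info layout, the function names and the MAX_FUNCS limit are all hypothetical, and only the 16-bit mask and the 9-pass limit come from the text.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_FUNCS  8   /* hypothetical limit for this sketch */
#define MAX_PASSES 9   /* the cycle runs up to 9 times, per the text */

/* Hypothetical record for one function in the call graph. */
struct func_info {
    unsigned own_mask;            /* registers the function itself uses  */
    unsigned total_mask;          /* own registers plus all its callees' */
    int      callees[MAX_FUNCS];  /* indices of the functions it calls   */
    size_t   n_callees;
};

/* One pass: fold each callee's total mask into its caller.
   Returns nonzero if any mask changed. */
static int update_masks(struct func_info *f, size_t n)
{
    int changed = 0;
    size_t i, c;

    for (i = 0; i < n; i++) {
        unsigned m = f[i].own_mask;
        for (c = 0; c < f[i].n_callees; c++)
            m |= f[f[i].callees[c]].total_mask;
        if (m != f[i].total_mask) {
            f[i].total_mask = m;
            changed = 1;
        }
    }
    return changed;
}

/* Repeat passes until nothing changes or MAX_PASSES is reached.
   Returns the number of passes performed. */
int propagate_masks(struct func_info *f, size_t n)
{
    int pass = 0;

    assert(n <= MAX_FUNCS);
    while (pass < MAX_PASSES) {
        pass++;
        if (!update_masks(f, n))
            break;
    }
    return pass;
}
```

For a chain main() -> f0() -> f1(), the information from the leaf takes one pass per call level to reach main(), plus a final pass that confirms nothing changed. This gives a feel for why a deeply nested program may need several passes.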

 

Catering For Assembler-Coded Functions

The register usage of the user's assembler-coded functions can be incorporated into the scheme by using a manually-defined REGISTER MASK in the function prototype included in C source files:

extern void asm_func0(void) @0x0010 ; // This assembler routine uses just R4.

The user can derive the correct MASK from this table:
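As a rough aid, a mask can also be built up in code. The sketch below ASSUMES that bit n of the mask corresponds to register Rn, which is consistent with 0x0010 standing for R4 in the prototype above but is not otherwise confirmed here; the table remains the authority. REG_BIT and reg_mask() are hypothetical names, not part of the Keil tools.

```c
#include <assert.h>

/* ASSUMPTION: bit n of the 16-bit REGISTER MASK stands for register Rn. */
#define REG_BIT(n) (1u << (n))

/* Build a mask from a list of register numbers (0..15). */
unsigned reg_mask(const int *regs, int count)
{
    unsigned mask = 0;
    int i;

    for (i = 0; i < count; i++) {
        assert(regs[i] >= 0 && regs[i] <= 15);
        mask |= REG_BIT(regs[i]);
    }
    return mask;
}
```

Under this assumption, a routine that uses only R4 gets 0x0010, and one that uses R4 and R5 gets 0x0030.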

On a simple test program with functions nested 4 deep from main(), the runtime was reduced from 36.2us to 31.3us, i.e. by about 13%. No other C166 compiler has this capability, which is one reason why the Keil C166 compiler is the best one for high performance real time systems!

There are a number of things that need to be taken into account if you decide to use this powerful facility:

(i) On a large program, it might take 9 attempts to fully optimise the register allocation. This will greatly extend the program build time, although in these days of 233MHz PCs, this may not be a real problem.

(ii) A version of a program built with the optimisation is not the same as one built without it, as this process materially alters the code. Therefore it should be enabled throughout the development period and must be active during validation exercises prior to release.

The global register optimisation is enabled by ticking the box, as shown below: