Quart Into A Pint Pot? Grafting a C161 into an 8032 socket
After six years' development, not only was the software for the modem/terminal unit in question well over the natural 64kB code size barrier, but the need to add new real-time features threatened to completely overload the 16MHz 8032 currently being used. The 350kB of code and constant data was held in two bank-switched 27C020 256kB EPROMs, with P1.0 to P1.2 used as additional address lines. The need to control four separate modem channels in the new application required a very fast switch between the individual tasks serving each channel, something that the poor old 8032 was unlikely to be able to handle. After initially trying the internally clock-tripled DS320 and the MCS251 as simple plug-in upgrades to solve
the speed problem, it was decided that something more radical was needed to
overcome the lack of memory space.
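To illustrate what bank-switching implies for the software, the following is a minimal sketch of how a linear EPROM address divides into a bank number (driven onto P1.0 to P1.2) and a 16-bit offset. The names `split_linear` and `banked_addr_t` are hypothetical, and the sketch assumes eight plain 64kB banks with no common area, which may not match the original linker layout.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: 8 banks of 64 kB, selected by three port pins (P1.0-P1.2). */
#define BANK_SIZE   0x10000UL   /* 64 kB visible to the 8032 at any one time */
#define BANK_COUNT  8UL         /* three select lines give 2^3 banks */

typedef struct {
    uint8_t  bank;    /* value to drive onto P1.0-P1.2 (acts as A16-A18) */
    uint16_t offset;  /* 16-bit address actually emitted by the 8032 */
} banked_addr_t;

/* Split a linear EPROM address into bank number and in-bank offset. */
static banked_addr_t split_linear(uint32_t linear)
{
    banked_addr_t r;

    assert(linear < BANK_SIZE * BANK_COUNT);  /* must fit in 512 kB */
    r.bank   = (uint8_t)(linear >> 16);
    r.offset = (uint16_t)(linear & 0xFFFFUL);
    return r;
}
```

Every call that crosses a bank boundary has to perform this kind of split and rewrite the port pins, which is the overhead a linear address space avoids.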
Despite the fact that the 8032 was in a 44PLCC package and the C161 is an 80PQFP, it was decided to physically graft the C161 into the existing socket to assess its suitability for the upgraded product. C161 fans will know that it is a true 16-bit processor and is usually used in 16-bit bus systems. However, if needed, it can be used in a normal 8032-style 8-bit multiplexed bus design, and this is what was done.
The C161 was mounted on a special chip carrier with a PGA footprint which was
soldered into some prototyping board. A special male 44PLCC block was obtained
to fit the 44PLCC socket on an existing modem board, and this block was
wire-wrapped to the C161 chip carrier. Obviously, with 80 pins on the C161, it
was not possible to wire them all across to the 8032 socket. However, the 8032's
INT0 and INT1 inputs were taken to P2.9 and P2.10 on the C161, as these pins can
produce interrupts. The C161 serial port design was based on the 8032's, so the
appropriate pins were simply wired across.
In 8-bit multiplexed mode, the C161's lower 16 bits of the bus are entirely on port 0, with A16-A19 on port 4. Thus port 0 was wired pin-for-pin from the C161 to the socket, whilst the lower three bits of port 4 (A16-A18) were taken to the P1.0 - P1.2 positions on the socket. The pull-down resistor required to set the 8-bit multiplexed bus mode was attached to port 0 on the C161 carrier. The 8032's Harvard architecture means that the EPROM and RAM are effectively at the same physical addresses, something that at first caused some head-scratching as the C161 has a linear von Neumann memory space.
The solution was quite simple: when the C161 first comes out of reset, it puts
the chip select 0 line (/CS0) low and emits address zero. This signal roughly
corresponds to /PSEN on the 8032, and so /CS0 was wired to /PSEN. To make
/CS0 behave more like /PSEN, the C161 software reconfigured the pin to act as a
read chip select so that it would only go low when the C161 attempted to read
program code from the EPROM. This technique effectively gives the C161 an extra
/RD line that is only active over a software-programmable memory range. As ALE
on the two CPUs corresponds exactly, the two pins were linked directly together.
The XDATA RAM and memory-mapped DUART were a little more difficult, in that the original design relied on A15 to enable the RAM over the first 32kB and thus mapped the DUART at X:0x8000. The /RD and /WR lines served to enable both devices. Here, the /CE lines to the RAM and DUART had to be isolated from A15 and attached to chip select 1 (/CS1) and chip select 2 (/CS2) on the C161 respectively. In the C161 software, /CS1 was mapped to 0x80000 over a 32kB range so that the RAM appeared to be just above the 512kB EPROM block, with the DUART (/CS2) mapped to 0x88000 over a 4kB range. One problem which soon reared its head was that the faster /RD and /WR cycles on the C161 meant that the DUART could not be reliably accessed. The solution was to use a software-driven register (BUSCON2) to introduce a waitstate over the chip select 2 memory range for the DUART but still leave the RAM running at full speed. As the C161 allows differing bus characteristics in each of its four chip select ranges, this was not a problem. The /RD and /WR lines on the C161 were then taken to the corresponding positions on the 8032 socket.
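The resulting linear memory map can be sketched as a small host-side address decoder. This is only an illustration of the three ranges quoted above, not the actual chip select register setup: the identifiers `memory_map` and `decode_address` are hypothetical, and the alignment check reflects an assumption that each chip select window must start on a multiple of its own size.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef enum { SEL_NONE, SEL_CS0_EPROM, SEL_CS1_RAM, SEL_CS2_DUART } chip_select_t;

typedef struct {
    uint32_t      base;  /* first address of the window */
    uint32_t      size;  /* window length in bytes */
    chip_select_t sel;   /* chip select asserted inside the window */
} region_t;

/* Ranges taken from the text: 512 kB EPROM, 32 kB RAM, 4 kB DUART window. */
static const region_t memory_map[] = {
    { 0x00000UL, 0x80000UL, SEL_CS0_EPROM },
    { 0x80000UL, 0x08000UL, SEL_CS1_RAM   },
    { 0x88000UL, 0x01000UL, SEL_CS2_DUART },
};

/* Return the chip select that would respond to a given linear address. */
static chip_select_t decode_address(uint32_t addr)
{
    size_t i;

    for (i = 0; i < sizeof memory_map / sizeof memory_map[0]; i++) {
        const region_t *r = &memory_map[i];

        /* Assumption: a window starts on a multiple of its own size. */
        assert(r->size != 0 && (r->base % r->size) == 0);
        if (addr >= r->base && addr - r->base < r->size) {
            return r->sel;
        }
    }
    return SEL_NONE;  /* nothing mapped at this address */
}
```

With this layout, the old X:0x0000 RAM address becomes 0x80000 and the old X:0x8000 DUART address becomes 0x88000, so the two devices keep their 32kB spacing.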
The clock lines were very straightforward as they were simply wired one-for-one to the C161. The reset was more awkward in that the 8032 has an active-high reset and the C161's is active-low. Fortunately, a spare inverter was easily pinched from the existing 74HC04 to invert the reset from the 8032 socket. The original 11.059MHz clock was swapped for a 12MHz part just to keep the C161's timers running at a (vaguely) sensible 666ns per count rather than 723ns per count! As the C161 has a proper programmable baudrate generator, the clock speed could be more freely chosen than before.
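The two timer resolutions quoted above can be reproduced with a little arithmetic. The sketch below assumes a timer input prescaler of 8 (inferred from the figures in the text, since 8 / 12MHz is about 666ns) and assumes the original crystal was the usual 11.0592MHz part; `timer_count_ns` is a hypothetical helper name.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: the timer counts CPU clock divided by 8. */
#define ASSUMED_TIMER_PRESCALER 8ULL

/* Timer resolution in whole nanoseconds (fractional part truncated). */
static uint32_t timer_count_ns(uint32_t cpu_hz)
{
    assert(cpu_hz > 0);
    return (uint32_t)((ASSUMED_TIMER_PRESCALER * 1000000000ULL) / cpu_hz);
}
```

Under these assumptions, `timer_count_ns(12000000UL)` gives 666 and `timer_count_ns(11059200UL)` gives 723, matching the figures in the text.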
After some fiddling and with the aid of an in-circuit emulator, the C161 was coaxed into life using a simple test program which set up the chip selects and waved P2.11 up and down just to give something tangible to see on a 'scope.
Despite the huge size of the software, the port to the C161 was not particularly difficult, as most of it consisted of huge look-up tables and text strings. The register names of many C161 peripherals are either identical to those of the 8032 or at least similar enough to make the renaming easy. Timer0 had been used in the original to generate a system "tick". In the C161, T3 in reload mode from T2 was used to quickly implement this function. The professional manner in which the software had been written greatly assisted the general porting, much to everybody's relief, as did the fact that the C51 and C166 compilers came from the same manufacturer (Keil).
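As a rough illustration of the system tick, the sketch below works out how many timer counts make up one tick period. The 1ms tick rate and the prescaler of 8 are assumptions for the sake of the example, and `counts_per_tick` is a hypothetical helper. The exact value to load into T2 may differ by one depending on how the reload is triggered, so it would need checking against the C161 user's manual.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: the timer counts CPU clock divided by 8. */
#define ASSUMED_TIMER_PRESCALER 8UL

/*
 * Number of timer counts in one tick period. If 'exact' is not NULL it is
 * set to 1 when the tick divides into a whole number of counts, else 0.
 */
static uint32_t counts_per_tick(uint32_t cpu_hz, uint32_t tick_hz, int *exact)
{
    uint32_t timer_hz;

    assert(tick_hz > 0);
    timer_hz = cpu_hz / ASSUMED_TIMER_PRESCALER;
    if (exact != NULL) {
        *exact = (cpu_hz % ASSUMED_TIMER_PRESCALER == 0) &&
                 (timer_hz % tick_hz == 0);
    }
    return timer_hz / tick_hz;
}
```

With a 12MHz clock and a 1kHz tick this gives exactly 1500 counts, which fits comfortably in a 16-bit timer. An 11.0592MHz clock would give 1382.4 counts, so the tick could not be exact.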
As part of an attempt to speed up the original 8032 code, some assembler had
been used to code one of the interrupt routines. While there are some superficial
similarities between 8032 and C161 assembler mnemonics, the job of translating
was not looked upon with relish. Fortunately, an 8032
to C161 code translator utility was found to be available from a German university
which did manage to convert the small 8032 interrupt routine into something
that would assemble under A166. Even more surprising was that it executed correctly
once converted!
The hardest part was converting all the special 8051 pointer types. As extensive use had been made of the C51 compiler's memory-specific pointers, some editing was required to remove all the special 8051 keywords. With the C161 compiler's HLARGE model, no special steps are necessary to increase pointer efficiency, although the HOLD control can be usefully employed to improve access times to frequently accessed objects by putting them in the on-chip IDATA RAM.
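In the port described here the 8051 keywords were simply edited out. Where a single source tree must still build for both processors, one alternative (not the method used in this project) is to hide the memory-space keywords behind macros. The sketch below assumes the Keil C51 compiler's predefined `__C51__` macro; the macro names `XDATA` and `CODE` and the `table_checksum` function are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Memory-space keywords exist only when building with Keil C51. */
#if defined(__C51__)
  #define XDATA xdata
  #define CODE  code
#else
  #define XDATA          /* expands to nothing on the C161 or a host PC */
  #define CODE
#endif

/* Sum the bytes of a constant look-up table held in code memory. */
static unsigned int table_checksum(const unsigned char CODE *table,
                                   unsigned int len)
{
    unsigned int sum = 0;

    assert(table != NULL);
    while (len--) {
        sum += *table++;
    }
    return sum;
}
```

On the 8032 build the pointer remains a compact memory-specific pointer, whilst on the C161 the same declaration becomes an ordinary pointer.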
One curious side-effect of the huge increase in performance was that some dubious 8051 code had relied on the leisurely runtime of the 8032 to introduce delays into some user-operated input key functions, which suddenly required lightning reaction times from the operator to keep up!
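A common cure for this sort of problem is to time such delays against the system tick rather than against instruction execution speed. The following is a minimal sketch of that idea, not the fix actually applied in the project. It assumes a free-running 16-bit tick counter, and `interval_elapsed` is a hypothetical helper.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Return non-zero once at least 'interval' ticks have passed since 'start'.
 * The unsigned subtraction stays correct when the 16-bit tick counter
 * wraps round from 0xFFFF to 0.
 */
static int interval_elapsed(uint16_t now, uint16_t start, uint16_t interval)
{
    assert(interval > 0);  /* a zero interval would always be "elapsed" */
    return (uint16_t)(now - start) >= interval;
}
```

A key-repeat routine would record the tick count when the key was first seen and only act again once `interval_elapsed()` returns non-zero, giving the same delay regardless of how fast the CPU runs.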
After about three days, the CPU transplant was up and running, with most software functions working. The code size was about 10% smaller than on the 8032, which was something of a surprise as the C161 has mainly two- and four-byte instructions. Some simple performance measurements revealed that overall, the C161 version was about 8 times faster than the 8032 for roughly the same clock speed, but specific sections that involved memory copying and 32-bit maths were about 11 times faster. The overhead of bank-switching on the 8032 design tends to magnify the differences between the two versions, whereas on the C161 the addressing of objects over a potentially 4MB range does not introduce any software overhead. Oddly, the current consumption was slightly less. The use of 150ns EPROMs in the 8032 design realistically limits the clock speed to about 12MHz. However, in a real C161 design, the 8-bit non-multiplexed mode would be used, with the result that the 573 address latch could be dispensed with and the CPU would run about 25% faster.
The result of the exercise was that the board was re-laid out for the C161 with a 12MHz clock and no address latch. The overall hardware design is now simpler than before and the software is considerably easier to follow, as all the contortions required for bank-switching have been removed. The opportunity was taken to add space for 1MB of FLASH EPROM. The C161's integral bootstrap mode allows it to boot up with a blank EPROM, which it can then program via a serial port download. The 32kB RAM was increased to 128kB for "future expansion". Although the C161 can run at up to 16MHz and use a proper 16-bit wide bus, the increase in performance over the 8032 was judged adequate for the foreseeable future. The possibility of quadrupling performance by using the C161 to its full was left in reserve to give a very wide safety margin to meet unforeseen requirements.