# Chip of the Day: TRW MPY-16AJ – Making Multiplication Manageable

In primary school, students are tasked with memorizing their multiplication tables. Manually calculating 6×5 is much slower than simply committing the result to memory, and memorization lets more complex math be processed more quickly as the students’ skills develop. Typically this is limited to numbers up to 12×12, resulting in 144 results to ‘store.’ In computing the same can be done: a ROM can be used as a lookup table for multiplication. The problem is that it does not scale well. An m×n-bit multiplication requires 2^{m+n} addresses and m+n outputs, so 4×4-bit multiplication requires a 256×8 ROM. This could be handled by many ROMs available in the 1970’s. Anything much wider than 4 bits, though, was simply not practical: an 8×8-bit table already needs a 65,536×16 ROM. This gave rise to the need for multipliers that calculate the result.
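The scaling problem can be made concrete with a short sketch. The function names here (`rom_size`, `build_multiply_rom`, `rom_multiply`) are made up for illustration, and the address layout (operand A in the high address bits, operand B in the low bits) is an assumption, not something any particular ROM multiplier is documented to use.

```python
def rom_size(m, n):
    """Return (addresses, output_bits, total_bits) for an m x n-bit lookup ROM."""
    addresses = 2 ** (m + n)   # one address per operand pair
    output_bits = m + n        # widest possible product
    return addresses, output_bits, addresses * output_bits


def build_multiply_rom(m, n):
    """Fill the table; assumed layout: address = (a << n) | b."""
    return [a * b for a in range(2 ** m) for b in range(2 ** n)]


def rom_multiply(rom, a, b, n):
    """'Multiply' by a single table read, no arithmetic at lookup time."""
    return rom[(a << n) | b]


if __name__ == "__main__":
    rom = build_multiply_rom(4, 4)
    print(rom_multiply(rom, 6, 5, 4))
    for width in (4, 8, 16):
        print(width, rom_size(width, width))
```

Under these assumptions a 4×4 table is 256 words of 8 bits (2,048 bits), an 8×8 table is 65,536 words of 16 bits (about 1 Mbit), and a 16×16 table would need 2^32 words of 32 bits, which is why a lookup ROM stops being an option long before 16 bits.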

This was a problem TRW set out to rectify in 1976. TRW LSI Products was formed in the 1960’s to commercialize the transistor products that had been developed by Pacific Semiconductors, a division of TRW. It was James Buie who invented the TTL logic gate in 1961 while working for TRW. TTL went on to become the logic standard throughout the 1970’s and 80’s. TRW was involved in aerospace, helping design planes, satellites, and missiles, fields that required processing of signal data, which became known as Digital Signal Processing (DSP). In the 1980’s processors were designed to handle this, such as the TI TMS320 series, but in the 1970’s it had to be done with discrete components. DSP systems needed several building blocks: fast ADCs, ALUs, and multipliers. TRW invented fast ADCs to handle the inputs, and ALUs were available, such as AMD’s Am2901 or even the TTL series 74181s. Multipliers, however, were not widely available, especially for large bit-widths.

TRW’s first multiplier was a custom device to work with their own avionics processing system. It was made on a Bipolar process and multiplexed the entire product with the operands, using around 40 pins total. It could handle a multiply in 330ns worst case. Interestingly, yields of the device were considered ‘excellent’ at 3 working devices out of 19 per wafer (most likely a 2″ wafer). Today, yields like that would be completely unacceptable.

TRW designed the MPY-16AJ as a brute-force 16×16 multiplier. It was designed on a Bipolar process with around 3600 gates. It uses a series of AND gates and carry-save adders to implement the multiplication. There are faster methods, but they come at the cost of complexity and power draw. As designed, the MPY-16AJ dissipates 5 Watts while handling a signed (2’s complement) multiplication in a worst-case 230ns. The MPY16 was packaged in a large 64-pin package to limit the number of pins that had to be multiplexed. The lower 16 bits of the product are multiplexed with one of the operands. This is acceptable, as in many applications the upper 16 bits of the product provide sufficient accuracy. The 64-pin package allowed not only less multiplexing, but also a much larger surface for heat dissipation. A heatsink was also affixed to the package.
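The AND-gate and carry-save-adder approach can be sketched behaviourally as follows. This is only an illustration of the general technique, not a model of the MPY-16AJ’s actual gate array: the function names are invented, and the sign handling (sign-extending both operands to 32 bits and summing 32 partial-product rows modulo 2^32) is an assumption chosen for simplicity. A real array would more likely use 16 rows with a sign-correction scheme.

```python
WIDTH = 16
PRODUCT_MASK = (1 << (2 * WIDTH)) - 1  # 32-bit product register


def carry_save_add(x, y, z):
    """Reduce three words to a sum word and a carry word with no carry ripple."""
    s = x ^ y ^ z                            # per-bit sum
    c = ((x & y) | (x & z) | (y & z)) << 1   # per-bit majority, shifted left
    return s & PRODUCT_MASK, c & PRODUCT_MASK


def multiply_signed_16(a, b):
    """Signed 16x16 multiply (a, b in -32768..32767) via partial products."""
    a_ext = a & PRODUCT_MASK  # two's-complement sign extension to 32 bits
    b_ext = b & PRODUCT_MASK
    s = c = 0
    for i in range(2 * WIDTH):
        # One row of AND gates: a shifted copy of A gated by bit i of B.
        pp = (a_ext << i) & PRODUCT_MASK if (b_ext >> i) & 1 else 0
        s, c = carry_save_add(s, c, pp)
    raw = (s + c) & PRODUCT_MASK  # single carry-propagate add at the end
    return raw - (1 << 32) if raw & (1 << 31) else raw


def split_product(p):
    """Return (upper16, lower16) halves of the 32-bit product as raw bits."""
    raw = p & PRODUCT_MASK
    return raw >> WIDTH, raw & 0xFFFF


if __name__ == "__main__":
    p = multiply_signed_16(-3, 7)
    print(p, split_product(p))
```

The point of the carry-save step is that no carry has to ripple across the word until the one final addition, which is what makes a large array tolerable in speed. `split_product` mirrors the idea that a system which only needs the upper 16 bits never has to read the multiplexed lower half.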

Later versions of the MPY-16 added support for unsigned multiplication as well (the MPY16H) and became the standard for 16-bit multipliers. Compatible multipliers were made by Analog Devices (ADSP1016, 40-50ns at 150mW) and LOGIC (LMU16/216) in CMOS, by Weitek (WTL1516/A/B, 50-100ns at 0.9-1.8W) in NMOS, by Synertek (SY66016, 100ns at 1.5W) in HMOS, by AMD (Am29516, 38ns at 4W) in ECL, as well as many others. These were implemented internally with different processes and different multiplier algorithms, but externally they all mimicked the standard TRW MPY16J and served as the basis of many signal processing and high-end math computers. As a testament to its usefulness, the MPY16 was also copied by the USSR as the 1802VR5. The TRW MPY16 was last made in the mid-1980’s, but its clones continued to be made into the 1990’s. Today its functions can be handled by any DSP or CPU, or even coded into an FPGA, but for a time, the MPY16 multiplied the efficiency of many processing systems.