Ken Shirriff has an interesting article on reverse engineering the original ARM1 processor (as designed by Acorn, and implemented by VLSI). He goes right to the silicon to form a transistor level model/emulator of the chip. Back in 1986 when the ARM was designed and released, it wasn’t very well known, being used in very few devices. Surprisingly, this continued for over a decade, with the ARM being used in niche markets (the Apple Newton, the DEC StrongARM on RAID cards, etc). It wasn’t until the 2000s that this processor startup from England became the powerhouse it is today. Two major developments drove this: mobile and multimedia. The ARM architecture was powerful, small, and easy on the power budget. This was an obvious benefit for mobile, but also proved very useful in multimedia processing, such as controllers in DVD players, digital picture frames, MP3 players and the like. Today, hundreds of companies license and use the architecture and it is found in devices now numbering in the billions.
In January ST announced that they would be exiting the Digital Set Top Box (STB) market. This is a market that they arguably led for the last 20 years, and one that really began with their Omega processor in 1997. The ST Omega processor line, beginning with the STi5500, powered set top boxes for cable companies, satellite companies, and DVRs, as well as other TV connected devices. Open up a satellite TV receiver from the last 20 years and you are very likely to find an STi Omega chipset.
The STi5500 was the beginning, and interestingly at its core was an ST20 processor, based on the Inmos Transputer (which ST now owned) from the late 1980s. The Transputer was meant to revolutionize computing, making processors so cheap that they could be embedded into pretty much any other logic device. Today we call this an SoC, but in 1985 it was a novel idea. At the time it didn’t really succeed, but it ended up seeing its intended use 10+ years later in the Omega. In the 1980s the Transputer saw speeds of up to 30MHz; in the STi5500 it ran at 50MHz with 2K of I-cache + 2K of Data Cache as well as 2K of SRAM that could be used as data cache.
In the early 2000s the Omega was upgraded to a faster ST20 core, eventually hitting 243MHz in the STi5100, now with the caches increased to 8K each, as well as 8K of SRAM. This was getting to be the limit of the ST20 Transputer core. ST needed a core that could support higher speeds running such things as Java and Windows CE amongst other things, as well as support the higher resolutions and audio quality requirements.
ST handled this in two entirely different ways. First they licensed the SH-4 32-bit RISC core from Hitachi, a rather surprising move, but STBs were not a market Hitachi was in, so it was in both companies’ best interest. ST was also working on their own new core to replace the ST20, and they had help from a very surprising partner.
Yesterday Microchip, makers of the PIC line of microcontrollers, announced they were buying Atmel, for a cool $3.56 Billion. This isn’t entirely surprising considering the ongoing consolidation in the industry. It was only last year that Dialog attempted to purchase Atmel, and before that ON Semiconductor and Microchip did the same. In December of 2015 NXP and Freescale (formerly Motorola Semiconductors) merged, creating one of the largest microelectronics companies. These mergers do create an interesting result: product lines that were formerly competitors end up being marketed side by side. In the case of NXP and Freescale, NXP marketed many MCS-51 microcontrollers in their 8/16-bit lines, while Freescale of course sold many versions of MC6800 based MCUs. These two architectures have been rivals since the early 1980s and likely will continue to be. Perhaps the biggest rivalry in MCUs though is between Atmel and Microchip.
Microchip was spun off of General Instrument in 1987, but the PIC architecture dates back to 1976, and is still being made in nearly the same form (PIC16C55). Atmel was started in 1984, first making EPROMs, and then MCS-51 microcontrollers, as one of the very first companies to make an 8051 with on die flash memory. In a bit of a twist of fate, when Atmel started it was a fabless company. It contracted with several companies to make its EPROMs, including Sanyo and General Instrument, which, as mentioned above, became Microchip. Atmel also makes SPARC processors, and for a time made Motorola products as well (Atmel has a very convoluted history, for more info on this read here and here).
Today the PIC line continues to be popular, with devices for the low end, such as the PIC10/12, all the way to the MIPS based PIC32 on the upper end. Atmel continues to make 8051 MCUs, but also makes the 8 and 32-bit AVR line, perhaps best known today for its use in Arduino boards. They also make MCUs based on the ARM core, a competitor to both MIPS and Atmel’s own AVR32.
Likely to the consternation of many fans of either company, this merger does make sense, more so than ON or Dialog buying Atmel. While Microchip and Atmel both compete in the same markets, they do so with different architectures. Product lines are unlikely to change, and overhead savings should free up $$ both for stockholders (yawn) and engineering teams alike. No word has been given yet on whether Microchip intends to keep the Atmel branding, but perhaps they should, as an AVR MCU with a Microchip logo on it may just prove to be too much for some.
The story of the Oracle SPARC M4 is best told starting with Afara Websystems. Afara was the original developer of the SPARC processor that became the Sun UltraSPARC T1, aka the Niagara. Sun acquired Afara in 2002 in a sale that was really designed as a capital campaign for Afara. They had the technology and design for the processor, just not the money to enter the market, and Sun had the money (or so they thought at the time). The T1 was released in 2005 and had 4-8 cores. The individual cores were called the SPARC S1 core (now an open source SPARC core). In 2007 Sun released the Niagara 2, the UltraSPARC T2, with 4-8 cores, based on the second version of the S1, the S2. Both the S1 and S2 were designed with multi-threading as the primary performance point. They excelled at it, and the UltraSPARC T3, released in September 2010 (though it had been sampling all the way back in Dec. of 2009), did even better at multi-threaded applications. The T3 was also fab’d by TSMC, a change from previous SPARCs, which were almost entirely fab’d by Texas Instruments.
The T3, and the S2 core it was based on, had one major problem. The S2 core had sub-par single thread performance. While the workloads given to a SPARC server can be tailored somewhat to match what the processor does best (multi-threading), there is always going to be a point at which a single thread task must be done, and it will hold up the entire processor if it cannot be processed efficiently.
In the 1960s the Dutch Philips Data Systems marketed computers from Honeywell. By 1970 they decided that simply reselling others’ machines was not the best value for them or their customers, and set off to design their own series of mini computers. The first design was the 8-bit P410, which only saw limited success. It was a bit too mini for the early 1970s, when 16 bits or better was the standard. 1970 saw Philips begin work on its successor in Fontenay Aux Roses, near Paris, France, a project known internally as Sagittaire. It was released in 1971 as the P800 series of mini computers, starting with the P850. These were a 16-bit design, using 16 16-bit registers. It shipped with 2k x 16 bits of memory and had a cycle time of 3.2 microseconds (~312kHz). Further versions were released that supported up to 32k x 16 bits of memory and faster cycle times.
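The cycle time and memory figures above are easy to sanity check. The following is a minimal sketch; the function names are illustrative and not from any Philips documentation.

```python
def cycle_time_to_hz(cycle_time_us: float) -> float:
    """Convert a cycle time in microseconds to an equivalent rate in hertz."""
    return 1.0 / (cycle_time_us * 1e-6)


def words_to_bytes(kilo_words: int, word_bits: int = 16) -> int:
    """Size in bytes of a memory of `kilo_words` x 1024 words of `word_bits` bits."""
    return kilo_words * 1024 * word_bits // 8


print(f"P850 cycle rate: {cycle_time_to_hz(3.2) / 1e3:.1f} kHz")
print(f"Base memory:     {words_to_bytes(2)} bytes")
print(f"Maximum memory:  {words_to_bytes(32)} bytes")
```

A 3.2 microsecond cycle works out to 312.5 kHz, and the 2k and 32k word configurations correspond to 4 KB and 64 KB respectively.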
The P800 architecture used the A0 register as the Program Counter and the last register (A15) as a stack pointer. The design supported up to 64 I/O devices and 64 interrupt levels. The addressing modes include direct, register, indirect, indexed and indexed indirect types and can operate on bits, bytes (characters), words, and double words. Since the stack is maintained in memory, the stack pointer can be rewritten, preserving the current stack for easier context switches. This is of course important as the P800 is designed as a multi-user, multi-tasking computer. The P800 instruction set included 97 instructions, including MULT/DIV, though depending on the model, some of these were simulated (microcoded). The P800 family found wide use in offices and eventually banks (always the big money market) throughout Europe. It also proved to be useful in industrial environments, a somewhat underappreciated market for mini-computers at the time.
In 1979 Philips released the P851, a Single Board Computer (SBC) version of the P800 series. It included the full 32k words of memory and was an LSI implementation using 5 Philips LSIs consisting of 4 4-bit ALUs and a control path. The P851 was used extensively for industrial automation as well as in Philips’ own PM4400 computer system. This system became the basis of the PM4421 development system, which supported development and emulation of many processors, including the Intel 8085/86/88, Zilog Z80, 650x, Motorola MC68k, Signetics 2650 and many others.
The P851 LSI design was also used in space missions, perhaps the most famous being the IRAS mission launched in 1983. This was the first full Infrared mapping mission launched, and in its 10 month mission it mapped almost the entire sky in 4 different IR wavelengths, making discoveries that are even today not yet identified. The mission was of course limited by the coolant carried to keep the IR detector cold, but the IRAS satellite continues to orbit Earth to this day, with a 16-bit P851 computer still on board.
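To illustrate why a rewritable stack pointer makes context switches easy, here is a toy model. Only the register roles (A0 as program counter, A15 as stack pointer) come from the description above. The stack growth direction, the saved-context layout, and the names `ToyP800`, `make_task` and `switch_task` are assumptions made for this sketch, not the real P800 mechanism.

```python
PC, SP = 0, 15  # A0 = program counter, A15 = stack pointer (from the text)


class ToyP800:
    def __init__(self, memory_words=2048):
        self.mem = [0] * memory_words
        self.regs = [0] * 16

    def push(self, value):
        self.regs[SP] -= 1                  # assumed: stack grows downward
        self.mem[self.regs[SP]] = value & 0xFFFF

    def pop(self):
        value = self.mem[self.regs[SP]]
        self.regs[SP] += 1
        return value

    def switch_task(self, saved_sp):
        """Save A0..A14 on the current stack, then rewrite A15 to resume another task.

        Returns the outgoing task's stack pointer so it can be resumed later.
        """
        for r in range(SP):                 # push A0..A14
            self.push(self.regs[r])
        outgoing_sp = self.regs[SP]
        self.regs[SP] = saved_sp            # the actual "context switch"
        for r in reversed(range(SP)):       # pop A14..A0 from the new stack
            self.regs[r] = self.pop()
        return outgoing_sp


def make_task(cpu, stack_top, entry_pc):
    """Lay out an initial saved context (A0 = entry_pc, A1..A14 = 0) below stack_top."""
    for r in range(SP):
        cpu.mem[stack_top - 1 - r] = entry_pc if r == PC else 0
    return stack_top - SP                   # saved stack pointer for the new task


cpu = ToyP800()
cpu.regs[SP], cpu.regs[PC], cpu.regs[3] = 1024, 0x100, 42   # task A is running
task_b = make_task(cpu, stack_top=2048, entry_pc=0x200)
task_a = cpu.switch_task(task_b)            # now running task B
task_b = cpu.switch_task(task_a)            # back to task A, registers restored
```

Each task keeps its whole context on its own stack, so the only per-task state the scheduler has to remember is a single stack pointer value.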
By 1982 Siemens had firmly established themselves as a semiconductor powerhouse in West Germany, and the entirety of western Europe. Their manufacturing prowess led them to be Intel’s second source of choice in Europe, building 8008, 8080, and 8086/8 processors, with production beginning for the 186 and 286 processors as well. Siemens’ expertise was not just in second sourcing others’ work. They had their own design/development as well, doing a large amount of work for the industrial automation market as well as others.
In late 1982 they announced a new 16-bit processor, one of their own design. Production began in 1983 and continued for over a decade. The 80199 had an 8086 compatible bus, but that’s where the similarities end. The 80199 is often described as a ‘Terminal Control Processor’ or a ‘Printer Controller’, which is a bit deceptive. It was designed from the outset as a real time processor, capable of handling multiple real time tasks.
The SAB80199 was built on a 3 micron NMOS process and contains 40,000 transistors on a 45mm² die. Clock speed was 20 MHz (faster than most anything else in 1983) and it had an instruction cycle of 0.5 microseconds. It moved many of the RTOS functions from software (or an external chip like Intel’s 80130 RTOS co-processor for the 808x) to on chip hardware. It had 8 status registers, 8 instruction pointers, and 8 sets of registers. This allowed very rapid task switching, as each task’s data did not have to be saved/restored; a complete task switch took 1 microsecond. In addition, the 80199 had another feature that was rather novel at the time: cache. The processor contained an on chip instruction cache that could hold 16 16-bit instructions. For some sets of code, such as a simple loop, the entirety of the instructions would reside on chip, resulting in very fast execution. Today of course caches for data/instructions are normal, and very large, measured in KB and MB, but in 1983 they were virtually unknown.
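The banked-register idea can be sketched in a few lines. The bank count, cache size, clock speed and timings are from the description above. The number of registers per set is not given there, so `REGS_PER_BANK` is an assumption, as are all of the class and function names.

```python
from dataclasses import dataclass, field

NUM_BANKS = 8             # from the text: 8 status registers, 8 IPs, 8 register sets
REGS_PER_BANK = 8         # assumption: the size of each register set is not stated
ICACHE_INSTRUCTIONS = 16  # from the text: 16 16-bit instructions


@dataclass
class Bank:
    status: int = 0
    ip: int = 0
    regs: list = field(default_factory=lambda: [0] * REGS_PER_BANK)


class BankedCpu:
    def __init__(self):
        self.banks = [Bank() for _ in range(NUM_BANKS)]
        self.active = 0

    @property
    def current(self):
        return self.banks[self.active]

    def switch_to(self, task):
        """Switch tasks by selecting another bank; nothing is copied to memory."""
        if not 0 <= task < NUM_BANKS:
            raise ValueError(f"only {NUM_BANKS} hardware tasks are available")
        self.active = task


def loop_fits_in_icache(loop_length):
    """True if a loop of `loop_length` instructions can run entirely from the cache."""
    return loop_length <= ICACHE_INSTRUCTIONS


def clocks(duration_us, clock_mhz=20):
    """Number of clock periods in a duration, at the quoted 20 MHz."""
    return round(duration_us * clock_mhz)


cpu = BankedCpu()
cpu.current.regs[0] = 123      # task 0 state
cpu.switch_to(1)               # task 1 sees its own, untouched registers
cpu.current.regs[0] = 456
cpu.switch_to(0)               # task 0 state is still there, no restore needed
print(cpu.current.regs[0], clocks(0.5), clocks(1.0))
```

At 20 MHz the quoted 0.5 microsecond instruction cycle is 10 clocks, and the 1 microsecond task switch is 20 clocks, or the time of two instructions.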
In 1983 the ‘West Europe Report’ called Siemens’ 80199 the ‘Fast Bavarian’. Fast indeed, and it was adopted across Europe, but it never made it to the American market in any quantity. It is perhaps one of the ‘forgottens’ but certainly deserves a place in the history of real time computing.
Akatsuki, Japanese for Dawn, was launched in May of 2010 for a journey to the morning star, Venus, on a JAXA H-IIA rocket. The H-IIA flight computer runs on a space rated version of the NEC V70 32-bit processor, running the NEC RX616 RTOS. This is a processor significantly faster than that of the interplanetary probe it was launching.
“it will have a short cruise to Venus, entering its long, elliptical orbit in December. Its mission should last several years. “
In space, things don’t always go as planned…
On December 7th Akatsuki entered orbit around Venus, December of 2015 rather than 2010. Due to a valve in the fuel pressurization system not opening all the way, the orbital insertion engine ran much too lean on its attempt to enter orbit, causing it to overheat and catastrophically fail. This left the probe in a heliocentric orbit, moving away from Venus. The Japan Aerospace Exploration Agency (JAXA) was not deterred. Akatsuki’s orbit would eventually meet up with Venus again, almost exactly 5 years later. JAXA determined they could use the probe’s attitude control thrusters, which feed off the same fuel tank as the failed main thruster, to insert Akatsuki into a highly elliptical, yet still useful, orbit. Had the attitude control system used a separate fuel system (which is actually the more common design method) this would not have been possible, as it would take a relatively large amount of fuel, fuel that was available on Akatsuki due to the main engine failing and being shut down before its burn was completed. It should be noted that such a maneuver had never previously been even proposed, let alone attempted. There was however another small problem…
In 1983 memory products were still Intel’s largest source of revenue. Intel’s first product, the 3101, was a RAM, and until the memory trade wars of the early 80s memory continued to be Intel’s bread and butter. Fab 5 opened in Aloha, Oregon in October of 1978, and its primary product was memories. EPROMs, EEPROMs, SRAM, and DRAM were all fab’d here, then shipped overseas, and back to Oregon for testing. The primary testing facility for the Memory Products division was the T-5 site in Hillsboro, just a few miles from Fab 5. T-5 tested both commercial and military memory products up until 1985, when Intel exited the DRAM market in its entirety.
These OPEN HOUSE sample chips were handed out to employees and visitors at the test site during its annual open house in 1983 (apparently in many of the open houses at that time). Most likely this chip is a 2186A integrated RAM, a 64K DRAM made on a 1.2 micron HMOS-III process. The 2186 was a new design for 1985 and provided a DRAM with the same pinout as a 2764 EPROM.
Just like T-5, Intel DRAMs are no more, though the Fab 5 they were made in, which was closed in 1998, was reopened to increase Flash production, the only memory product Intel still makes. Intel’s exit from the DRAM business was certainly a risky decision back then, but it turned out to be one of the best they made. They blamed the exit on the rapidly falling prices due to ‘dumping’ of DRAMs and EPROMs (sold below cost) by Japanese semiconductor companies, but this allowed them to exit the DRAM business before DRAMs turned into the commodity they are today, with margins being almost non-existent. This allowed Intel to focus time, resources (fab capacity was in very short supply then) and money on other products, namely microprocessors and microcontrollers, the very products that have taken Intel from being one of many semiconductor companies to world leader. Perhaps they can thank those same Japanese companies they were so upset about back in 1985 for where they are today.
Before the single chip processor, the Intel 4004, TI TMS1000, or Four Phase AL-1 (depending on your school of thought), ‘processing’ was done by discrete logic. These are SSI ICs (Small Scale Integration), a step up from literal discrete transistors. Each IC contains 2-30 transistors, implementing a couple of gates.
The most famous of these is the TTL (Transistor-Transistor Logic) series developed by Sylvania in 1963. Before TTL though there was RTL (Resistor-Transistor Logic) in 1961 and, the next year, DTL (Diode-Transistor Logic), whereby diodes were added to the inputs, allowing much better fan-in. Neither of these designs had great noise immunity, which in many applications was very important. Motorola patented a modification to DTL in 1966, with production of the new MHTL family commencing in 1967-1968.
MHTL, Motorola High Threshold Logic, was designed for environments where high noise immunity was a must. Noise, really any voltage that is present and not wanted/not an actual signal, can be complicated to deal with. Motorola’s solution was to make the signal much larger. This is clearly the ‘bigger hammer’ approach to noise. Normal DTL has a turn on voltage of 1.5V (0-5V Logic), which is fairly low, and in an industrial environment, where these ICs may be controlling large motors and solenoids, is a common noise voltage. MHTL raised that to 7.5V, requiring a 15V supply. Speed suffers greatly, as the voltage must now swing from 0-15V for a logic 0 to a logic 1 on the outputs, 3MHz being a typical max compared to 40MHz for Motorola’s DTL. It should be noted that, as fast as that sounds, it’s only for a few gates. A full board of these will not be able to attain anything close to 3MHz due to propagation delays through the many ICs.
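The ‘bigger hammer’ idea can be shown with a deliberately simplified model, in which an input counts as logic 1 as soon as its voltage reaches the family’s turn on threshold. The two thresholds are the ones quoted above. The 3V spike is a made-up value for illustration, and real gates have separate high and low margins that this sketch ignores.

```python
# Turn-on thresholds in volts, as quoted in the text.
THRESHOLDS_V = {"DTL": 1.5, "MHTL": 7.5}


def false_trigger(family, noise_v):
    """Return True if noise on an idle (0 V) input would be read as logic 1."""
    return noise_v >= THRESHOLDS_V[family]


spike_v = 3.0  # hypothetical noise spike coupled in from a nearby solenoid
for family in THRESHOLDS_V:
    result = "false trigger" if false_trigger(family, spike_v) else "ignored"
    print(f"{family}: {spike_v} V spike -> {result}")
```

In this model the same spike that flips a DTL input is simply ignored by MHTL, at the cost of the larger, slower voltage swing described above.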
The pictured MHTL devices are:
| Device | Function | Transistors | Power (mW) |
|---|---|---|---|
| MC660 | Expandable 4-Input NAND (Passive pullup) | 6 | 88 |
| MC661 | Expandable 4-Input NAND (Active pullup) | 4 | 88 |
| MC662 | Expandable 4-Input NAND Line Driver | 6 | 180 |
| MC663 | Dual J-K Flipflop | 24 | 200 |
| MC665 | Triple Level Translator (for interface to DTL, RTL or TTL) | ?? | 104 |
| MC666 | Triple Level Translator | ?? | 105 |
| MC667 | Dual Monostable Multivibrator | ?? | 240 |
| MC668 | Quad 2-Input NAND Gate (Passive pullup) | 8 | 176 |
| MC670 | Triple 3-Input NAND Gate (Passive pullup) | 6 | 132 |
| MC671 | Triple 3-Input NAND Gate (Active pullup) | 9 | 132 |
| MC672 | Quad 2-Input NAND Gate (Active pullup) | 12 | 176 |
| MC673 | Dual 2-Input AND-OR-INVERT (Active pullup) | ?? | 160 |
| MC675 | Dual Pulse Stretcher/Multivibrator | ?? | 180 |
Today, noise immunity is still relevant, and much, much more complex than simply increasing the supply voltage. Higher supply voltages not only slow down switching, but they also increase power draw significantly. The MC660 pictured has exactly 2 gates (4-input NAND), consisting of 6 transistors, and still dissipates 88mW. Scaled per transistor, that would be the equivalent of an Intel 4004 dissipating about 34 Watts, or an Intel 386 needing about 4 Kilowatts. Modern noise immunity is handled by adding additional transistors (keepers, pre-chargers, etc) that can keep gates from being affected by noise, whether it’s from power/ground lines, leakages, or other sources. This allows chips with millions of transistors to operate at sub 1 Volt levels. An impressive feat.
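The per-transistor scaling behind the MC660 comparison can be sketched as follows. The MC660 figures are from the table above. The transistor counts used for the 4004 (about 2,300) and the 386 (about 275,000) are commonly cited values assumed here, and real chips certainly do not scale this simply.

```python
MC660_POWER_MW = 88       # from the table above
MC660_TRANSISTORS = 6     # from the table above

# Assumed, commonly cited transistor counts (not taken from the article).
CHIPS = {"Intel 4004": 2_300, "Intel 386": 275_000}


def scaled_power_watts(transistors):
    """Power in watts if every transistor dissipated as much as one in an MC660."""
    per_transistor_mw = MC660_POWER_MW / MC660_TRANSISTORS
    return transistors * per_transistor_mw / 1000


for name, count in CHIPS.items():
    print(f"{name}: {scaled_power_watts(count):,.0f} W")
```

On these assumptions the 4004 comes out near 34 W and the 386 near 4 kW, which is why simply raising the supply voltage stopped being a workable answer to noise.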
In early 2004 Sun Microsystems had a lot going on. The UltraSPARC IV had been announced, and Sun was already talking about its upgrade, the UltraSPARC IV+. Sun had recently released the Jalapeno, aka the UltraSPARC IIIi, their second processor with on die L2 cache (the first being the IIe designed for embedded use), in 2003. In 2002 Sun had purchased Afara Websystems for their SPARC design, known as Niagara, which became the Sun T1, and were working on its successor, the T2. Both the T1 and the UltraSPARC V (the successor to the not yet released IV) were scheduled to tape out the next year, yet the UltraSPARC V was canceled in April of 2004, and most of the engineering staff working on it was laid off.
At the same time Sun was talking up an upgrade for the lower end UltraSPARC IIIi. This would be a relatively simple process: move the existing core to a new process. It was currently being made by TI on a 130nm 7-layer Cu interconnect process with low-k dielectric. Moving it to TI’s 90nm process would allow for greater clock speeds, less power, and room on die to quadruple the L2 cache to 4MB. The processor was code named Serrano, and widely announced as an upgrade to Sun’s Fire V215, V245 and V445 servers. Sun promised a release in late 2005. And then…
Nothing. Talk of the Serrano went silent, and all PR focus shifted to the coming T1 and the UltraSPARC IV+. Both were released in 2005 to great applause, but the tech community was still wondering where the IIIi+ had gone. Sun wasn’t exactly forthcoming as to why, mentioning that it had been delayed in order to get the T1 out the door. In mid-2006 a customer commented, “There have been problems getting the UltraSPARC IIIi+ processors, so the new systems will be released with the current chips.” Finally in August of 2006 Sun came forward and said that the IIIi+ had been canceled, but there was a catch: it was canceled the year before, and Sun decided to just keep mum about it.
Keep in mind the IIIi+, other than the increase in L2 cache, was a fairly ‘routine’ port to a new process. The delays and cancellation at the time sounded like they were due to technical grounds, but looking back, and seeing that they had working silicon in 2005, it would seem that the decision to kill the Serrano was resource driven. It was likely a combination of Sun’s engineering and marketing constraints, as well as the availability of the 90nm process at TI, which was also being used for the Niagara.
Manufacturing capacity is a finite resource, so not using up what may have been a very limited amount of fab space on a processor that was designed to slot into the low end servers is possibly the best explanation we have for the cancelling of the UltraSPARC IIIi+. Perhaps a former Sun engineer can fill in some more details, as so many of those who had worked on Sun’s previous processors were laid off. It was a gamble by Sun, and one which seems to have paid off, considering the success of the Niagara, though Sun/Oracle were far from done with canceling designs. Honeybee, Rock, and M4 all come to mind.