Archive for the 'CPU of the Day' Category

July 26th, 2015 ~ by admin

Sun CoolThreads UltraSPARC T1 Sample

Sun UltraSPARC TI Marketing Sample

The Sun UltraSPARC IV consumed 105 Watts at 1350 MHz, and that was for a dual core processor that could process only 2 threads.  Sun decided that the T1 (aka Niagara) was going to change that.  It was the first ground up redesign of the SPARC core since the UltraSPARC III.  Interestingly, Sun first attempted to develop a multithreaded processor by putting a pair of UltraSPARC II cores on a single die.  That project was canceled in 2004, as the T1 was in development.

The T1 was designed to focus on maximum processor utilization.  It contained up to 8 cores, each of which could process 4 threads, for up to 32 threads per chip.  This allows the processor to be used more efficiently, as a single stalled thread cannot slow down the entire processor.  All 8 cores share a single floating point unit.  This worked well for most database type processing, as FP instructions are not very common in that type of computing.  The T2 (made on a smaller process) provided an FP unit for each core, which allowed better performance in HPC applications.

Made by TI on a 90nm process, the T1’s 279 million transistors consume only 72 Watts, roughly a 30% reduction from the UltraSPARC IV at a similar clock speed.  This is what Sun called CoolThreads Technology.  With its release in November of 2005 Sun was a bit ahead of its time; lower power, more efficient processors were only just beginning to become an important selling point.  Interestingly, its sister project, the UltraSPARC Rk, turned out to be not so cool.  Today, 10 years later, energy efficiency is one of the key metrics when measuring processor performance.  With data centers having on average 50,000 computers, 30 Watts per chip adds up quickly.
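
To put rough numbers on that, here is a small sketch using only the wattages and the data center size quoted above.  It assumes one processor per computer, which is a simplification of mine, and the function name is just illustrative.

```python
# Figures quoted in the post; one chip per computer is an assumption.
ULTRASPARC_IV_WATTS = 105
T1_WATTS = 72
COMPUTERS_PER_DATA_CENTER = 50_000


def power_savings(old_watts, new_watts, chips):
    """Return (watts saved per chip, percent reduction, total kilowatts saved)."""
    saved = old_watts - new_watts
    percent = 100 * saved / old_watts
    total_kw = saved * chips / 1000
    return saved, percent, total_kw


saved, percent, total_kw = power_savings(
    ULTRASPARC_IV_WATTS, T1_WATTS, COMPUTERS_PER_DATA_CENTER
)
print(f"{saved} W per chip ({percent:.1f}% lower), "
      f"{total_kw:.0f} kW across the data center")
```

Under those assumptions the difference is 33 Watts per chip, a bit over 31%, or about 1.65 megawatts for the whole building before cooling is even counted.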

July 16th, 2015 ~ by admin

TI SN74LS481: A Better Bit-Slicer

TI SN74LS481J -1980 – 8 MHZ 4-bit Slice

The 1970s were a rush to design new and innovative processors: faster, with more features and more bits.  Most of the processors were new designs; a few were single chip implementations of older minicomputers (such as the TMS9900 and the Intersil 6100).  At the same time there was a competition among 4-bit slice processors, somewhat remarkable in 1976 considering 16-bit designs were now being released.  The most famous was of course the AMD AM2901, which undoubtedly won the battle.  There were others: MMI (a company which AMD would go on to merge with) had the 6701, Motorola had the MC10800, made in ECL, and Intel made the ill-fated (probably since it was only 2-bits) Intel 3002.  TI made the SBP0400 in I2L, which enjoyed some success, but that apparently wasn’t enough.  In 1976, the same year as the SBP0400, the 6701 and the AMD AM2901, TI released the SN74S481.  This was a Schottky TTL 4-bit slice processor (with the SN74S482 sequencer to go with it).  It was a bit different than its competition.

June 11th, 2015 ~ by admin

Dallas: Reaffirming the Viability of the 8-bit Processor

“The introduction of the Dallas Semiconductor DS87C520 reaffirms the viability of 8-bit processors for new and demanding applications.”  Those were the words written about the Dallas DS87C520 (and its ROMless version, the DS80C320) in 1994.  The Intel MCS-51 architecture it was based on had been released 13 years prior, in 1981, and ran at up to 12MHz.  By 1994 the Pentium had been released, with speeds of up to 100MHz.  Full 64-bit processors were also available, yet the 8-bit processor continued to hold on, and grow.

Dallas Semiconductor was founded in 1984 by former Mostek employees.  Their first products were lithium battery backed SRAMs, a product pioneered by Mostek.  Dallas added power saving and sensing circuitry to them though, greatly enhancing their usefulness.  In 1987 they combined this with an MCS-51 microcontroller to make the DS5000, which ran at 16MHz and provided battery backed SRAM.

With the release of the DS87C520 in 1994 they redesigned the MCS-51 core, allowing it to complete a machine cycle in 4 clocks vs the original 12.  The new parts were drop-in compatible, providing a simple speed up for 8051 systems.  The max clock was also raised, to 33MHz, and Dallas added additional interrupts, 16K of EPROM, an extra 1KB of SRAM and many power saving features/modes.  Other companies (such as Philips and Atmel) also began to make enhanced 8051s, including things such as Flash memory and expanded instructions/features.
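
The effect of the shorter machine cycle is easy to work out.  The sketch below only compares machine cycle rates from the clock figures given above; real program speed up depends on the instruction mix, since not every instruction takes the same number of machine cycles.  The function name is my own.

```python
def machine_cycles_per_second(clock_hz, clocks_per_machine_cycle):
    """Machine cycles completed each second at a given oscillator frequency."""
    return clock_hz / clocks_per_machine_cycle


# Original MCS-51: 12 clocks per machine cycle at 12 MHz.
original = machine_cycles_per_second(12_000_000, 12)
# Redesigned core dropped into the same 12 MHz system.
same_clock = machine_cycles_per_second(12_000_000, 4)
# Redesigned core at its 33 MHz maximum.
max_clock = machine_cycles_per_second(33_000_000, 4)

print(f"original:   {original / 1e6:.2f} M machine cycles/s")
print(f"same clock: {same_clock / 1e6:.2f} M ({same_clock / original:.2f}x)")
print(f"33 MHz:     {max_clock / 1e6:.2f} M ({max_clock / original:.2f}x)")
```

So simply swapping the chip triples the machine cycle rate at the same crystal, and a 33MHz part is over 8 times the rate of a 12MHz original.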

It’s now 2015, and the 87C520 continues to be made, as do hundreds of other MCS-51 parts.  It was surprising in 1994 that the 8-bit processor continued to be viable, and perhaps to some even more so that 21 years later it is still viable and shows no signs of slowing down.  The recent push into the Internet-of-Things (IoT) market has 8-bit MCUs in demand yet again.  While many companies have marketed numerous 16-bit and 32-bit designs as ‘a migration path from 8-bit’, that migration is yet to be seen.  There simply is no reason, no need, and no desire to plug a 32-bit processor in where an 8-bit processor, implemented in a few thousand transistors, will do nicely.

June 2nd, 2015 ~ by admin

MG80386SX: Pin counts: How low can you go?

Intel MG80386SX16 in an 88-pin PGA

Seeing this pin out, the first processor that comes to mind probably isn’t an Intel 80386.  The 80386DX came in a 132 pin package (PGA or QFP) and the 386SX came in a 100 pin QFP.  The 386SX was the low end version of the 386.  It made do with 16 bits of data bus and 24 bits of address, as opposed to the full 32-bit buses of the DX.  This accounts for 26 fewer pins (16 data + 7 address, 2 data byte selects and a 16/32 bit pin).  That covers all but 6 of the difference in package sizes.  Where are the rest from?  As with most processors, the signaling pins are not the only pins used, or not used, on a package.

The 80386DX has 84 signal pins, pins that carry information to or from the processor.  It also has 40 pins for power and ground.  In the early days, when processors had 40 pins or fewer, it made sense, and was feasible, to have a single power and ground for the entire chip.  As complexity increased, routing became harder, and it became easier to have multiple power and ground pins to the die, not to mention electrically more stable, as current requirements were also increasing.  In addition the 386DX has 8 pins not used at all.  These are known as ‘No Connects.’  They are reserved for future use, or were there for testing, or were simply not needed.

Intel 5962-9453301MXA MG80386SX16 – 16MHz 80386SX – 1996 Full Milspec

Moving to the 386SX, which has 26 fewer signal pins (58), the standard 100 pin package used 10 No Connects and the rest (32) for power and ground.  The pictured 386SX is a late production (1996) military spec processor in an 88 pin package.  88 pins still leaves plenty (30 pins) for power, ground, and no connects.  The PGA 386SX was only produced for military/industrial uses.
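
The pin accounting can be checked with a few lines of arithmetic.  This is only a bookkeeping sketch built from the counts quoted in this post; the class and field names are made up for illustration and nothing here comes from an Intel datasheet.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PinBudget:
    name: str
    package_pins: int
    signal_pins: int

    @property
    def spare_pins(self):
        """Pins left over for power, ground, and no connects."""
        return self.package_pins - self.signal_pins


# Signal pins the SX gives up relative to the DX, as listed in the post.
DROPPED = {"data": 16, "address": 7, "byte selects": 2, "16/32 bit pin": 1}

dx = PinBudget("386DX, 132 pin", 132, 84)
sx_qfp = PinBudget("386SX, 100 pin QFP", 100,
                   dx.signal_pins - sum(DROPPED.values()))
sx_pga = PinBudget("386SX, 88 pin PGA", 88, sx_qfp.signal_pins)

for part in (dx, sx_qfp, sx_pga):
    print(f"{part.name}: {part.signal_pins} signal, "
          f"{part.spare_pins} for power/ground/no connects")
```

The DX ends up with 48 non-signal pins (40 power and ground plus 8 no connects), the QFP SX with 42, and the PGA88 SX with 30, matching the figures above.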

Why use an expensive PGA package on a low end SX processor?  The reduced bus sizes were plenty for many industrial applications, while the ceramic package was much more reliable, and mechanically stronger when soldered on to a board, than a plastic QFP.  The PGA could work over the entire military specification for temperature, voltage, etc.  It’s likely the 386SX could run on an even smaller pin count, but the PGA88 package was a standard package already in production, which often dictates how many pins a processor will have.  The same is true today: pin-count is usually driven more by what works for the package than by what the processor strictly needs.

May 20th, 2015 ~ by admin

TI TMS7000: The SCAT Microcontroller

TI TMX70P81 – Early 8K Piggyback Prototype. Never released

The 1980s brought many 8-bit microcontrollers to the market, including such famous designs as the Intel MCS-51, the Zilog Z8, and the Motorola MC680x.  There were many others as well, including TI’s entry into the market.  After racing into the market with one of the first microcomputers, the 4-bit TMS1000, and the top of the line TMS9900 16-bit processor, TI saw the need to fill in the middle: the 8-bit market.  TI didn’t want to make the 7000 series just another 8-bit MCU either; they wanted something different, not so different as to be eccentric, but something to set them apart.  They did so with an innovation they called SCAT.

TMS7020 (2K EPROM + 128 bytes RAM) SCAT Layout. Notice the ‘strips’ that form the different sections of the MCU (click to enlarge)

SCAT, Strip Chip Architecture Topology, was TI’s die layout design for the TMS7000.  Instead of generating each of the blocks for the chip (ALU, ROM, RAM, etc.), making them as small as possible, and then using random logic to tie them all together, TI laid them out in strips on the die: the ROM in a strip, the RAM in a strip, and the ALU etc. in another.  This allowed the sections to be wired up with a minimum of random logic, resulting in a smaller die that was also easier to test.  More importantly it allowed the TMS7000 to be easily expanded.  Adding more ROM or RAM didn’t require redoing the entire layout; it was just added to its respective ‘strip’.

May 3rd, 2015 ~ by admin

AMD AM29501: 8-bits to the ByteSlice

AMD AM29501DC – 10MHz Byte Slice

AMD is well known for its 2901 bit-slice processor of the 1970s (which was made well into the 1990s), as well as the previously detailed AM29116 16-bit processor released in 1981.  However, the 1980s brought another AMD design as well; though not as complicated, it is no less interesting.  In 1981, there was no clear DSP (Digital Signal Processor) architecture, or really any purpose built design.  The Signetics 8X300 was well suited for such work, but was not inherently designed for it.  DSP tasks were handled by other processors, or by completely custom designs.  The AM29501 was not designed as a DSP, but it was marketed as a signal processor, at least for the first 5 years of manufacture.  What the 29501 was, was a relatively fast, pipelined, byte slice processor, basically a highly upgraded AM2901.

As the name suggests, the 29501 processes data 8 bits at a time, and as a slice processor, it requires external program control (it lacks a PC (Program Counter) or sequencer).  It has an 8-function ALU and 6 sets of registers, which can be accessed independently, allowing for a pipelined architecture: multiple instructions may be issued before the first one is completed (as long as they don’t need the same resources).  While the ALU is doing an addition, more data may be fetched, or output to one of the three 8-bit buses.  AMD designed the 29501 to be able to do advanced DSP work, and such work requires multiplication, which is something the ‘501 cannot do itself.  The 29501, however, was explicitly designed to interface to the AM29516/7 16-bit multipliers.  If a multiplication is needed, the microprogram controller simply puts the data on the multiplier bus and tells the 2951x to handle it.  A fairly advanced system could be built by using a 29116, a 29516 and a 29501 together, forming a complete pipelined DSP system.  One of the first designs using the 29501 in such a way was a fingerprint recognition system, for matching images of fingerprints, a particularly intense DSP task for the 1980s.

April 18th, 2015 ~ by admin

The Forgotten Ones: Unisys SCAMP-D Mainframe

Unisys SCAMP-D – 1997 Made by LSI

Burroughs Corporation started in 1886, making it the oldest computing company still in existence.  In September 1986, Burroughs merged with Sperry Corporation (of UNIVAC fame) to form Unisys, which exists to this day.  The story of the SCAMP though begins in 1961, with the introduction of the Burroughs B5000 mainframe.  Burroughs was a bit late to the mainframe market, but entered it with a computer that was rather ahead of its time.  The B5000 was a stack based design, built from the get go with the programmer in mind: the hardware was designed around software implementation (namely the high level languages ALGOL and COBOL), rather than wrapping software around a hardware design.  This made it easy to program, and thus allowed Burroughs to take customers from IBM, who released the System/360 shortly thereafter.

In 1969 the B6500 was released, improving on the design and with it the MCP (Master Control Program).  MCP was Burroughs’ operating system, and what came to define their machines for the decades to come.  The B6000 line (like the B5000) was a 48-bit architecture.  In addition to the 48 bits of data, each word carried a 3-bit tag that told the system what that word was: code, data, type, etc.  This simplified the instruction set greatly, as instructions need not be specific to each data type; they could check the tag and know what type of data they were handling.  This also allowed for greater security: many of the buffer overflow exploits we have seen in the modern day were not possible on a Burroughs, as the tag did not allow data to be executed as code.  Essentially it could act as an NX (No Execute) flag such as is found on modern x86 processors.
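
A toy model makes the idea of a tagged word a little more concrete.  This is only a sketch of the concept: the tag values, the class names and the `fetch_instruction` helper are hypothetical and do not reflect the real Burroughs tag encoding or hardware behavior.

```python
from dataclasses import dataclass
from enum import IntEnum

DATA_BITS = 48
TAG_BITS = 3


class Tag(IntEnum):
    # Hypothetical tag values for illustration only.
    DATA = 0
    CODE = 3


class TagFault(Exception):
    """Raised when a word is used in a way its tag does not allow."""


@dataclass(frozen=True)
class Word:
    tag: Tag
    value: int

    def __post_init__(self):
        if not 0 <= self.value < (1 << DATA_BITS):
            raise ValueError("value does not fit in 48 bits")
        if not 0 <= int(self.tag) < (1 << TAG_BITS):
            raise ValueError("tag does not fit in 3 bits")


def fetch_instruction(memory, address):
    """Return the word at address, but only if it is tagged as code."""
    word = memory[address]
    if word.tag != Tag.CODE:
        raise TagFault(f"word at {address} is tagged "
                       f"{Tag(word.tag).name}, not CODE")
    return word


memory = [Word(Tag.CODE, 0x1234), Word(Tag.DATA, 0xDEADBEEF)]
fetch_instruction(memory, 0)        # a code word: allowed
try:
    fetch_instruction(memory, 1)    # a data word: refused
except TagFault as err:
    print(err)
```

In this model, data written into memory by an overflowing buffer stays tagged as data, so an attempt to jump to it is refused rather than executed.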

In 1984 the first A-series was released, as well as what would become ‘e-code’, a definition of the Burroughs instruction set that could be implemented in a variety of processors.  Like DEC with the VAX, Burroughs wanted to clearly define the instruction set, and leave the implementation of it up to the hardware designers.  This helped ensure robust compatibility and future proofing, and is why MCP programs from the 70’s can still be run today.

April 9th, 2015 ~ by admin

The e2v PowerPC and HiTCE Packages

Atmel PC7410MGH450LE – Motorola Marked Package – 2003

In the 1970s second sources were quite important in the processor industry.  They provided a stable supply of a designed-in part if the primary manufacturer (which often only had a fab or two) had problems.  They could also widen the market for the processor.  Many of these agreements were kept active for decades after, with some interesting results.

Motorola licensed many of their designs to SGS, which later merged with Thomson to become STMicroelectronics, though the Thomson name was still used.  Thomson built most of Motorola’s product line under license, as well as many high reliability versions.  In 1999 Atmel bought Thomson-CSF Semiconductors, and continued to make Motorola products (in their Grenoble, France fab), which now included Motorola’s PowerPC line as well as the 68k line of processors.  This portion of Atmel was sold to e2v (in England) in 2006, which continued to produce the Motorola (now spun off as Freescale) PowerPC line, now branded as e2v.

The packaging used by e2v (and previously Atmel) is the same as that used by Motorola/Freescale.  The packages were custom made for Motorola/Freescale by Kyocera (and others), so chips with both Atmel/Motorola and e2v/Freescale markings can often be found.  It is this packaging that is of interest, as it shows an interesting aspect of processor design.

March 1st, 2015 ~ by admin

DEC Rigel: VAX Shoots for the Stars

DEC 78032 DC333R MicroVAX II – 5MHz

DEC’s 32-bit VAX architecture saw many implementations since its introduction in 1977.  Early implementations were all multi-chip, but as technology improved the VAX architecture could be implemented (at least partially) on a single VLSI chip.  The first implementation on a single chip was the MicroVAX II released in 1985.  It contained 125,000 transistors, made on a 3 micron NMOS (DEC proprietary ‘ZMOS’) process and ran at 5MHz (200ns cycle time).

In 1987 DEC released the CVAX, the second generation VAX on VLSI.  The CVAX was made on DEC’s first CMOS process, a 2 micron design using 175,000 transistors and clocked from 10-12.5 MHz (100-80ns cycle time).  The input clock was a four-phase overlapping clock (so the input frequency was 4x the cycle frequency, or 40-50MHz).  Performance was 2.5-3 times better than the MicroVAX II.  About half the gain was from process improvement (increased clock speed), while the rest was from architectural changes (mainly pipelining).
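
Cycle time is just the reciprocal of the cycle frequency, so a 10 to 12.5 MHz cycle rate works out to 100 to 80 ns.  A quick sketch of the conversion, with function names of my own choosing, using only the frequencies given above:

```python
def cycle_time_ns(cycle_mhz):
    """Cycle time in nanoseconds for a given cycle frequency in MHz."""
    return 1000.0 / cycle_mhz


def input_clock_mhz(cycle_mhz, phases=4):
    """Input clock needed when each cycle is built from several clock phases."""
    return cycle_mhz * phases


for name, mhz in [("MicroVAX II", 5.0), ("CVAX, slowest", 10.0),
                  ("CVAX, fastest", 12.5)]:
    print(f"{name}: {mhz} MHz -> {cycle_time_ns(mhz):.0f} ns cycle")

# The CVAX four-phase clock means the input runs at 4x the cycle rate.
print(f"CVAX input clock: {input_clock_mhz(10.0):.0f}-"
      f"{input_clock_mhz(12.5):.0f} MHz")
```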

DEC DC580C 78034 CVAX+ 16.67MHz

As the CVAX (and its successor the CVAX+) were released, the next generation was already being designed by DEC.  This was to be Rigel.  Rigel had a 6-stage pipeline and was made on a 1.5 micron CMOS process.  The CPU contained 320,000 transistors, 140k of which were for logic, while the remaining 180k were for memory (cache).  The separate FPU chip contained an additional 135,000 transistors.  After some early teething pains on the new CMOS process, where yields were almost non-existent, the process was finally refined enough to make commercial samples by late 1988.  The target speed for Rigel was a 40ns cycle (25 MHz clock).  This would give Rigel a 6-8x performance gain over CVAX.  2X of this was from the process shrink (and doubling of clock speed) while 3X was from the improved pipelining.  The remainder was due to increased memory performance, not the least of which was due to Rigel’s 2KB of on chip cache.
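
Those gains compound rather than add.  The sketch below assumes the three factors simply multiply, which is my own simplification, and uses it to back out how much would be left for the memory system:

```python
PROCESS_GAIN = 2.0          # process shrink and doubled clock speed
PIPELINE_GAIN = 3.0         # improved pipelining
TARGET_RANGE = (6.0, 8.0)   # quoted overall gain over CVAX

# Assumption: the individual gains multiply.
combined = PROCESS_GAIN * PIPELINE_GAIN
memory_gain_range = tuple(target / combined for target in TARGET_RANGE)

print(f"process x pipeline = {combined:.0f}x")
print(f"left for memory: {memory_gain_range[0]:.2f}x "
      f"to {memory_gain_range[1]:.2f}x")
```

On that reading the process and pipeline alone reach the low end of the range, and memory improvements supply up to another third or so at the high end.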

Rigel, however, had other plans…

February 22nd, 2015 ~ by admin

NEC SX-ACE: Quad-core Vector Supercomputing

NEC SX-ACE Processor Prototype – 2013

When vector computing is mentioned, the first company that comes to mind is Cray.  Cray has been the leading designer and builder of vector supercomputers since the 1970s.  Vector computing is a bit different than general purpose computing.  Simply put, a vector computer is designed to perform an instruction on a large set of data at the same time.  Such vector support has been added to x86 (in the form of SSE) as well as the PowerPC architecture (AltiVec), but they were not originally designed as such.  Cray, however, is not the only such company.  In 1983 NEC announced the SX architecture.  The SX-1/2 operated at up to 1.3 GFLOPS and supported 256MB of RAM per processor.  By 2001, with the SX-5 and SX-6, performance had increased to 8 GFLOPS with support for 8GB of RAM per CPU.  For a short while Cray themselves marketed and sold NEC SX computers.  Each of the processors, from the SX-1 to the SX-9, was a single core processor, but with the SX-ACE, that changed.
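
The difference between scalar and vector processing can be sketched by counting how many 'instructions' are needed to add two arrays.  This is a toy model in plain Python, not how any real vector unit is programmed, and the vector length of 256 elements is an arbitrary value chosen for illustration.

```python
def scalar_add(a, b):
    """Add element by element: one add 'instruction' per element."""
    out, issued = [], 0
    for x, y in zip(a, b):
        out.append(x + y)
        issued += 1
    return out, issued


def vector_add(a, b, vector_length=256):
    """Add in chunks: one add 'instruction' per vector of elements."""
    out, issued = [], 0
    for start in range(0, len(a), vector_length):
        chunk_a = a[start:start + vector_length]
        chunk_b = b[start:start + vector_length]
        out.extend(x + y for x, y in zip(chunk_a, chunk_b))
        issued += 1
    return out, issued


a = list(range(1000))
b = list(range(1000, 2000))
scalar_result, scalar_issued = scalar_add(a, b)
vector_result, vector_issued = vector_add(a, b)
print(f"scalar: {scalar_issued} instructions, "
      f"vector: {vector_issued} instructions")
```

Both versions produce the same sums; the vector version just issues far fewer instructions to get there, which is the whole appeal of the approach.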
