January 28th, 2017 ~ by admin

Stratus: Servers that won’t quit – The 24 year running computer.

Stratus XA/R (courtesy of the Computer History Museum)

Making the rounds this week is the Computerworld story of a Stratus Technologies computer at a parts manufacturer in Michigan.  This computer has not had an unscheduled outage in 24 years, which is rather impressive.  It was installed in 1993 and has served well ever since.  In 2010 it won an award for being the longest-serving Stratus computer, then at 17 years.  Phil Hogan, who installed the computer in 1993 and continues to maintain it to this day, said in 2010: “Around Y2K, we thought it might be time to update the hardware, but we just didn’t get around to it.”  In other words, if it’s not broke, don’t fix it.

Stratus computers are designed very similarly to those used in space.  The two main differences are: 1) no need for radiation-tolerant designs (let’s face it, if radiation tolerance becomes an issue in Michigan, there are things of greater importance than the server crashing) and 2) hot-swappable components.  Nearly everything on a Stratus is hot-swappable.  Stratus servers of this type are based on an architecture they refer to as pair and spare.  Each logical processor is actually made from 4 physical CPUs, arranged as 2 pairs.

Stratus G860 (XA/R) board diagram. Each board has 2 voting i860s (the pair), and each system has 2 boards (the spare).  The XP-based systems were similar but had more cache and supported more CPUs.

Each pair executes the exact same code in lock-step.  CPU check logic compares the results from the two CPUs in a pair, and if there is a discrepancy, that is, if one CPU comes up with a different result than the other, the system immediately disables that pair and uses the remaining pair.  Since both pairs are working at the same time, there is no fail-over delay; it’s seamless and instant.  The technician can then pull the misbehaving processor rack out and replace it while the system is running.  Memory, power supplies, etc. all work in similar fashion.
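
To make the pair-and-spare idea concrete, here is a minimal Python sketch of the compare-and-disable logic described above.  It is only a toy model, not Stratus's actual hardware or firmware: the names `LockstepPair`, `LogicalProcessor`, `hot_swap`, `healthy_cpu` and `flaky_cpu` are all made up for illustration, and each "CPU" is just a function that doubles its input.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional

Cpu = Callable[[int], int]  # hypothetical stand-in for one physical CPU


@dataclass
class LockstepPair:
    """Two CPUs that run the same work; any mismatch takes the pair offline."""
    name: str
    cpu_a: Cpu
    cpu_b: Cpu
    online: bool = True

    def step(self, operand: int) -> Optional[int]:
        if not self.online:
            return None
        a, b = self.cpu_a(operand), self.cpu_b(operand)
        if a != b:
            # The check logic cannot tell which CPU is wrong,
            # so the whole pair is disabled.
            self.online = False
            return None
        return a


@dataclass
class LogicalProcessor:
    """One logical CPU built from two lock-stepped pairs (pair and spare)."""
    pairs: List[LockstepPair]

    def step(self, operand: int) -> int:
        # Every online pair runs every step, so losing one pair
        # costs no fail-over time: the other already has the answer.
        results = [pair.step(operand) for pair in self.pairs]
        good = [r for r in results if r is not None]
        if not good:
            raise RuntimeError("no healthy pair left")
        return good[0]

    def hot_swap(self, name: str, replacement: LockstepPair) -> None:
        # Replace a board by name while the other pair keeps running.
        self.pairs = [replacement if p.name == name else p for p in self.pairs]


def healthy_cpu(x: int) -> int:
    return x * 2


def flaky_cpu(x: int) -> int:
    # Assumed fault for the demo: wrong answer only when x == 3.
    return x * 2 + (1 if x == 3 else 0)


processor = LogicalProcessor([
    LockstepPair("board0", healthy_cpu, flaky_cpu),
    LockstepPair("board1", healthy_cpu, healthy_cpu),
])
outputs = [processor.step(x) for x in range(5)]
print(outputs)                                          # [0, 2, 4, 6, 8]
print([p.name for p in processor.pairs if p.online])    # ['board1']

processor.hot_swap("board0", LockstepPair("board0", healthy_cpu, healthy_cpu))
print([p.name for p in processor.pairs if p.online])    # ['board0', 'board1']
```

The caller never sees the fault at `x == 3`: the output stream stays correct, `board0` simply drops out, and swapping in a fresh board restores the redundancy.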

These systems are typically used where downtime is absolutely unacceptable: banking, credit card processing, and similar operations.  The exact server in this case is a Stratus XA/R 10.  This was Stratus’s gap filler.  Since the company’s creation in the early 1980s, its servers had been based on Motorola 68k processors, but in the late 1980s it decided to move to a RISC architecture and chose HP’s PA-RISC.  There was a small problem with this: it wasn’t ready, so Stratus developed the XA line to fill the gap of several years.  The first XA/R systems became available in early 1991 and cost from $145,000 to over $1 million.

Intel A80860XR-33 – 33MHz as used in the XA/R systems. Could be upgraded to an XP.

The XA is based on another RISC processor, the Intel i860XR/XP.  Initial systems were based on i860XR processors that Stratus rated at 32MHz.  The 860XR has 4K of I-cache and 8K of D-cache and typically ran at 33MHz.  Stratus’s speed rating may reflect the effective speed after the CPU check logic is applied, or Stratus may have downclocked the part slightly for reliability.  Later XA/R systems were based on the second-generation i860XP.  The 860XP ran at 48MHz, had larger caches (16K/16K), and had some other enhancements as well.  These servers continued to be made until the Continuum product line (using Hewlett-Packard’s PA-RISC architecture) was released in March of 1995.

This type of hardware redundancy is largely a thing of the past, at least for commercial systems.  Cloud server farms made of hundreds, thousands, and often more computers that are transparent to the user achieve much the same goal, provided one’s connection to the cloud is also redundant.  Mainframes and supercomputers are designed for fault tolerance, but most of it is now handled in software rather than pure hardware.

Posted in:
Museum News

19 Responses to Stratus: Servers that won’t quit – The 24 year running computer.

  1. Stratus: Servers that won’t quit – The 24 year running computer | ExtendTree

    […] Read Full Story […]

  2. James

    Stratus the company still exists but in a significantly changed form. I’m sitting across the street from their new main office at the moment, in Maynard, MA.

  3. Zurga

    “CPU check logic checks the results from each, and if their is a discrepancy”
    Oh come on!

  4. admin

    Thanks! I don’t proofread them as well as I should.

  5. Jeff C.

    “lets face it, if radiation tolerance becomes an issue in Michigan, there are things of greater importance then the server crashing”

    Said like someone who has zero clue about radiation or its effects on electronics.

  6. admin

    Actually no, the statement was made to add a bit of humour to the post. It wasn’t the best place to discuss the differences between radiation hardening and radiation tolerance and the pros/cons of each; however, that would be a good post.

  7. Kevin

    Looks like there is an error on this page: http://www.cpushack.com/chippics/

    (Awesome site, by the way!)

  8. admin

    Oh interesting, that’s from the old gallery. I am guessing that it has failed since the PHP version got changed recently.
    Have to see if I can fix that.

  9. admin

    Yup, that was the problem, fixed.
    Thanks!

  10. Chris

    Hi Jeff,

    “… if radiation tolerance becomes an issue in Michigan, there are things of greater importance then the server crashing and…”

    Not to climb on the bandwagon, but it should be “…THAN the server crashing…”

    Hope this helps!

    Cheers!

  11. admin

    Not a problem, readers help make the articles better, and I am usually terrible with my then/than (I feel bad for people who have to learn English as a second language lol)

  12. me

    “Nearly everything on a Straus is hot-swappable. Straus servers”

    Straus?

  13. Mike de Boer

    s/Straus/Stratus/

  14. Stratus: Servers that won’t quit – The 24 year running computer. | thechrisshort

    […] Stratus: Servers that won’t quit – The 24 year running computer. from Tumblr http://chrisshort.tumblr.com/post/156581610650 via IFTTT […]

  15. TheRaido

    It could be like Theseus’ ship 😉 Does anyone know how much of this server has been hot-swapped/replaced during scheduled downtime?

  16. Steven ulbricht

    Little-known fact… the very first generation of Bloomberg trading systems, built in collaboration with Merrill Lynch, relied on a Stratus-based communications front end… one of the very first Stratus applications rolled out into production.

  17. admin

    That would make sense, certainly an application where downtime would not be acceptable.

  18. Rafael

    Great article and great site! Congratulations!

  19. Shiunbird

    OMG, your page is going to ruin my work day, so much stuff to read!
