HP Convex SPP2000 S-Class, X-Class
|Caches||1/1 MB L1|
|Bandwidth||CPU 7.5 GB/s, Mem 15 GB/s, I/O 1.9 GB/s, XBAR 15.3 GB/s, SCI 3.8 GB/s|
The HP 9000 S-Class and Convex Exemplar SPP2000 are large scalable PA-RISC computing servers and the direct predecessors of the later HP V-Class (V2200, V2500 et al).
Originally developed by Convex, the SPP2000 and later S-Class are based on a crossbar architecture: a central internal component connects the resources to each other by forming matrix connections between the devices’ input and output ports.
A single SPP2000 computer, a single Node called S-Class, can hold up to sixteen 64-bit PA-8000 processors with 16 GB of memory. The SPP2000 can form a large-scale system by connecting single Nodes with SCI links into a cluster of up to 32 nodes/512 processors. The resulting interconnected systems are called X-Class and are ccNUMA computers. The clustering capabilities of their successors, the V2500, were reduced significantly: in contrast to the 32-node maximum of SPP2000 clusters, V2500s can only be clustered into groups of four.
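The scaling limits quoted above follow from multiplying the per-node maxima by the node count. A minimal sketch of that arithmetic, assuming every node is fully populated (the constant and function names are illustrative and do not come from any HP/Convex tool):

```python
# Per-node limits quoted in the text; names are illustrative only.
CPUS_PER_NODE = 16      # PA-8000 processors in a fully populated node
MEM_GB_PER_NODE = 16    # maximum memory per node, in GB
MAX_NODES = 32          # largest SPP2000/X-Class cluster


def cluster_totals(nodes):
    """Return (processors, memory in GB) for `nodes` fully populated nodes."""
    if not 1 <= nodes <= MAX_NODES:
        raise ValueError(f"an SPP2000 cluster has 1 to {MAX_NODES} nodes")
    return nodes * CPUS_PER_NODE, nodes * MEM_GB_PER_NODE


if __name__ == "__main__":
    for n in (1, 4, 32):
        cpus, mem = cluster_totals(n)
        print(f"{n:2d} node(s): {cpus:3d} processors, {mem:3d} GB")
```

For 32 nodes this gives the 512 processors and 512 GB that the text states as the X-Class maximum.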
Like the other Exemplar systems, the SPP2000/S-Class are operated and controlled via teststations: Unix workstations that connect to a central management board in each node, which provides booting, system monitoring and diagnostics, and console connections. These teststations were either IBM RS/6000 AIX systems or, later and more commonly, HP 9000 workstations running HP-UX.
- SPP2000 Node/S-Class: 4-16 PA-8000 180 MHz with 1/1 MB off-chip I/D L1 cache each
- SPP2000 Cluster/Wall/X-Class: 32-512 PA-8000 180 MHz with 1/1 MB off-chip I/D L1 cache each
The SPP2000 is based on the Exemplar crossbar architecture which connects the CPU and I/O to the system main memory.
- An 8x8 nonblocking crossbar is the central part of the system; it connects the memory to the processor buses and I/O channels. There are eight ports for agents (each agent connects to two CPUs and one I/O channel) and eight ports for memory. Each crossbar port has a path width of 64 bits, giving it 960 MB/s peak bandwidth. The combined peak bandwidth of the crossbar is 15.3 GB/s. The crossbar in the original SPP1x00 Exemplar design was built with GaAs chips, the one in the SPP2000 in standard CMOS with 1.1M transistors.
- Eight Data Mover/Agents attach to the crossbar and give the processors (on Runway buses) and the I/O controllers access to the memory. Each Agent reaches the crossbar over a 1.9 GB/s datapath made up of four 32-bit unidirectional buses, which connect two ports on the Agent to two crossbar ports. The I/O channels on the Agents have a maximum bandwidth of 240 MB/s each. Each Agent has two Runway processor buses with an aggregate bandwidth of 960 MB/s.
- Eight PCI controllers connect the 240 MB/s I/O channels/PCI buses to the Agents.
- Eight Memory controllers each attach one four-way interleaved memory board to the Hyperplane crossbar. Each Memory controller has a bandwidth of 1.9 GB/s. The memory controllers probably also interface with the CTI interconnect.
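To illustrate what "nonblocking" means for the 8x8 crossbar described above, here is a toy model of a single transfer cycle. It is only a sketch: the arbitration rule (the lowest-numbered agent wins a contested memory port) is an assumption made for the example and is not documented for the real hardware, and all names are hypothetical.

```python
AGENT_PORTS = 8     # crossbar ports on the Agent side
MEMORY_PORTS = 8    # crossbar ports on the memory side
PORT_MB_S = 960     # peak bandwidth per crossbar port, from the text


def grant(requests):
    """Decide which agent-to-memory connections are made in one cycle.

    `requests` maps an agent port number to the memory port it wants.
    In a nonblocking crossbar, requests to distinct memory ports never
    interfere with each other; only requests that collide on the same
    memory port have to wait. The tie-break used here (lowest agent
    number wins) is an assumption for illustration.
    """
    granted = {}
    busy = set()
    for agent in sorted(requests):
        mem = requests[agent]
        if not (0 <= agent < AGENT_PORTS and 0 <= mem < MEMORY_PORTS):
            raise ValueError(f"invalid port pair {agent} -> {mem}")
        if mem not in busy:
            busy.add(mem)
            granted[agent] = mem
    return granted


def aggregate_peak_mb_s():
    """Peak bandwidth summed over all sixteen crossbar ports, in MB/s."""
    return (AGENT_PORTS + MEMORY_PORTS) * PORT_MB_S


if __name__ == "__main__":
    # Agents 0 and 1 collide on memory port 3; agent 2 is unaffected.
    print(grant({0: 3, 1: 3, 2: 5}))
    # Any permutation of agents onto memory ports is granted in full.
    print(len(grant({a: (a + 1) % MEMORY_PORTS for a in range(AGENT_PORTS)})))
    print(aggregate_peak_mb_s(), "MB/s")
```

Summing the sixteen 960 MB/s ports gives 15360 MB/s, which corresponds to the 15.3 GB/s quoted for the crossbar.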
- Total crossbar bandwidth 15.3 GB/s (intra-crossbar)
- CPU bandwidth 7.5 GB/s (CPU-to-Agent, eight Runway 960 MB/s buses)
- Memory bandwidth 15 GB/s (memory-to-crossbar, sixteen 960 MB/s links)
- I/O bandwidth 1.9 GB/s (eight 240 MB/s channels, I/O channel-to-Agent)
- Eight PCI-32 I/O buses for expansion slots (each 240 MB/s)
- Attachments to SCI rings/CTI (Coherent Toroidal Interconnect) via two rings (X-ring and Y-ring), with a Node-to-Node bandwidth of 3.84 GB/s; the rings operate at a clock of 120 MHz with a width of 32 bits
- SCSI-2 Ultra main storage I/O bus
- SDRAM DIMMs
- Two to eight memory boards per node
- Memory is up to four-way interleaved per memory board and up to 32-way interleaved per node
- SPP2000 Node/S-Class: 1 GB minimum, 16 GB maximum
- SPP2000 Wall/X-Class: 512 GB maximum (with 32 nodes)
- 24 PCI 32-bit slots on eight PCI 32-bit channels
- 20 internal Ultra SCSI drives
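The aggregate bandwidth figures listed above can be reproduced from the per-link numbers. The sketch below does that arithmetic; the variable names are illustrative. The last step, dividing the quoted 3.84 GB/s CTI figure by the rate of one 32-bit, 120 MHz link, is an inference about how that number might be composed and is not stated in the source.

```python
# Per-link figures quoted in the text; names are illustrative only.
AGENTS = 8                 # each Agent has 960 MB/s aggregate Runway bandwidth
RUNWAY_MB_S = 960
MEMORY_LINKS = 16          # memory-to-crossbar links
MEMORY_LINK_MB_S = 960
IO_CHANNELS = 8
IO_CHANNEL_MB_S = 240
RING_WIDTH_BITS = 32
RING_CLOCK_MHZ = 120
QUOTED_CTI_MB_S = 3840     # node-to-node figure from the text


def link_mb_s(width_bits, clock_mhz):
    """Peak rate in MB/s of a link that moves width_bits per clock."""
    return width_bits // 8 * clock_mhz


totals_mb_s = {
    "CPU": AGENTS * RUNWAY_MB_S,
    "Memory": MEMORY_LINKS * MEMORY_LINK_MB_S,
    "I/O": IO_CHANNELS * IO_CHANNEL_MB_S,
}
ring_link_mb_s = link_mb_s(RING_WIDTH_BITS, RING_CLOCK_MHZ)
# Assumption: the quoted CTI figure is a sum of identical ring links.
cti_links = QUOTED_CTI_MB_S / ring_link_mb_s

if __name__ == "__main__":
    for name, mb in totals_mb_s.items():
        print(f"{name:6s} {mb:5d} MB/s = {mb / 1000:.2f} GB/s (decimal) "
              f"or {mb / 1024:.2f} GB/s (binary)")
    print(f"one ring link: {ring_link_mb_s} MB/s; "
          f"quoted CTI figure equals {cti_links:g} such links")
```

Under these assumptions the quoted 7.5 GB/s (CPU) and 15 GB/s (memory) match the binary conversion, while the crossbar's 15.3 GB/s, which is the same 16 x 960 MB/s sum, matches the decimal one, so the published figures appear to mix the two conventions. The CTI figure works out to eight 480 MB/s links, which would be consistent with one link per memory controller, but that remains a guess.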
Multiple Exemplar SPP2000/HP S-Class systems can be connected together to form a single large system:
- Up to 32 single nodes can be clustered together to form a system with up to 512 processors, 512 GB of RAM, 768 PCI slots and 640 SCSI drives.
- Clustered SPP2000s/X-Class are ccNUMA computers; they are not fully conformant to the PA-RISC 2.0 specification (and thus do not run standard HP-UX).
- Multiple systems are connected via two CTI rings; these links attach to the eight memory controllers of a node. A single system attaches to other single nodes and their respective crossbars with a node-to-node data rate of 3.8 GB/s.
- The two rings are called X-ring and Y-ring.
- The links are Convex's implementation of the IEEE SCI (Scalable Coherent Interface), the Coherent Toroidal Interconnect (CTI).
- Each node’s main memory is globally accessible from other nodes on the CTI network (that is, local memory is globally shared).
- A part of each system’s main memory is reserved for cache memory for the CTI network (configured statically at boot time).
- 68-pin VHDCI Ultra LVD external SCSI
- Three Serial RS232C DB9 (local console, remote console, general purpose) via a
- 10/100 Mbit Ethernet TP/RJ45
- 10/100 Mbit Ethernet TP/RJ45 LAN console
- Convex SPP-UX, a heavily modified Mach-based operating system, which looks similar to HP-UX but is a completely different design. The later HP V-Class are able to run stock HP-UX (which was modified specially for the V-Class architecture).
- Exemplar System Architecture Hewlett-Packard/Convex (January 1997, archive.org mirror, accessed August 2008)
- SPP 2000 Architecture presentation (FTP, Postscript) Beth Richardson (N.d.: NCSA. Google archive accessed August 2008)
- A Comparative Evaluation of Hierarchical Network Architecture of the HP-Convex Exemplar (Postscript) Robert Castaneda, et al. (1997: in Proceedings of IEEE International Conference on Computer Design (ICCD’97)) [there is a mirrored PDF version from citeseer (accessed August 2008)]