HP 9000 V2500 and V2600
|CPU||2-32 PA-8500 (V2500), 2-32 PA-8600 (V2600)|
|Caches||1.5 MB L1|
|Bandwidth||CPU 7.5 GB/s, Mem 15 GB/s, I/O 1.9 GB/s, XBAR 15.3 GB/s, SCI 3.8 GB/s|
SCI (CTI/SCA) links
The HP 9000 V2500 and V2600 are second-generation scalable PA-RISC V-Class servers based on the Convex Exemplar architecture, with up to 32 64-bit PA-RISC processors in a single cabinet. Like their Convex SPP2000 predecessors, and unlike their V2200/V2250 cousins, up to four systems can be interconnected via CTI links. The resulting combined system can have up to 128 CPUs and appears to the operating system as a single computer. Architecturally, the interconnected V2500s/V2600s are ccNUMA computers.
The V-Class servers are based on a crossbar architecture: one central internal
component links the various computing resources to each other by forming matrix connections.
The V2500 and V2600 use HP’s own HyperPlane crossbar chipset, consisting of four central
crossbar ASICs and various other chipset components to attach memory, processors and I/O.
The architecture is a direct continuation of the Convex Exemplar: the
HP/Convex SPP1x00 and SPP2000
S-Class and X-Class use a similar crossbar system design based on GaAs chips, which
was upgraded for the V-Class with faster processors and memory.
The multi-node V2500/V2600 system architecture, SCA, does not conform fully to the
PA-RISC 2.0 reference architecture: the firmware layer emulates a reference-compliant
PA-RISC system for the operating system.
However, several changes had to be made to the HP-UX kernel to accommodate the
V-Class’s special architecture, also called
The V2500s and V2600s are controlled via a
teststation, also called SSP (Service Support
Processor): an HP 9000/712 or B180L workstation that runs its own operating system
and controls and monitors the V-Class server.
Earlier Convex systems apparently used IBM RS/6000 workstations
running AIX to control the Exemplar systems.
The SSP/teststation connects via one LAN and one special serial link to the Core Utilities Board (CUB),
which provides booting, system monitoring, diagnostics and console connections.
- V2500: 2-32 PA-8500 440 MHz with 512/1024 KB on-chip I/D L1 cache each
- V2600: 2-32 PA-8600 552 MHz with 512/1024 KB on-chip I/D L1 cache each
The V-Class V2500 and V2600 are based on the HP HyperPlane crossbar which connects the CPU and I/O to the system main memory.
- HyperPlane crossbar, 8x8, non-blocking: consists of four Routing Attachment Controllers (RACs)
and is the central part of the system, connecting the memory to the processor buses and I/O channels.
There are eight ports for agents (CPUs and I/O), where each agent connects to two or four CPUs and one I/O channel, and eight ports for memory. Each crossbar port has a path width of 64 bits, giving it 960 MB/s peak bandwidth. The peak bandwidth of the HyperPlane crossbar/RACs is 15.3 GB/s combined.
- Eight Processor Agent Controllers (PACs), also called SPACs, attach to the crossbar and give the processor Runway buses and I/O controllers access to the memory via the crossbar over a 1.9 GB/s datapath. Four 32-bit unidirectional buses from two ports on the PAC connect to two HyperPlane crossbar RACs; each PAC thus communicates with only two of the system’s four RACs. The I/O channel on each agent has a maximum bandwidth of 240 MB/s. Each PAC has two Runway processor buses with an aggregate peak bandwidth of 960 MB/s.
- Eight PCI-bus Interface Controllers (SAGAs) connect the 240 MB/s I/O channels/PCI buses to the PACs.
- Eight Memory Access Controllers (MACs), also called SMACs, each attach one 32-way interleaved memory board to the HyperPlane crossbar. Each MAC has a bandwidth of 1.9 GB/s: four 32-bit unidirectional buses from two ports on the MAC connect to two HyperPlane crossbar RACs.
- The Core Utilities Board (CUB) provides interrupts and the central system logic; it connects to the Midplane Interconnect Board (MIB). The Core Logic Bus from the CUB attaches to the devices on the PACs.
- Eight Toroidal Access Controllers (STACs) connect to one or two rings of a variation of the Scalable Coherent Interconnect (SCI). The combination of STACs and SCI rings is referred to as Coherent Toroidal Interconnect (CTI).
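The port and link figures in the list above are mutually consistent if every path runs at the same transfer rate. The following Python sketch works backwards from the stated 64-bit, 960 MB/s crossbar port to that implied rate and then recomputes the PAC/MAC datapath and the crossbar total. The 120 MHz value is derived here, not stated in the text, and the function names are hypothetical helpers for illustration only.

```python
def implied_rate_mhz(bandwidth_mb_s: float, width_bits: int) -> float:
    """Transfer rate (million transfers/s) implied by a bandwidth and a path width."""
    return bandwidth_mb_s / (width_bits / 8)


def bus_bandwidth_mb_s(width_bits: int, rate_mhz: float, buses: int = 1) -> float:
    """Peak bandwidth in MB/s of `buses` identical paths."""
    return buses * (width_bits / 8) * rate_mhz


# From the text: one crossbar port is 64 bits wide and peaks at 960 MB/s.
rate = implied_rate_mhz(960, 64)

# From the text: each PAC and MAC uses four 32-bit unidirectional buses.
# Assumption: these buses run at the same rate as the crossbar ports.
pac_link = bus_bandwidth_mb_s(32, rate, buses=4)

# From the text: eight agent ports plus eight memory ports, 16 ports in total.
crossbar_total = bus_bandwidth_mb_s(64, rate, buses=16)

print(f"implied rate:   {rate:.0f} MHz")
print(f"PAC/MAC link:   {pac_link:.0f} MB/s")
print(f"crossbar total: {crossbar_total:.0f} MB/s")
```

Under that assumption the four 32-bit buses give 1920 MB/s, which matches the quoted 1.9 GB/s datapath, and sixteen ports give 15360 MB/s, which matches the quoted 15.3 GB/s crossbar total.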
The remainder of the system I/O consists of standard HP PCI controllers, frequently shipped in the default configuration with one of the following:
- PCI Fast-wide SCSI controller high-voltage differential/HVD
- PCI Ultra2-wide SCSI controller low-voltage differential/LVD
- PCI fibrechannel (FC) controller
- Total crossbar bandwidth 15.3 GB/s, intra-crossbar
- CPU bandwidth 7.5 GB/s, CPU-to-PAC, eight Runway 960 MB/s buses
- Memory bandwidth 15 GB/s, memory-to-crossbar, sixteen 960 MB/s links
- I/O bandwidth 1.9 GB/s, eight 240 MB/s channels, I/O channel-to-PAC
- PAC bandwidth (PAC-to-crossbar) is theoretically also 15 GB/s, with sixteen 960 MB/s links for the eight PACs
- Eight PCI-64/33 I/O buses for expansion slots, each 240 MB/s
- Attachments to CTI/Scalable Computing Architecture SCA crossbar interconnection, 3.8 GB/s
- SCSI/storage buses depend on the installed SCSI adapter, most likely either Fast-wide or Ultra2-wide
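The aggregate figures in this list follow from multiplying link counts by per-link bandwidth. The short Python sketch below redoes that arithmetic and prints each total with both a decimal (1000 MB) and a binary (1024 MB) gigabyte. Which convention the original figures used is an assumption, not something the text states; the helper name is hypothetical.

```python
MB_PER_GB_DECIMAL = 1000
MB_PER_GB_BINARY = 1024  # assumption: some quoted totals may use 1 GB = 1024 MB


def aggregate_mb(links: int, mb_per_link: int) -> int:
    """Total peak bandwidth in MB/s for a number of identical links."""
    return links * mb_per_link


cpu_mb = aggregate_mb(8, 960)   # eight Runway buses at 960 MB/s
mem_mb = aggregate_mb(16, 960)  # sixteen memory-to-crossbar links at 960 MB/s
io_mb = aggregate_mb(8, 240)    # eight I/O channels at 240 MB/s

for name, mb in [("CPU", cpu_mb), ("Memory", mem_mb), ("I/O", io_mb)]:
    print(
        f"{name}: {mb} MB/s = "
        f"{mb / MB_PER_GB_DECIMAL:.2f} GB/s (decimal), "
        f"{mb / MB_PER_GB_BINARY:.2f} GB/s (binary)"
    )
```

The CPU and memory totals come out at exactly 7.5 and 15 when divided by 1024, which suggests those two figures were rounded with a binary gigabyte, while the 15.3 GB/s crossbar figure looks like the same 15360 MB/s expressed with a decimal one.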
- SDRAM DIMMs, 88-bit or 80-bit
- Two to eight memory boards
- Each memory board has 16 slots: four 4-slot
- Memory is up to 256-way interleaved
- 1 GB minimum
- 32 GB maximum
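The 256-way figure is the product of eight memory boards and 32-way interleaving per board. As a rough illustration of what interleaving means, the Python sketch below spreads consecutive memory lines across boards and banks. The line size and the board-first rotation order are assumptions chosen for illustration; the text does not describe the actual address mapping used by the hardware.

```python
BOARDS = 8             # from the text: two to eight memory boards
BANKS_PER_BOARD = 32   # from the text: each board is 32-way interleaved
LINE_BYTES = 64        # assumption for illustration only


def interleave_ways(boards: int = BOARDS) -> int:
    """Total interleave factor for a given number of memory boards."""
    return boards * BANKS_PER_BOARD


def locate(address: int, boards: int = BOARDS) -> tuple[int, int]:
    """Map a byte address to (board, bank) under a hypothetical mapping.

    Consecutive lines rotate across boards first, then across banks.
    """
    line = address // LINE_BYTES
    way = line % interleave_ways(boards)
    return way % boards, way // boards


print(interleave_ways())           # fully populated system
print(interleave_ways(boards=2))   # minimum of two boards
print(locate(0), locate(LINE_BYTES), locate(LINE_BYTES * BOARDS))
```

With this kind of mapping, sequential accesses land on different boards and banks, so a system with fewer boards installed has a proportionally smaller interleave factor.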
- 28 PCI 64-bit 33 MHz slots on eight PCI 64-bit channels
- 16 internal SCSI drives, exact type depending on installed SCSI adapter
Multiple V-Classes can be connected together to form a single large system, resulting in
an SCA (Scalable Computing Architecture) system.
Up to four V2500/V2600s can be clustered together to form a system with up to
128 GB of RAM,
112 PCI slots and
64 SCSI drives.
Clustered V-Classes are ccNUMA computers and do not conform fully to the PA-RISC 2.0 specification.
Multiple systems are connected via two CTI rings; these links attach via the STACs to the eight memory controllers. The two rings are called X-ring and Y-ring. Each system attaches to one or two other V2500/V2600 cabinets and their respective crossbars with a node-to-node data rate of 3.8 GB/s. The links are an implementation of the IEEE SCI standard taken over from Convex, called Coherent Toroidal Interconnect or Convex Toroidal Interconnect. Each node’s main memory is globally accessible from other nodes on the CTI network, that is, local memory is globally shared. Between 32 and 512 MB of each system’s main memory is reserved as cache memory for the CTI network, configured statically at boot time.
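The cluster maxima quoted above are the single-cabinet maxima multiplied by the node count. The Python sketch below reproduces them and also estimates how much memory remains after the CTI cache reservation. The `Cabinet` class and both function names are hypothetical, and treating the reserved cache as simply subtracted from each node's memory is an assumption for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cabinet:
    """Maximum configuration of one V2500/V2600 cabinet, as listed in the text."""
    cpus: int = 32
    ram_gb: int = 32
    pci_slots: int = 28
    scsi_drives: int = 16


def cluster_totals(nodes: int, cabinet: Cabinet = Cabinet()) -> dict:
    """Maximum resources of an SCA system built from `nodes` cabinets."""
    if not 1 <= nodes <= 4:
        raise ValueError("the text describes systems of one to four cabinets")
    return {
        "cpus": nodes * cabinet.cpus,
        "ram_gb": nodes * cabinet.ram_gb,
        "pci_slots": nodes * cabinet.pci_slots,
        "scsi_drives": nodes * cabinet.scsi_drives,
    }


def ram_after_cti_cache_mb(nodes: int, ram_gb_per_node: int, cti_cache_mb: int) -> int:
    """Memory left in MB once each node reserves its CTI cache (assumed model)."""
    if not 32 <= cti_cache_mb <= 512:
        raise ValueError("the text gives a reservation of 32 to 512 MB per system")
    return nodes * (ram_gb_per_node * 1024 - cti_cache_mb)


print(cluster_totals(4))
print(ram_after_cti_cache_mb(nodes=4, ram_gb_per_node=32, cti_cache_mb=512))
```

Even at the largest reservation, the CTI cache takes only a small share of a fully populated four-node system's memory.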
- SCSI depends on installed adapter, either Ultra or Fast wide
- Serial and Ethernet connections of the teststation/SSP
- Operator’s Guide HP 9000 V2500 Server (PDF) Hewlett-Packard Company (December 1998, first edition, A5075-90005)
- Installation Guide HP 9000 V2500 Server (PDF) Hewlett-Packard Company (December 1998, first edition, A5075-90001)
- Diagnostics Guide HP V2500/V2600 Servers (PDF) Hewlett-Packard Company (December 1999, first edition, A5824-96002)
- Upgrade Guide HP V2500/V2600 Servers (PDF) Hewlett-Packard Company (December 1999, first edition, A5824-96004)
- Architecture Reference Guide V2500 Server (PDF) Hewlett-Packard Company (June 1999, first edition, A5074-90004)
- HP Scalable Computing Architecture Randy Wright and Arun Kumar (October 2000/revised January 2002: USENIX, Proceedings of the First WIESS Workshop)