PA-RISC Processors
PA-8000 (PCX-U) processor
Overview
The PA-8000 was the first 64-bit PA-RISC 2.0 processor and the first PA-RISC design with out-of-order execution. It was released in 1996, in parallel with the 32-bit low-cost PA-7300LC processor. The PA-8000 has four integer, four floating-point and two load/store units, a large out-of-order (OoO) dispatch window and no on-chip caches. As the first implementation of the 64-bit PA-RISC 2.0 architecture, it provides 64-bit wide integer registers and functional units such as the ALUs, and a flat 64-bit virtual address space. Other extensions in the PA-8000 include fast TLB insert instructions, memory prefetch instructions, support for variable-sized pages, branch prediction hinting and new floating-point multiply/accumulate units (FPMAC).
A key design feature of the PA-8000 and all subsequent PA-RISC 2.0 processors is the Instruction Reorder Buffer (IRB), which lets the processor schedule instructions in hardware, independent of compiler or software techniques. The IRB is the core of the PA-8000's out-of-order capability: it holds up to 28 computation and 28 load/store instructions, tracks the interdependencies between them and allows each instruction to execute as soon as it is ready.
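The scheduling idea behind the IRB can be illustrated with a toy model. The sketch below is not a description of the real hardware: it assumes unit latency for every instruction, tracks only true (read-after-write) register dependencies, and issues at most two instructions per queue per cycle. The `Instr` record, the `simulate` function and the `ISSUE_PER_QUEUE` limit are hypothetical names and values chosen for illustration; only the two 28-entry queues and the 4-way width come from the text.

```python
from collections import namedtuple

# Hypothetical instruction record; kind is "alu" (computation) or "mem" (load/store).
Instr = namedtuple("Instr", "name kind dest srcs")

IRB_CAPACITY = {"alu": 28, "mem": 28}  # queue sizes taken from the text
FETCH_WIDTH = 4                        # 4-way superscalar, from the text
ISSUE_PER_QUEUE = 2                    # assumption made for this sketch


def simulate(program):
    """Return one list of issued instruction names per cycle."""
    # For each instruction, record the older instructions that produce its sources.
    last_writer, producers = {}, []
    for i, ins in enumerate(program):
        producers.append({last_writer[s] for s in ins.srcs if s in last_writer})
        if ins.dest is not None:
            last_writer[ins.dest] = i

    queues = {"alu": [], "mem": []}    # indices of instructions waiting in each queue
    done = set()                       # indices whose results are available
    next_fetch, schedule = 0, []

    while len(done) < len(program):
        # Insert instructions in program order; stall when the target queue is full.
        fetched = 0
        while next_fetch < len(program) and fetched < FETCH_WIDTH:
            kind = program[next_fetch].kind
            if len(queues[kind]) >= IRB_CAPACITY[kind]:
                break
            queues[kind].append(next_fetch)
            next_fetch += 1
            fetched += 1

        # Issue the oldest ready instructions from each queue, regardless of
        # program order between unrelated instructions.
        issued = []
        for queue in queues.values():
            ready = [i for i in queue if producers[i] <= done][:ISSUE_PER_QUEUE]
            for i in ready:
                queue.remove(i)
            issued.extend(ready)

        done.update(issued)            # results become visible in the next cycle
        schedule.append([program[i].name for i in sorted(issued)])
    return schedule


example = [
    Instr("ld r1", "mem", "r1", ()),
    Instr("add r2", "alu", "r2", ("r1",)),  # waits for the load
    Instr("add r3", "alu", "r3", ("r4",)),  # independent, may issue early
    Instr("st r2", "mem", None, ("r2",)),   # waits for the first add
]
print(simulate(example))
```

In this example the independent `add r3` issues in the first cycle alongside the load, ahead of the older `add r2` that is still waiting for its operand. Reordering of this kind is what the IRB performs in hardware.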
All later PA-8x00 processors up to the PA-8900 use PA-8000 cores with only minor modifications and extensions, paired in later models with much larger caches.
Details
- PA-RISC version 2.0, 64-bit architecture, multi-processor capable, 4-way superscalar
- Ten functional units: 2 integer ALUs, 2 shift/merge units, 2 complete load/store pipelines, 2 Floating Point multiply/accumulate units, 2 Floating Point divide/square root units
- IRB: 56-entry instruction queue/reorder buffer
- TLB: 96-entry fully-associative dual-ported
- BTAC: 32-entry Branch Target Address Cache; BHT: 256-entry Branch History Table
- Cache: 1 MB instruction and 1 MB data L1, off-chip, in synchronous 150 MHz 1 Mb SRAMs, one-cycle latency
- Caches are direct-mapped and dual-ported
- Memory and I/O controllers are external
- Bi-endian support
- MAX-2 multimedia extensions: subword arithmetic for multimedia applications
- Runway system bus, 120 MHz, 64-bit, about 960 MB/s peak bandwidth
- Up to 180 MHz clock speed with 3.3 V core voltage
- 17.7 × 19.6 mm die, 4,500,000 FETs, 0.5 µm, 5-layer metal CMOS, packaged in a 1,085-pin flip-chip LGA package
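Two of the figures in the list above can be checked with simple arithmetic: the Runway peak bandwidth follows from the bus clock and width, and the die area follows from the die dimensions. The short sketch below assumes one transfer per bus clock and decimal megabytes (1 MB = 10^6 bytes). The function names are illustrative only.

```python
def peak_bandwidth_mb_s(clock_mhz, width_bits):
    """Peak bus bandwidth in MB/s, assuming one transfer per clock cycle."""
    bytes_per_transfer = width_bits / 8
    return clock_mhz * bytes_per_transfer  # MHz x bytes = 10^6 bytes per second


def die_area_mm2(width_mm, height_mm):
    """Die area in square millimetres for a rectangular die."""
    return width_mm * height_mm


# Runway bus: 120 MHz, 64 bits wide
print(peak_bandwidth_mb_s(120, 64))          # 960.0
# PA-8000 die: 17.7 mm by 19.6 mm
print(round(die_area_mm2(17.7, 19.6), 1))    # 346.9
```

The first result matches the "about 960 MB/s" quoted for the Runway bus. The die works out to roughly 347 mm².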
Used in
- HP 9000 C160, C180 workstations
- HP 9000 D270, D280, D370, D380 servers
- HP 9000 J280, J282 workstations
- HP 9000 K250, K260, K450, K460 servers
- HP 9000 R380 servers
- HP 9000 T600 mainframes
- HP/Convex SPP2000 (S-Class/X-Class) mainframes
- NEC TX7/D280, TX7/K370, TX7/P590 servers
- Stratus Continuum 628, 1228 mainframes
References
- Advanced Performance features of the 64-bit PA-8000, Doug Hunt (1995: IEEE CompCon 5)
- PA-8000 Combines Complexity and Speed, Linley Gwennap (1994: Microprocessor Report, Volume 8 Number 15)
- Four-Way Superscalar PA-RISC Processors, Anne P. Scott et al. (August 1997: Hewlett-Packard Journal)
- The HP PA-8000 RISC CPU: A High Performance Out-of-Order Processor, Ashok Kumar (August 1996: IEEE Hot Chips VIII)