The GRFPU is an IEEE-754 compliant floating-point unit, supporting both single and double precision operands. The advanced design combines high throughput with low latency, providing up to 250 MFLOPS on a 0.13 um ASIC process. The host interface is clean and versatile, simplifying the interfacing to processor pipelines and DSPs. The accuracy and convergence of the FPU algorithms have been proven mathematically, and the implementation has been validated with more than 20 million test vectors.
- IEEE-754 compliant, supporting all rounding modes and exceptions
- Operations: add, subtract, multiply, divide, square-root, convert, compare, move, abs, negate
- Data formats: single and double precision (32- and 64-bit floats)
- Fully pipelined, 3 clock cycles latency for all operations except divide and square-root
- Non-blocking parallel execution of divide and square-root operations
- Clean and versatile interface
- LEON FP Control unit available
- Supports all SPARC V8 floating-point instructions
- 250 MHz (250 MFLOPS) on a typical 0.13 um standard cell process using less than 100 kgates
- 65 MHz (65 MFLOPS) on a Virtex-II FPGA using approximately 8,500 LUTs
- Fault-tolerant (FT) version available
Functional Description
The GRFPU performs operations on single and double precision floating-point operands. All operations are IEEE-754 compliant, with the exception of denormalized numbers, which are flushed to zero. The four rounding modes specified by the standard and the detection of exception conditions are fully supported.
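
The flush-to-zero behaviour described above can be illustrated on single precision bit patterns. The following is a minimal sketch, not the GRFPU implementation: the helper names are hypothetical, and keeping the sign bit of the flushed value is an assumption, since the text does not say whether a signed zero is produced or whether flushing applies to operands, results, or both.

```python
import struct

SIGN_MASK = 0x80000000
FRACTION_MASK = 0x007FFFFF


def float_to_bits(value: float) -> int:
    """Return the IEEE-754 single precision bit pattern of a Python float."""
    return struct.unpack(">I", struct.pack(">f", value))[0]


def flush_denormal_single(bits: int) -> int:
    """Replace a denormalized single precision value with zero.

    A denormal has an all-zero exponent field and a non-zero fraction.
    Assumption: the sign bit is kept, giving +0 or -0.
    """
    exponent = (bits >> 23) & 0xFF
    fraction = bits & FRACTION_MASK
    if exponent == 0 and fraction != 0:
        return bits & SIGN_MASK
    return bits


# 1e-40 is below the smallest normal single (about 1.18e-38), so it is flushed.
flushed = flush_denormal_single(float_to_bits(1e-40))
# 1.0 is a normal number and passes through unchanged.
unchanged = flush_denormal_single(float_to_bits(1.0))
```

The double precision case would follow the same pattern with an 11-bit exponent and a 52-bit fraction.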
An FPU operation is started by providing the operands, opcode and rounding mode on a rising clock edge. The result and the exception flags will be available three clocks later. The FPU is fully pipelined and a new operation can be started every clock cycle. The only exceptions are the FDIV and FSQRT instructions which require between 15 and 24 clock cycles to complete, and which are not pipelined. They are however calculated in a separate non-blocking execution unit, allowing all other operations to be performed in parallel without stalling the FPU pipeline. The table below summarises the throughput and latency of the supported operations:
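
| Operation | Function | Throughput | Latency |
|---|---|---|---|
| FADDS, FADDD, FSUBS, FSUBD, FMULS, FMULD, FSMULD | Add, subtract, multiply | 1 per clock | 3 clocks |
| FITOS, FITOD, FSTOI, FDTOI, FSTOD, FDTOS | Convert between integer, single and double precision formats | 1 per clock | 3 clocks |
| FCMPS, FCMPD, FCMPES, FCMPED | Compare | 1 per clock | 3 clocks |
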
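
The timing described above can be worked through with a small scheduling model. This is a sketch under stated assumptions, not a cycle-accurate description of the GRFPU: operations are issued in order, one per clock; pipelined operations complete three clocks after issue; a divide or square-root occupies the separate unit for `div_latency` clocks (the text gives 15 to 24), and a second divide or square-root is assumed to wait until that unit is free. The function name and its parameters are hypothetical.

```python
PIPELINE_LATENCY = 3  # clocks, as stated for all pipelined operations


def schedule(ops, div_latency=15):
    """Return a list of (op, issue_cycle, done_cycle) for in-order issue.

    Assumption: FDIV*/FSQRT* run in a single non-pipelined unit, so a
    later divide or square-root waits for the earlier one to complete.
    """
    next_issue = 0
    div_unit_free_at = 0
    result = []
    for op in ops:
        if op.startswith(("FDIV", "FSQRT")):
            issue = max(next_issue, div_unit_free_at)
            done = issue + div_latency
            div_unit_free_at = done
        else:
            issue = next_issue
            done = issue + PIPELINE_LATENCY
        result.append((op, issue, done))
        next_issue = issue + 1
    return result


# A divide followed by two adds: the adds are not held up by the divide.
timeline = schedule(["FDIVS", "FADDS", "FADDS"])
```

In this model the two adds finish at cycles 4 and 5 while the divide is still in progress, which is the non-blocking behaviour the text describes.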
The GRFPU core has been extensively validated with a large set of test vectors. Special test programs such as TestFloat, UCBTEST and IEEE CC754 have been used, as well as floating-point based application software.
LEON FPU Control Unit
The GRFPU can be attached to LEON2 and LEON3 processors through the LEON FPU Control unit (GRFPC). The control unit receives SPARC FPU instructions (FPOPs) from the LEON integer unit and schedules them for execution by the FPU. The FPOPs are executed in parallel with other integer instructions; the LEON pipeline is stalled only in case of operand or resource conflicts. The GRFPC also includes the FPU register file, the processor floating-point status register (FSR) and a deferred trap queue. The GRFPC is available for all versions of the LEON processor.
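
The stall condition mentioned above can be pictured as a simple scoreboard check. The sketch below is a generic illustration only: the function name, the register numbering and the notion of a set of pending destination registers are all assumptions, and the actual GRFPC scheduling logic is not described in this text.

```python
from typing import Iterable, Set


def must_stall(sources: Iterable[int], dest: int,
               pending_dests: Set[int], unit_busy: bool) -> bool:
    """Decide whether a new FPOP has to wait (hypothetical model).

    Operand conflict: the new operation reads or writes a register that
    an operation still in flight will write.
    Resource conflict: the execution unit it needs is occupied.
    """
    operand_conflict = dest in pending_dests or any(
        reg in pending_dests for reg in sources
    )
    return operand_conflict or unit_busy


# An in-flight operation will write f6; a new operation reading f6 must wait.
waits = must_stall(sources=[0, 6], dest=4, pending_dests={6}, unit_busy=False)
# An operation touching only f0, f2 and f4 can proceed.
proceeds = not must_stall(sources=[0, 2], dest=4, pending_dests={6}, unit_busy=False)
```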
The GRFPC requires approximately 4,000 LUTs on a Virtex-II FPGA or 20 kgates on a typical 0.13 um process.
The fault-tolerant version of GRFPU and GRFPC includes SEU protection by design. The FPU register file is protected using (39,7) BCH coding, while all other registers are protected with TMR.
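
The principle behind TMR (triple modular redundancy) is that each register is stored three times and read through a majority voter, so a single-event upset in one copy is outvoted by the other two. The following is a minimal sketch of a bitwise majority vote, offered as an illustration of the technique rather than of the GRFPU's circuitry; the function name is hypothetical.

```python
def tmr_vote(a: int, b: int, c: int) -> int:
    """Return the bitwise majority of three redundant register copies."""
    return (a & b) | (a & c) | (b & c)


stored = 0x3FF00000          # value held in all three copies
upset = stored ^ (1 << 7)    # one copy with a single flipped bit
corrected = tmr_vote(stored, upset, stored)
```

The vote recovers the original value as long as at most one copy of any given bit is corrupted.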
For evaluation purposes, Xilinx and Altera netlists of GRFPC/GRFPU for LEON2/3 are available from the LEON download page.
GRFPU and GRFPC are available immediately and licensed together.
The GRFPU has been used in several critical applications, in particular in the AGGA3 GPS/Galileo device and the COLE spacecraft controller ASIC.