Programming the Floating Point Unit (FPU)
Hardware
The numeric coprocessor (FPU) has eight 80-bit registers named ST0, ST1, ..., ST7. These registers are organized as a stack (called fstack below), but FPU instructions can also address any particular register directly. ST0 is the top of the stack.
Register |
---|
ST0 |
ST1 |
ST2 |
ST3 |
ST4 |
ST5 |
ST6 |
ST7 |
An attempt to push more than 8 values into fstack results in an exception.
Instr. | Comments |
---|---|
FINIT | initialize the FPU and clear fstack |
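The stack discipline described above can be sketched as a small software model in C. This is only an illustration: the type `FStack` and the functions `fstack_init`, `fstack_push`, `fstack_pop`, and `fstack_st` are hypothetical names, and the real hardware tracks the top of the stack differently.

```c
#include <stdbool.h>
#include <stddef.h>

#define FSTACK_SIZE 8   /* the FPU has eight registers, ST0..ST7 */

/* Hypothetical software model of fstack (illustration only). */
typedef struct {
    long double reg[FSTACK_SIZE];
    int depth;          /* number of values currently on fstack */
} FStack;

/* Models the stack effect of FINIT: fstack becomes empty. */
void fstack_init(FStack *s) {
    s->depth = 0;
}

/* Models a push such as FLD; returns false where the FPU would
   raise an exception (an attempt to push a 9th value). */
bool fstack_push(FStack *s, long double v) {
    if (s->depth == FSTACK_SIZE)
        return false;
    s->reg[s->depth++] = v;
    return true;
}

/* Models the pop done by instructions such as FSTP;
   returns false if fstack is already empty. */
bool fstack_pop(FStack *s) {
    if (s->depth == 0)
        return false;
    s->depth--;
    return true;
}

/* Returns a pointer to STn, or NULL if STn holds no value.
   ST0 is always the most recently pushed value. */
long double *fstack_st(FStack *s, int n) {
    if (n < 0 || n >= s->depth)
        return NULL;
    return &s->reg[s->depth - 1 - n];
}
```

In this model, the register names are relative to the top: after a push, the old ST0 becomes ST1, and after a pop, the old ST1 becomes ST0.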
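The FPU has its own 16-bit status register. The comparison operations set particular bits of this register (the condition codes C0, C2, and C3). However, the FPU status register cannot be used directly for conditional jumps; its bits must first be copied into the CPU's FLAGS register.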
Instr. | Comments |
---|---|
FSTSW mem | copy status reg to mem word |
FSTSW AX | copy status reg to AX |
SAHF | store AH into the low byte of the FLAGS reg |
LAHF | load the low byte of the FLAGS reg into AH |
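The path from the status register to FLAGS can be modeled in C as plain bit manipulation. The sketch below assumes the standard bit positions (C0, C2, C3 at bits 8, 10, 14 of the status word; CF, PF, ZF at bits 0, 2, 6 of FLAGS); the function name `status_to_flags` is hypothetical.

```c
#include <stdint.h>

/* Condition-code bits in the 16-bit FPU status word. */
#define FPU_C0 (1u << 8)
#define FPU_C2 (1u << 10)
#define FPU_C3 (1u << 14)

/* Flag bits in the low byte of FLAGS. */
#define FLAGS_CF (1u << 0)
#define FLAGS_PF (1u << 2)
#define FLAGS_ZF (1u << 6)

/* Models FSTSW AX followed by SAHF.
   Returns the SF, ZF, AF, PF, and CF bits as SAHF would load them. */
uint8_t status_to_flags(uint16_t status_word) {
    uint16_t ax = status_word;         /* FSTSW AX: AX = status word   */
    uint8_t ah = (uint8_t)(ax >> 8);   /* AH is the upper byte of AX   */
    return (uint8_t)(ah & 0xD5u);      /* SAHF: bits 7, 6, 4, 2, 0 of AH
                                          go to SF, ZF, AF, PF, CF     */
}
```

Because the condition codes sit in the upper byte of the status word, they line up with the flag positions that SAHF writes: C0 lands in CF, C2 in PF, and C3 in ZF.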
The FPU also has a 16-bit control register. This register determines, for example, how a floating-point number in the FPU is converted to an integer representation. By default, the number is rounded to the nearest integer (ties go to the nearest even integer).
Instr. | Comments |
---|---|
FSTCW mem | copy control reg to mem word |
FLDCW mem | load a mem word into the control reg |
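Changing the rounding mode is a read-modify-write of the control word: FSTCW stores it to memory, the program edits the rounding-control field, and FLDCW loads it back. The C sketch below models only the editing step, assuming the rounding-control field occupies bits 10-11 of the control word; the names `cw_get_rounding` and `cw_set_rounding` are hypothetical.

```c
#include <stdint.h>

/* Rounding-control (RC) field values, bits 10-11 of the control word. */
enum fpu_rounding {
    FPU_RC_NEAREST = 0,  /* round to nearest (the default) */
    FPU_RC_DOWN    = 1,  /* round toward -infinity         */
    FPU_RC_UP      = 2,  /* round toward +infinity         */
    FPU_RC_TRUNC   = 3   /* round toward zero (truncate)   */
};

#define FPU_RC_SHIFT 10
#define FPU_RC_MASK  (3u << FPU_RC_SHIFT)

/* Extracts the rounding mode from a control word saved by FSTCW. */
enum fpu_rounding cw_get_rounding(uint16_t cw) {
    return (enum fpu_rounding)((cw & FPU_RC_MASK) >> FPU_RC_SHIFT);
}

/* Returns a copy of cw with the rounding mode replaced; all other
   bits are preserved. The result is what FLDCW would then load. */
uint16_t cw_set_rounding(uint16_t cw, enum fpu_rounding rc) {
    return (uint16_t)((cw & ~FPU_RC_MASK) | ((unsigned)rc << FPU_RC_SHIFT));
}
```

For example, the control word after FINIT is 0x037F (round to nearest); setting the truncation mode on that value gives 0x0F7F.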
Loading and Storing instructions
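In the following instructions, src is an FPU register or a memory location holding a floating-point dword, qword, or tword (32, 64, or 80 bits), and mem is a memory location of an integer word, dword, or qword. Two forms are narrower: FST does not accept a tword, and FIST does not accept a qword (the popping forms FSTP and FISTP do).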
Loading instr. | Comments | Storing instr. | Comments |
---|---|---|---|
FLD src | push src into fstack | FST src | copy ST0 to src |
FILD mem | push an integer mem into fstack | FSTP src | copy ST0 to src, then pop |
FLD1 | push 1.0 into fstack | FIST mem | copy ST0 as int to mem |
FLDZ | push 0.0 into fstack | FISTP mem | copy ST0 as int to mem, then pop |
FLDPI | push pi (3.14...) into fstack | | |
FLDL2E | push log2(e) into fstack | FXCH STn | exchange ST0 and STn |
FLDL2T | push log2(10) into fstack | FXCHP STn | exchange ST0 and STn, then pop |
FLDLG2 | push log10(2) into fstack | | |
FLDLN2 | push ln(2) into fstack | | |
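It is easily seen that the FXCHP STn instruction is equivalent to FSTP STn: after the exchange, the old value of ST0 sits in STn, and the pop discards whatever the exchange moved into ST0.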
Addition and Multiplication instructions
In the following instructions, STn is an FPU register, mem is a memory location of an integer word or dword, and src is an FPU register or a memory location of a floating-point dword or qword.
Addition instr. | Comments | Multiplication instr. | Comments |
---|---|---|---|
FADD src | ST0 += src | FMUL src | ST0 *= src |
FADD STn, ST0 | STn += ST0 | FMUL STn, ST0 | STn *= ST0 |
FADDP STn, ST0 | STn += ST0, then pop | FMULP STn, ST0 | STn *= ST0, then pop |
FIADD mem | ST0 += (float)mem | FIMUL mem | ST0 *= (float)mem |
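As an illustration of how the load, multiply, and popping-add instructions combine, the following C sketch models a dot product computed on fstack. The variables `st0` and `st1` stand for the two topmost registers, the comments name the instruction each step models, and the function name `dot_model` is hypothetical. The model uses `double`, whereas the FPU itself keeps 80-bit intermediates.

```c
/* Models: FLDZ, then per element FLD x[i] / FMUL y[i] / FADDP ST1, ST0. */
double dot_model(const double *x, const double *y, int n) {
    double st0 = 0.0;   /* FLDZ: ST0 = 0.0, the running sum            */
    double st1 = 0.0;   /* second register; holds the sum after a push */

    for (int i = 0; i < n; i++) {
        st1 = st0;      /* FLD x[i]: push, so the sum moves to ST1 ... */
        st0 = x[i];     /* ... and ST0 = x[i]                          */
        st0 *= y[i];    /* FMUL y[i]: ST0 *= y[i]                      */
        st1 += st0;     /* FADDP ST1, ST0: ST1 += ST0 ...              */
        st0 = st1;      /* ... then pop, so the sum is ST0 again       */
    }
    return st0;         /* FSTP result would store and pop the sum     */
}
```

The popping form FADDP keeps the stack depth constant across iterations, so the loop never risks overflowing fstack.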
Subtraction and Division instructions
In the following instructions, STn is an FPU register, src is an FPU register or a memory location of a floating-point dword or qword, and imem is a memory location of an integer word or dword.
Subtraction instr. | Comments | Division instr. | Comments |
---|---|---|---|
FSUB src | ST0 -= src | FDIV src | ST0 /= src |
FSUBR src | ST0 = src - ST0 | FDIVR src | ST0 = src / ST0 |
FSUB STn, ST0 | STn -= ST0 | FDIV STn, ST0 | STn /= ST0 |
FSUBR STn, ST0 | STn = ST0 - STn | FDIVR STn, ST0 | STn = ST0 / STn |
FSUBP STn, ST0 | STn -= ST0, then pop | FDIVP STn, ST0 | STn /= ST0, then pop |
FSUBRP STn, ST0 | STn = ST0 - STn, then pop | FDIVRP STn, ST0 | STn = ST0 / STn, then pop |
FISUB imem | ST0 -= (float)imem | FIDIV imem | ST0 /= (float)imem |
FISUBR imem | ST0 = (float)imem - ST0 | FIDIVR imem | ST0 = (float)imem / ST0 |
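Subtraction and division are not commutative, which is why the reversed forms exist: they let the operands sit on fstack in either order. The C sketch below models computing (a - b) / c when c was pushed first; `st0` and `st1` stand for the two topmost registers, and the function names are hypothetical.

```c
/* c is pushed first, then a - b is formed in ST0.
   FDIVRP ST1, ST0 yields (a - b) / c. */
double eval_with_fdivrp(double a, double b, double c) {
    double st0, st1;
    st0 = c;            /* FLD c: ST0 = c                             */
    st1 = st0;          /* FLD a: push, so c moves to ST1 ...         */
    st0 = a;            /* ... and ST0 = a                            */
    st0 -= b;           /* FSUB b: ST0 = a - b                        */
    st1 = st0 / st1;    /* FDIVRP ST1, ST0: ST1 = ST0 / ST1 ...       */
    st0 = st1;          /* ... then pop                               */
    return st0;
}

/* Same setup, but the non-reversed FDIVP ST1, ST0 yields c / (a - b). */
double eval_with_fdivp(double a, double b, double c) {
    double st0, st1;
    st0 = c;            /* FLD c                                      */
    st1 = st0;          /* FLD a: push                                */
    st0 = a;
    st0 -= b;           /* FSUB b: ST0 = a - b                        */
    st1 = st1 / st0;    /* FDIVP ST1, ST0: ST1 = ST1 / ST0 ...        */
    st0 = st1;          /* ... then pop                               */
    return st0;
}
```

With a = 7, b = 1, c = 3, the first function models a result of 2.0 and the second a result of 0.5, so the choice between FDIVP and FDIVRP depends on which operand was loaded first.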
Comparison instructions
In the following instructions, src is an FPU register or a memory location of a floating-point dword or qword, and imem is a memory location of an integer word or dword.
Instr. | Comments |
---|---|
FCOM src | compare ST0 and src |
FCOMP src | compare ST0 and src, then pop |
FCOMPP | compare ST0 and ST1, then pop twice |
FICOM imem | compare ST0 and (float)imem |
FICOMP imem | compare ST0 and (float)imem, then pop |
FTST | compare ST0 and 0 |
Starting with the Pentium Pro, the CPU also supports two comparison instructions that set the CPU's FLAGS register directly, so no FSTSW and SAHF are needed.
Instr. | Comments |
---|---|
FCOMI STn | compare ST0 and STn |
FCOMIP STn | compare ST0 and STn, then pop |
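After FSTSW AX and SAHF, condition bit C0 lands in CF, C2 in PF, and C3 in ZF; FCOMI and FCOMIP set the same three flags directly. The result is therefore analogous to a comparison of two unsigned numbers, so the unsigned conditional jumps (JA, JAE, JB, JBE, JE, JNE) apply. PF is set when the operands are unordered, that is, when at least one of them is NaN.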
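The flag outcomes of a comparison can be modeled in C as follows. The sketch assumes the usual encoding (all three flags clear for "greater", CF for "less", ZF for "equal", all three set for "unordered"); the type `CmpFlags` and the function names are hypothetical.

```c
#include <math.h>
#include <stdbool.h>

typedef struct {
    bool zf, pf, cf;
} CmpFlags;

/* Models FCOMI ST0, STn (or FCOM followed by FSTSW AX and SAHF). */
CmpFlags fcomi_model(double st0, double stn) {
    CmpFlags f = { false, false, false };   /* ST0 > STn: all clear */
    if (isnan(st0) || isnan(stn)) {
        f.zf = f.pf = f.cf = true;          /* unordered            */
    } else if (st0 < stn) {
        f.cf = true;                        /* ST0 < STn            */
    } else if (st0 == stn) {
        f.zf = true;                        /* ST0 == STn           */
    }
    return f;
}

/* Conditions tested by the unsigned conditional jumps. */
bool would_take_ja(CmpFlags f) { return !f.cf && !f.zf; }  /* JA */
bool would_take_jb(CmpFlags f) { return f.cf; }            /* JB */
bool would_take_je(CmpFlags f) { return f.zf; }            /* JE */
bool would_take_jp(CmpFlags f) { return f.pf; }            /* JP */
```

Note that in the unordered case both JB and JE would be taken, so code that must handle NaN should test PF (for example with JP) before the other jumps.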
Miscellaneous instructions
Instr. | Comments |
---|---|
FCHS | ST0 = -ST0 |
FABS | ST0 = |ST0| |
FSQRT | ST0 = sqrt(ST0) |
FRNDINT | round ST0 to integer |
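FSCALE | ST0 = ST0 * 2^trunc(ST1) |
The FSCALE instruction is used for a quick multiplication of ST0 by a power of 2. The exponent is ST1 truncated toward zero, which matches floor(ST1) only when ST1 is non-negative or already an integer. The value of ST1 is not removed from fstack.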
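The effect of FSCALE can be sketched in C. The function name `fscale_model` is hypothetical, and the sketch assumes ST1 is small enough to fit in an `int`; it ignores overflow, underflow, and special values such as NaN and infinity.

```c
/* Models FSCALE: returns st0 * 2^trunc(st1). */
long double fscale_model(long double st0, long double st1) {
    int k = (int)st1;               /* C conversion truncates toward zero */
    long double r = st0;
    for (; k > 0; k--) r *= 2.0L;   /* positive exponent: double k times  */
    for (; k < 0; k++) r *= 0.5L;   /* negative exponent: halve |k| times */
    return r;
}
```

For instance, with ST0 = 3.0 and ST1 = -1.5 the model gives 1.5 (the exponent is -1), whereas a floor-based rule would use -2 and give 0.75.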