Andy Glew's comp-arch.net wiki, http://semipublic.comp-arch.net
If you are reading this elsewhere, e.g. at site waboba.info, it is an unauthorized copy, and probably a malware site.
comp-arch.net wiki on hold from October 17, 2011
Fixed Point
From CompArch
Fixed point formats are scaled integer formats. Instead of floating point, where the exponent scales the number, the binary point is at a specified, usually implicit, position inside the number.
Contents |
Examples of Fixed Point
16.16 fixed point - 32 bit 2's complement integer, signed, scaled by 16 bits (2^16).
Versus s15.16 - which I interpret as sign magnitude format.
Importance: OpenGL embedded, GL ES 1.0, GL embedded uses 16.16.
Terminology Confusion
Terminology confusion: IBM uses the term fixed point to refer to all integer operations. I, on the other hand, distinguish
- integer - binary point below the least significant bit
- binary fraction - binary point above the most significant bit
- fixed point - binary point somewhere in the middle
Although I suppose it is reasonable to think of integer and binary fractions as being subsets of fixed point.
Fixed Point Arithmetic
Addition is the same for all fixed point: integer, binary fraction, and fixed point in general.
Multiply is where you see the differences. The product of two N bit numbers is, in general, a 2N bit number.
If the fixed point is x.y, where x+y = N, then the product is 2x.2y.
Or, if you wish to multiply two fixed point numbers and end up with the same fixed point format, you must shift the double width product by y bits, the number of bits in the fixed point fraction. I.e. extract bits y+N:y of the double width product.
Note that rounding may need occur with respect to the lower bits that are shifted out. Where "need to occur" may mean that we provide the usual instruction set support for rounding modes, which is pretty much the same for fixed point as for floating point, i.e. round-{to,away-from}-{0,-inf,nearest-even}.
The more traditional issues with overflow also raise their head. With the usual observation that signed versus unsigned only matters for the high bits of an integer product.
Fixed Point Arithmetic Instructions
- MUL-SHIFT
- dest := (A*B) >> shift
- MUL-SHIFT-ADD
- dest := ((A*B) >> shift) + C
- MUL-SHIFT-ACC
- dest += ((A*B) >> shift)
- with rounding modes
- and frequently with provision of operands of different formats
- frequently, the accumulator is double width, or M+N bit if the operands are of different formats
- and a different binary point position.
- note that it is not necessary to specify the fixed point format of each operand separately - all gets rolled into the single shift count
Binary Fraction Indexed Addressing
Occasionally one hears about complicated addressing modes using integer multiply:
- Mem[ reg1_index*reg2_stride + reg3_base + constant_base_or_offset ]
This is usually considered overkill compared to the prototypical:
- Mem[ reg_base + reg1_index*scale + constant_base_or_offset ]
You will occasionally hear about fixed point, typically binary fraction, addressing modes
- Mem[ BASE + reg_fraction*SIZE ]
This is typically used for lookup tables. There are the usual issue with rounding that you have with any fixed point multiplication.
Note that I have said BASE and SIZE, leaving the option of register or constant for either.
For much the same reason, you will occasionally see proposals for memory addressing modes using floating point values in the address calculation: You will occasionally hear about fixed point, typically binary fraction, addressing modes
- Mem[ BASE + reg_fp_fraction*SIZE ]
Although I have said "fraction" above, the user may scale the value, allowing the SIZE operand to have fewer bits if a constant is embedded in the instruction.
See
http://jet.ro/files/The_neglected_art_of_Fixed_Point_arithmetic_20060913.pdf