This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

- Author:
- Joachim Schimpf
- Date:
- 14 Oct 2009
- Modified:
- 20 Oct 2009

- negative zeros
- negative zeros + infinity
- negative zeros + infinity + NaN
- none of the above
- fully featured IEC 559/IEEE 754 arithmetic
- different sizes of mantissa and exponent
- binary or decimal float representations
- ...

- accommodate different floating-point types in one system
- accommodate non-sign-magnitude representations
- define a full interface required for IEEE 754 conformity (but make sure this can be done via implementation-specific additions)

- Change 7.9.2 and 5.5.10 to allow implementation defined choice between exception raising and the delivery of continuation values, see here and here
- Correct mistakes in exception specifications in section 9, see here
- Add float-related flags, see here
- Add copysign/2 function, see here
- Add nexttoward/2 function, see here
- Add syntax definitions for infinity and NaN, see here
- Change wording in section 8.7.2 to allow for exact comparison, see here
- Amend 8.13.1.3 to accomodate floating point overflow, see here

- r - radix
- p - precision
- emin - smallest exponent
- emax - largest exponent
- denorm - whether denormalised numbers are contained

- fmax - largest representable finite number
- fmin - smallest representable positive number
- epsilon - maximum relative error r^(1-p)

The semantics of a language arithmetic is defined in terms of operations on F, which in turn are defined in terms of the common mathematical operations on R (of which F is a subset), plus a rounding function. Since infinity and NaN are not members of R, they cannot be covered this way, nor can signed zeros be distinguished. This is an intrinsic problem with basing the Prolog semantics on LIA.

Floating point results in LIA can be numbers in F, or 'exceptional
values' like *float_overflow*, *undefined*, *underflow*
(but these are just conceptual things).
LIA section 6 allows different 'notifications' of these 'exceptional
values', among them the raising of language exceptions, but also
the production of 'continuation values' which are not members of F.
This is where infinities and NaNs fit in: they are implementation-defined
ways of providing notification of the conceptual exceptions.
So, a language implementation is allowed to do this by defining which
'continuation values' are produced for which exception (to do this
with IEEE arithmetic is rather straightforward).

The Prolog standard however forbids this by stating in 7.9.2 and 9.1.2 that an 'exceptional value' result E gives rise to an evaluation_error(E). This must be changed.

Once 'continuation values' are introduced, they can occur as arguments of arithmetic operations. The problem is then to define the operations in these cases. LIA is no help here, since it only describes operation on F. We are faced with 3 alternatives:

- say that all handling of non-F argument values is defined in IEEE 754 (with the straightforward mapping between IEEE exceptions and Prolog/LIA exceptions)
- define it in the Prolog standard explicitly (lengthy)
- make it implementation-defined

**
Suggestion: Explicitly refer to IEEE 754 for the specification of
behaviour with non-F values. For operations not in IEEE 754,
define it in the Prolog standard.
**

This means that a Prolog does not have to implement IEEE 754 arithmetic, but if it provides infinities and NaNs, they must behave like IEEE 754 ones for the operations that are relevant for Prolog. This seems a more helpful solution than leaving it implementation-defined.

- float_overflow (IEEE 754 OVERFLOW)
- zero_divisor (IEEE 754 DIVIDEBYZERO, not in LIA)
- underflow (IEEE 754 UNDERFLOW)
- undefined (IEEE 754 INVALID)

** Suggestion: change wording to make evaluation_errors implementation
defined.**

The implementation shall have the option of raising evaluation_error(E) or continuing with an implementation-defined 'continuation value'. For each exception individually, an implementation can choose between

- evaluation_error(float_overflow) and returning infinity
- evaluation_error(zero_divisor) and returning infinity
- evaluation_error(undefined) and returning NaN (or a member of a different type, like complex number)
- evaluation_error(underflow) and returning a subnormal or zero

An *undefined* exception can occur with:

- sqrt(Neg)
- as before
- log(ZeroNeg)
- change: log(0) now to
*zero_divisor* - Neg**Nonint
- as before
- 0*inf
- to be added
- 0/0
- was
*zero_divisor*in 13211-1 - inf/inf
- to be added
- inf-inf
- to be added
- inf+-inf
- to be added

A *zero_divisor* exception can occur with:

- NonZero/0
- change: 0/0 now
*undefined* - log(0)
- was
*undefined*exception in 13211-1 - 0**Neg
- was
*undefined*exception in 13211-1 - atanh(+-1)
- to be added, if implemented

*underflow* exceptions can occur in X**Y, add, sub, mul, div.

**Corrections:**

- 0.0/0.0 must raise
*undefined*rather than*zero_divisor* - log(0.0) must raise
*zero_divisor*rather than*undefined* - 0.0^Neg must raise
*zero_divisor*rather than*undefined*

- arithmetically compare equal -0.0 =:= 0.0, i.e. use the comparison defined by IEEE 754
- syntactically compare unequal -0.0 \== 0.0
- term order is -0.0 @< 0.0
- sign must be preserved by read and canonical write

- have a canonical textual form, see here
- in arithmetic behave as required by the IEEE 754
- otherwise behave as additional constants of type float
- the term order of floats extends naturally

- have a canonical textual form, see here
- in arithmetic behave as required by IEEE 754
- otherwise behave as additional constants of type float
- have an implementation-defined term order, and be individually distinguishable in =/2 and ==/2.

In my opinion, while infinities are very useful to have in Prolog because they extend the computation domain in a natural way, NaNs fit much less naturally. Binding a variable to a special value that says "there is no solution" isn't the Prolog way, the logical behaviour is to fail when there is no solution (although the logical failure might be replaced by an exception for practical purposes, analogous to type failures/errors). Adding NaNs seems to create far more problems than it solves, e.g. regarding comparisons.

?- 18014398509481985 > 18014398509481984.0. No ?- 18014398509481985 < 18014398509481984.0. No ?- 18014398509481983 > 18014398509481984.0. No ?- 18014398509481983 < 18014398509481984.0. NoInfinities are a special case of this:

?- 2^1024 < 1.0Inf. NoThis is because the integer is inexactly converted to float, then the comparison is done with floats. This semantics is prescribed by the standard (section 8.7.2). It would be possible (and advocated in [OKPL]) to require, or at least allow, exact comparison instead, which means changing section 8.7.2.

Another way to fix it is to raise an exception on inexact (implicit) conversion, i.e. earlier than overflow. The above examples would then raise an exception, which the programmer can circumvent by an explicit float/1 conversion. It would not give a way to get the correct result though.

**
Suggestion: change section 8.7.2 to allow all of current behaviour, inexact
exception, or exact comparison.
**

min(0.0,-0.0) -0.0 see below min(-0.0,0.0) -0.0 see below max(0.0,-0.0) 0.0 max(-0.0,0.0) 0.0 max(1.0Inf,Any) 1.0Inf min(1.0Inf,Any) Any max(-1.0Inf,Any) Any min(-1.0Inf,Any) 1.0Inf min(NaN,Any) Any see below max(NaN,Any) Any see belowIEEE 754 (2008) relaxes min(-0.0,0.0), not specifying the sign of the result! This is probably not advisable for Prolog, where the zeros can easily be told apart.

IEEE 754 (2008) specifies minNum() and maxNum() in this way, i.e. preferring the non-NaN argument. [KAH] says there are good reasons to do so, but that they are disputed.

0.0 -0.0 1.0Inf -1.0Inf NaN Remark Standard Unary Functions +/1 0.0 -0.0 1.0Inf -1.0Inf NaN -/1 -0.0 0.0 -1.0Inf 1.0Inf NaN abs/1 0.0 0.0 1.0Inf 1.0Inf NaN sqrt/1 0.0 -0.0 1.0Inf Ex/NaN NaN (1) sign/1 0.0 0.0? 1.0 -1.0 NaN (see below) float_i_part/1 0.0 -0.0 1.0Inf -1.0Inf NaN float_f_part/1 0.0 -0.0 0.0 0.0 NaN float/1 0.0 -0.0 1.0Inf -1.0Inf NaN sin/1 0.0 -0.0 Ex/NaN Ex/NaN NaN cos/1 1.0 1.0 Ex/NaN Ex/NaN NaN tan/1 0.0 -0.0 Ex/NaN Ex/NaN NaN asin/1 0.0 -0.0 Ex/NaN Ex/NaN NaN acos/1 pi/2 pi/2 Ex/NaN Ex/NaN NaN atan/1 0.0 -0.0 pi/2 -pi/2 NaN exp/1 1.0 1.0 1.0Inf 0.0 NaN log/1 -1.0Inf -1.0Inf 1.0Inf Ex/NaN NaN (4) floor/1 0 0 Ex Ex Ex (2),(5) truncate/1 0 0 Ex Ex Ex (5) round/1 0 0 Ex Ex Ex (5) ceiling/1 0 0 Ex Ex Ex (5) Non-standard unary functions (for illustration) (5) ffloor/1 0.0 -0.0 1.0Inf -1.0Inf NaN (2) ftruncate/1 0.0 -0.0 1.0Inf -1.0Inf NaN (3) fround/1 0.0 -0.0 1.0Inf -1.0Inf NaN (3) fceiling/1 0.0 -0.0 1.0Inf -1.0Inf NaN (3)Remarks:

- (1)
- sqrt(-0.0) is defined as -0.0 in
[sqrt],
i.e. it does not behave like a small negative number (which would
be
*undefined*). - (2)
- floor(-0.0) is defined as returning -0.0 (rather than -1.0) in [floor], i.e. it does not behave like a small negative number.
- (3)
- ceil/trunc/round(-0.0) are all defined as -0.0, not 0.0 in [ceil], [trunc], [round].
- (4)
- log(-0.0) is -1.0Inf (rather than domain error) according to
[log],
i.e. it does not behave like a small negative number (which would
be
*undefined*). - (5)
- It is regrettable that the standard defines floor, ceiling, truncate and round with an implicit conversion to integer type, because this makes them partial functions. We would have been better off having these return floats (or whatever the argument type was), and have a separate pure type conversion to integer. It's probably too late to change that. I also seem to recall that it was in an earlier standard draft, but was later rejected.

13211-1 SP3.11 SWI5.6/Yap5.1/ECL6.0 sign(5) 1 1 1 sign(0) 0 0 0 sign(-5) -1 -1 -1 sign(5.0) 1.0 1.0 1 sign(0.0) 0.0 0.0 0 sign(-0.0) unspec 0.0 0 sign(-5.0) -1.0 -1.0 -1Unlike earlier drafts, the final 13211-1 specifies 0.0 (rather than 1.0) as the result of sign(0.0). This means we cannot simulate copysign(X,Y) via X*sign(Y), in particular not for copysign(X, -0.0). This was modelled after the signF operation in LIA. LIA changed its mind in a recent draft update (2007) and replaced signF with a function signumF wich returns 1.0 (for positive numbers and 0.0) and -1.0 (for negative numbers and -0.0).

How to specify sign(-0.0) is a bit of a dilemma. Returning 0.0 is nice in the sense that there are only 3 discrete sign values, like in the integer case (but why is the result then not integer -1,0,1 in the first place?). On the other hand, simply losing the sign bit of -0.0 looks wrong for a float to float operation. That's probably a reason why IEEE 754 does not define such an operation. As a last resort, it could be left implementation-defined.

[remark added 2011-03-08: Java defines signum() as returning -0.0 for -0.0, which would be a way to subtly improve the current sign/1, but is still no replacement for copysign/2.]

+(-0.0,0.0) 0.0 +(0.0,-0.0) 0.0 +(-0.0,-0.0) -0.0 +(-0.0,NZ) NZ (NZ nonzero) +(1.0Inf,-1.0Inf) Ex/NaN +(1.0Inf,NotInf) 1.0Inf +(-1.0Inf,1.0Inf) Ex/NaN +(-1.0Inf,NotInf) -1.0Inf +(NaN,Any) NaN (Any is not NaN)

-(0.0,0.0) 0.0 -(-0.0,0.0) -0.0 -(0.0,-0.0) 0.0 -(-0.0,-0.0) 0.0 -(-0.0,NZ) -NZ -(NZ,-0.0) NZ -(-1.0Inf,-1.0Inf) Ex/NaN -(1.0Inf,1.0Inf) Ex/NaN -(1.0Inf,-1.0Inf) 1.0Inf -(-1.0Inf,1.0Inf) -1.0Inf -(1.0Inf,NotInf) 1.0Inf -(NotInf,1.0Inf) -1.0Inf (NotInf is not infinity) -(-1.0Inf,NotInf) -1.0Inf -(NotInf,-1.0Inf) 1.0Inf -(NaN,Any) NaN -(Any,NaN) NaN

*(-0.0,Pos) -0.0 *(-0.0,Neg) 0.0 *(1.0Inf,1.0Inf) 1.0Inf *(1.0Inf,-1.0Inf) -1.0Inf *(-1.0Inf,-1.0Inf) 1.0Inf *(1.0Inf,NotInf) 1.0Inf *(-1.0Inf,NotInf) -1.0Inf *(NaN,Any) NaN

/(-0.0,0.0) Ex/NaN /(-0.0,-0.0) Ex/NaN /(-0.0,PNZ) -0.0 (PNZ positive nonzero) /(-0.0,NNZ) 0.0 (NNZ negative nonzero) /(PNZ,-0.0) -1.0Inf /(NNZ,-0.0) 1.0Inf /(-1.0Inf,-1.0Inf) Ex/NaN /(1.0Inf,1.0Inf) Ex/NaN /(1.0Inf,-1.0Inf) Ex/NaN /(-1.0Inf,1.0Inf) Ex/NaN /(1.0Inf,PNZF) 1.0Inf (PNZF pos nonzero finite) /(PNZF,1.0Inf) 0.0 /(-1.0Inf,PNZF) -1.0Inf /(PNZF,-1.0Inf) -0.0 /(1.0Inf,NNZF) -1.0Inf (NNZF neg nonzero finite) /(NNZF,1.0Inf) -0.0 /(-1.0Inf,NNZF) 1.0Inf /(NNZF,-1.0Inf) 0.0 /(NaN,Any) NaN /(Any,NaN) NaN

Negative zeros occur as results as specified in IEEE 754, e.g.

-0.0 is 0.0 * -1.0. -0.0 is 0.0 / -1.0.and additionally in

float_integer_part(-0.01) -0.0 float_fractional_part(-2.0) -0.0 float_integer_part(-0.0) -0.0 float_fractional_part(-0.0) -0.0The latter two result from the requirement that

X is float_integer_part(X) + float_fractional_part(X).

- The Language Independent Arithmetic Standard [LIA] to which the Prolog standard refers.
- IEEE 754 (2008) floating point standard
- Richard O'Keefe's library environ.pl [OKE] which defines values describing among others the floating point properties of a Prolog implementation.
- The 2009 core update draft which suggested the flags float_mantissa_digits, float_min_exponent, float_max_exponent and float_epsilon.

INCLUSION IN STANDARD RECOMMENDED | ||
---|---|---|

Flag | Example (double) | Comment |

float_radix | 2 | Radix of the representation [LIA] |

float_precision | 53 | Digits in radix base [LIA] |

float_emin | -1022 | Smallest exponent in radix base [LIA] |

float_emax | 1023 | Largest exponent in radix base [LIA] |

float_denorm | true | Whether denormalised values are used [LIA] |

float_iec_559 | true | Flag required by [LIA]. Indicates full conformity to IEC 559 (IEEE 754) |

float_min | 2.2250738585072014e-308 | Smallest representable positive number. Can be computed from the above, but tediously. Can be computed as nexttoward(0.0,1.0) |

float_max | 1.7976931348623157e+308 | Largest representable finite number. Can be computed from the above, but tediously. Can be computed as nexttoward(1.0Inf,0.0) if infinity is supported. Flag needed to raise representation_error(float_max) in term reading. |

float_max_integer | 9007199254740992.0 | Largest integer that can be represented accurately. Can be computed as radix**precision |

float_overflow | infinity | Behaviour on float_overflow during computation or reading (error, infinity) [OKE] |

float_zero_div | infinity | Behaviour on zero_divisor exception (error, infinity) |

float_undefined | error | Behaviour on undefined result (error, nan) |

float_underflow | ignore | Behaviour on underflow (ignore, error) |

float_rounding | to_nearest | Rounding mode (to_nearest, to_nearest_tta, to_positive, to_negative, to_zero). May be writeable. |

MEANINGFUL, BUT INCLUSION IN STANDARD *NOT* RECOMMENDED | ||

Flag | Example (double) | Comment |

float_inexact | ignore | Behaviour on inexact result (ignore, error). As the standard does not define this exception, this is reserved for implementations that do. |

float_min_integer | -9007199254740992.0 | Can be computed as -max_integer |

float_min_exponent | -307 | Only useful in relation to decimal I/O. Can be computed as ceiling((emin-1)*log(radix,10)) |

float_max_exponent | 308 | Only useful in relation to decimal I/O. Can be computed as floor((emax+1)*log(radix,10)) |

float_epsilon | 2.2204460492503131e-16 | Likely to be misused. nexttoward/2 is more useful. Can be computed as radix**(1-precision), or nexttoward(1.0,2.0)-1.0 |

float_digits | 16 | Number of significant decimal digits. Can be computed more precisely as precision/log(radix,10) |

float_width | 24 | Number of characters required to canonically write a float |

float_format | ieee(double) | A term describing the underlying implementation. Difficult to define the possible values, e.g. when it's 'ieee with some features missing'. |

nexttoward(X,Y) evaluates the expressions X and Y with values VX and VY and computes the next representable floating-point value following VX in the direction of VY. Examples:

nexttoward(1.0,2.0) 1.0000000000000002 nexttoward(1.0,-1.0) 9.99999999999999889e-01 nexttoward(1.0Inf,0.0) 1.7976931348623157e+308 nexttoward(0.0,1.0) 4.94065645841246544e-324 nexttoward(9007199254740992.0,1.0Inf) 9007199254740994.0 nexttoward(1.7976931348623157e+308,1.0Inf) 1.0Inf nexttoward(0.0,-1.0) -4.94065645841246544e-324 nexttoward(-0.0,-1.0) -4.94065645841246544e-324 nexttoward(-4.94065645841246544e-324,1.0) -0.0

copysign(X,Y) evaluates the expressions X and Y with values VX and VY and produces a value with the magnitude of VX and the sign of VY. Examples:

copysign(3.0, 2.0) 3.0 copysign(3.0, 0.0) 3.0 copysign(3.0,-0.0) -3.0 copysign(3.0,-2.0) -3.0 copysign(Any,1.0NaN) Any copysign(Any,-1.0NaN) -Any copysign(Any,PNZ) Any copysign(Any,NNZ) -Any copysign(Any,0.0) Any copysign(Any,-0.0) -AnyNote the significance of the sign on zero and NaNs (copysign doesn't propagate NaNs like other functions).

The canonical syntax for a negative zero shall be **-0.0**.
On input, any representation of a negative number too small to represent
may be rounded to -0.0.
Processors that do not implement negative zeros shall produce positive
zero instead.

The canonical syntax for positive infinity shall be **1.0Inf**, and **-1.0Inf**
for negative infinity. On input, the actual values of the digits are
ignored, so 3.456Inf=1.0Inf, and even 0.0Inf=1.0Inf.
Note that the syntax is chosen not to conflict with other valid Prolog
syntax, in particular, the suffix is required to start with a capital
letter, so that 1.0inf can still serve as postfix syntax for inf(1.0).
Processors that do not implement an infinity representation shall
raise a representation_error(float_overflow) exception during reading.

The canonical syntax for NaNs shall be **1.xxxNaN** or **-1.xxxNaN**.
The value before the NaN suffix is obtained by changing the NaN's
exponent to 0. While this printed representation does not allow
for easy human interpretation of the NaN's bit pattern, it requires
the minimum in terms of syntax extension. Processors that support
NaNs are free to provide an operation to decode NaNs programmatically,
and to support other (non-canonical) output formats.
Again, the NaN suffix shall have exactly this capitalisation,
to avoid syntax conflict with a possible postfix operator.
Processors that do not implement a NaN representation shall
raise a representation_error(float_undefined) exception during reading.

**Suggestion: modify 6.4.5 to allow the above forms, and add the
possible exceptions to the predicates of the read-family**.

SICStus and Yap use +inf as syntax for infinities, which conflicts with the parsing of +(inf) in prefix form. Some degree of compatibility can be achieved by making inf/0 a function returning infinity.

Pitfall: a printed representation that reads back correctly on one system may not do so on another, because of different rounding behaviour of the read routines. The standard can probably interpreted to mean that only the same Prolog processor has to read the value back accurately.

**Suggestion: add additional error condition to 8.14.1.3**

- Reduce implementation-definedness for (//)/2. Require rounding towards zero, and get rid of the integer_rounding_function flag.
- Add the div/2 function that always rounds down and is the division corresponding to mod/2.
- Reduce implementation-definedness for the bitwise operations by defining them to act as if on two's complement numbers, as argued in [OKPL].

- ISO 31211-1 is the Prolog standard from 1995
- [LIA] ISO 10967-1 is the Language Independent Arithmetic standard from 1994. Part 1, Part 2. See also 2007 draft
- IEEE 754 is the standard on floating point numbers, originally from 1985, updated in 2008. IEEE final, Earlier draft.
- Discussion on comparison variants
- [KAH] W.Kahan, Lecture Notes on the Status of IEEE Standard 754 for Binary Floating-Point Arithmetic
- [OKE] library(environment) by Richard A. O'Keefe.
- [OKPL] Richard A. O'Keefe: Am elementary Prolog Library
- Discussion on floats in Prolog comp.lang.prolog also summarised here