From: Richard A. O'Keefe <ok_at_cs.otago.ac.nz>

Date: Fri, 17 Nov 2006 15:29:15 +1300 (NZDT)

Date: Fri, 17 Nov 2006 15:29:15 +1300 (NZDT)

Joachim Schimpf <js10_at_crosscoreop.com> asks

[Why NOT apply floating-point contagion to max/2 and min/2?]

Perhaps we can set some ground rules from which we can deduce an

answer, and if that goes against my preferences, sobeit. All the rules

are "defeasible"; preferences rather than rigid requirements.

{Definiteness}

Arithmetic operations are not Prolog predicates. They are supposed to

be pure functions. For given inputs, either there should be a common

error, or there should be a good approximation to a common exact result.

This means that we would like to have a common specific answer to

max(X, Y), even when X =:= Y.

HOWEVER, floating-point contagion is NOT sufficient to satisfy this

preference. According to IEEE rules, -0.0 =:= +0.0, so if we ask

for max(-0.0, +0.0), which do we get? The answers *can* be told

apart, and since the arguments are both floating-point numbers,

floating-point contagion does nothing to answer the question.

This means that if we do adopt floating-point contagion for max/2

and min/2, we *still* need another rule to fix this problem, and

standard term order (which I hope orders -0.0 before +0.0; that's

really the only sensible thing it can do) is the obvious candidate.

{Least Surprise}

When we have a choice of definitions, we should adopt that one which

is least likely to mislead programmers. If programmers expect

floating-point contagion, and don't get it, they will be confused.

Those statically typed compiled languages which allow mixed-mode

arithmetic (and a thousand blessings on Ada for *not* allowing it;

causes more trouble than it's worth) DO apply a floating-point

contagion rule. Typically this is done indirectly. For example,

java.lang.Math has

public static int max(int a, int b);

public static long max(long a, long b);

public static float max(float a, float b);

public static double max(double a, double b);

That looks as though mixed mode isn't allowed, but the overload

resolution rules for Java allow implicit conversions to take place,

and the effect of contagion is thus produced.

C gets it via a "balancing" rule for conditional expressions:

#define max(x, y) ((x) < (y) ? (y) : (x))

selects the original value, and then converts it to a common type.

So Fortran, Algol 68, PL/I, Pascal, C, C++, and Java programmers

will be surprised if floating-point contagion DOESN'T apply.

On the other hand, what is Prolog doing in that galley?

Prolog belongs to the class of interactive dynamic latently typed

languages, such as Smalltalk, Python, and Scheme.

5

The Revised Report on Scheme says about max and min:

If any argument is inexact, then the result will also be inexact

(unless the procedure can prove that the inaccuracy is not large

enough to affect the result, which is possible only in unusual

implementations). If min or max is used to compare numbers of

mixed exactness, and the numerical value of the result cannot be

represented as an inexact number without loss of accuracy, then

the procedure may report a violation of an implementation

restriction.

That looks to me uncommonly like the compromise from a disagreement

between implementors who wanted pure selection and implementors who

wanted contagion. I don't imagine that the result pleased anybody:

the contagion-lovers cannot rely on getting it, the contagion-haters

cannot rely on avoiding it, and when bignum meets flonum it might or

might not kill the program. Someone who has read the Scheme standard

attentively won't be surprised by *anything* we choose to do.

On the other hand, Python and Smalltalk do NOT apply floating-point

contagion. ANSI Smalltalk, section 5.6.1.6:

(receiver) max: operand

Answer the receiver if it is greater than operand.

Answer operand otherwise.

Section 5.6.1.7:

(receiver) min: operand

Answer the receiver if it is less than operand.

Answer operand otherwise.

Floating-point contagion is clearly forbidden to apply to max: and

min: in ANSI Smalltalk. If x and y are equal, this says that

(x max: y) and (x min: y) should both return y. That's what happens.

The Python documentation says

Python fully supports mixed arithmetic: when a binary arithmetic

operator has operands of different numeric types, the operand

with the "narrower" type is widened to that of the other, where

plain integer is narrower than long inger is narrower than

floating point is narrower than complex. Comparisons between

numbers of mixed type use the same rule.

However, max and min are neither binary arithmetic operators nor

comparisons. When x == y, max(x, y) and min(x, y) both return x.

CMU Common Lisp acts just like Python here: (max 1 1.0) ==> 1,

(max 1.0 1) ==> 1.

Another language closely related to Prolog (Prolog -> Concurrent

Prolog -> Strand 88 -> Erlang) is Erlang. Erlang doesn't have a

max(X,Y) or min(X,Y) function but uses lists:max(List) and

lists:min(List). How do they work? Just like Python: if there

are several elements equal to the extreme, the first of them is

returned. Here are the entirely unsurprising definitions:

min([H|T]) -> min(T, H).

min([H|T], Min) when H < Min -> min(T, H);

min([_|T], Min) -> min(T, Min);

min([], Min) -> Min.

max([H|T]) -> max(T, H).

max([H|T], Max) when H > Max -> max(T, H);

max([_|T], Max) -> max(T, Max);

max([], Max) -> Max.

The Prolog equivalents would be

min([H|T], Min) :- min(T, H, Min).

min([], Min, Min).

min([H|T], Min0, Min) :-

( H < Min0 -> Min1 = H ; Min1 = Min0 ),

min(T, Min1, Min).

max([H|T], Max) :- max(T, H, Max).

max([], Max, Max).

max([H|T], Max0, Max) :-

( H < Max0 -> Max1 = H ; Max1 = Max0 ),

max(T, Max1, Max).

All things considered, then, "least surprise" is an unreliable guide:

- if we are afraid of surprising C programmers, we should do floating

point contagion, whereas

- if we are afraid of surprising programmers used to Prolog-like

languages, we should NOT do floating point contagion.

{Completeness}

If a computation has a result which can be represented, it should

deliver that result.

This is what Scheme violates.

(let ((X (+ (expt 2 55) 1))

(Y (expt 2.0 55)))

(max X Y))

The larger of the two numbers is obviously X. It cannot be converted

to an (IEEE double) inexact number without loss of accuracy. So although

there *is* an answer, and the system *knows* the answer, and the system

*could* represent the answer, it is none-the-less allowed to sneer at

you and say "WON'T".

What's really going on here is that the rule of floating-point contagion

goes back to Fortran, where the assumption was that integers and floats

were the same size. In C89, converting a 'long' (typically 32 bits) to

a 'double' (typically 64 bits) is completely safe on all the familiar

platforms. So nobody realises that floating-point contagion doesn't

really make sense in a world where integers may not only have higher

precision that floats but even wider range.

Someone, I forget who, has already pointed out that this can cause

problems in SICStus.

It really is a shocking surprise if X is a representable number and

Y is a representable number and max(X, Y) is not; we can understand

it for + and * which can deliver bigger outputs than inputs, but that

is the one thing max/2 cannot do. So I think we have to say that

loss of Completeness also counts as violating Least Surprise.

{Consistency}

Life gets simpler for everyone thinking about code if the operations

satisfy appropriate axioms, especially when there already ARE well

known axioms for the operations. In particular, maximum and minimum

are very well known lattice operations which are associative,

commutative, and idempotent. I shall examine max/2 only; the results

carry over mutatis mutandis to min/2.

{max is Idempotent}

For all numbers X : X is max(X, X).

{max is Commutative}

For all numbers X, Y : if A is max(X, Y) and B is max(Y, X)

then A == B.

To my mind this is the strongest argument for floating point

contagion, because it does ensure this. It is not true in

Erlang, Python, Smalltalk, or CMU Common Lisp, but then in

none of those languages is 'max' a binary function. But

for Prolog, it is clear that a rule "return the first argument

when there is a tie" or "return the second argument when there

is a tie" won't do.

However, there is a better rule which preserves the "pure selection"

reading of max/2 and IS commutative:

max(X, Y) = if X > Y then X else

if X < Y then Y else

if X is more precise than Y then X else Y

where (exact) integers and rational numbers are maximally

precise, and the precision of floating-point numbers is

determined by the number of digits in the significand.

{max is Associative}

For all numbers X, Y, Z : if A is max(max(X, Y), Z) and

B is max(X, max(Y, Z)) then A == B.

I _believe_ that this does not hold in the presence of

floating-point contagion, but I just thought of this argument

and have to leave in a few minutes. A proof that I am wrong

would be an excellent defence of floating-point contagion here.

{max is Consistent with Ordering}

Unfortunately, the ISO Prolog standard gets mixed mode

comparison disastrously wrong. One likely reason for this

is that the right definition requires acting AS IF floats

were converted to rationals, and ISO Prolog omits rationals.

The other likely reason is simple failure to think it through.

Common Lisp very nearly applied floating-point contagion to

comparison. I was one of the people who pointed out that this

meant that Common Lisp's < would not be transitive, and indeed,

ISO Prolog's =:= IS NOT TRANSITIVE. That is, it is not only

*possible*, it is *easy* to find numbers X Y Z such that

X =:= Y and Y =:= Z and X < Z.

This is not the stuff out of which reliable programs can be made,

and it is all due to blindly transferring the Fortran/C model

of floating-point contagion inappropriately. (And yes, this does

mean that you can find three numbers X Y Z such that

X .EQ. Y .AND. Y .EQ. Z .AND X .LT. Z in Fortran and

X == Y && Y == Z && X < Z in C. Just because they got it wrong

is no reason why we should have.)

I was going to explain how applying floating-point contagion to

max breaks the combination of max and numeric comparison, but

with numeric comparison so badly broken already, it hardly makes

any difference, does it?

For another example of how ISO Prolog numeric comparison is broken,

consider this. It is possible to find Prolog numbers X Y such that

X < Y,

X-1 < X,

but

X-1 < Y

raises an exception. Is there any possibility at all of fixing

that? It is difficult to imagine anyone relying on Prolog getting

numeric comparison *wrong*, after all.

Joachim asked:

I suppose by "select" you mean that the result must be _identical_

to one of the inputs, rather than only _numerically_equal_?

Actually, in the presence of floating-point contagion, it is possible

that the result is *NOT* numerically equal to one of the inputs, or it

would be if Prolog's numeric comparisons weren't broken.

One reason is IEEE arithmetic, as discussed above, where numeric equality

simply won't do as a criterion even when both arguments are floats.

Why is this desirable, when it leads to the problems indicated?

What problems?

After all, this is an arithmetic function,

That is what we call "Begging the question".

*IS* max/2 just another "arithmetic function" in the sense required?

To my mind, it clearly isn't. The result is one of the arguments,

not a *combination* of information from the separate arguments,

UNLESS you decide to MAKE it so.

so there is no reason

why it should not use the same floating-point contagion rule

like all other symmetric binary operations,

What does symmetry have to do with it? (-)/2 is not symmetric, but

does apply contagion. I say that max/2 is so different from the arithmetic

operations (and it clearly ISN'T an arithmetic function in Python or

Smalltalk or Erlang) that there is no apparent reason why it SHOULD,

and good reasons why it should not.

In typed programming languages like Fortran and C, floating-point

contagion is applied to max for the sole reason that the result must

have SOME definite compile-time type. That reason does not apply to

Prolog.

On the other hand, if you want select semantics, you have to decide

what to return if the arguments are numerically equal. I don't

see any reason why standard term order should play any role here,

Neither do I.

or why the arguments order should be relevant (as seems to be the

case in an old SWI 5.6.9 I have here).

In fact I have just given an argument why it should NOT be relevant.

In the case of equality, the result should be the more precise of the

arguments, and they can only be equally precise if they are the same

term. The ordering required has to be a little bit more sensitive than

numeric < because -0.0 must be distinguished from +0.0.

_______________________________________________

prolog-standard mailing list

prolog-standard_at_neve.di.ubi.pt

http://neve.di.ubi.pt/mailman/listinfo/prolog-standard

Received on Mon Jul 14 2008 - 13:01:02 EST

*
This archive was generated by hypermail 2.2.0
: Wed Sep 08 2010 - 23:28:17 EST
*