Re: [prolog-standard] Re: log10/1, log/2, min/2, and max/2

From: Richard A. O'Keefe <ok_at_cs.otago.ac.nz>
Date: Fri, 17 Nov 2006 15:29:15 +1300 (NZDT)

Joachim Schimpf <js10_at_crosscoreop.com> asks
    [Why NOT apply floating-point contagion to max/2 and min/2?]

Perhaps we can set some ground rules from which we can deduce an
answer, and if that goes against my preferences, sobeit. All the rules
are "defeasible"; preferences rather than rigid requirements.

{Definiteness}
    Arithmetic operations are not Prolog predicates. They are supposed to
    be pure functions. For given inputs, either there should be a common
    error, or there should be a good approximation to a common exact result.

    This means that we would like to have a common specific answer to
    max(X, Y), even when X =:= Y.

    HOWEVER, floating-point contagion is NOT sufficient to satisfy this
    preference. According to IEEE rules, -0.0 =:= +0.0, so if we ask
    for max(-0.0, +0.0), which do we get? The answers *can* be told
    apart, and since the arguments are both floating-point numbers,
    floating-point contagion does nothing to answer the question.

    This means that if we do adopt floating-point contagion for max/2
    and min/2, we *still* need another rule to fix this problem, and
    standard term order (which I hope orders -0.0 before +0.0; that's
    really the only sensible thing it can do) is the obvious candidate.

{Least Surprise}
    When we have a choice of definitions, we should adopt that one which
    is least likely to mislead programmers. If programmers expect
    floating-point contagion, and don't get it, they will be confused.

    Those statically typed compiled languages which allow mixed-mode
    arithmetic (and a thousand blessings on Ada for *not* allowing it;
    causes more trouble than it's worth) DO apply a floating-point
    contagion rule. Typically this is done indirectly. For example,
    java.lang.Math has
        public static int max(int a, int b);
        public static long max(long a, long b);
        public static float max(float a, float b);
        public static double max(double a, double b);
    That looks as though mixed mode isn't allowed, but the overload
    resolution rules for Java allow implicit conversions to take place,
    and the effect of contagion is thus produced.

    C gets it via a "balancing" rule for conditional expressions:
        #define max(x, y) ((x) < (y) ? (y) : (x))
    selects the original value, and then converts it to a common type.

    So Fortran, Algol 68, PL/I, Pascal, C, C++, and Java programmers
    will be surprised if floating-point contagion DOESN'T apply.

    On the other hand, what is Prolog doing in that galley?
    Prolog belongs to the class of interactive dynamic latently typed
    languages, such as Smalltalk, Python, and Scheme.
               5
    The Revised Report on Scheme says about max and min:
        If any argument is inexact, then the result will also be inexact
        (unless the procedure can prove that the inaccuracy is not large
        enough to affect the result, which is possible only in unusual
        implementations). If min or max is used to compare numbers of
        mixed exactness, and the numerical value of the result cannot be
        represented as an inexact number without loss of accuracy, then
        the procedure may report a violation of an implementation
        restriction.

    That looks to me uncommonly like the compromise from a disagreement
    between implementors who wanted pure selection and implementors who
    wanted contagion. I don't imagine that the result pleased anybody:
    the contagion-lovers cannot rely on getting it, the contagion-haters
    cannot rely on avoiding it, and when bignum meets flonum it might or
    might not kill the program. Someone who has read the Scheme standard
    attentively won't be surprised by *anything* we choose to do.

    On the other hand, Python and Smalltalk do NOT apply floating-point
    contagion. ANSI Smalltalk, section 5.6.1.6:
        (receiver) max: operand
        Answer the receiver if it is greater than operand.
        Answer operand otherwise.
    Section 5.6.1.7:
        (receiver) min: operand
        Answer the receiver if it is less than operand.
        Answer operand otherwise.

    Floating-point contagion is clearly forbidden to apply to max: and
    min: in ANSI Smalltalk. If x and y are equal, this says that
    (x max: y) and (x min: y) should both return y. That's what happens.

    The Python documentation says
        Python fully supports mixed arithmetic: when a binary arithmetic
        operator has operands of different numeric types, the operand
        with the "narrower" type is widened to that of the other, where
        plain integer is narrower than long inger is narrower than
        floating point is narrower than complex. Comparisons between
        numbers of mixed type use the same rule.

    However, max and min are neither binary arithmetic operators nor
    comparisons. When x == y, max(x, y) and min(x, y) both return x.

    CMU Common Lisp acts just like Python here: (max 1 1.0) ==> 1,
    (max 1.0 1) ==> 1.

    Another language closely related to Prolog (Prolog -> Concurrent
    Prolog -> Strand 88 -> Erlang) is Erlang. Erlang doesn't have a
    max(X,Y) or min(X,Y) function but uses lists:max(List) and
    lists:min(List). How do they work? Just like Python: if there
    are several elements equal to the extreme, the first of them is
    returned. Here are the entirely unsurprising definitions:

        min([H|T]) -> min(T, H).

        min([H|T], Min) when H < Min -> min(T, H);
        min([_|T], Min) -> min(T, Min);
        min([], Min) -> Min.

        max([H|T]) -> max(T, H).

        max([H|T], Max) when H > Max -> max(T, H);
        max([_|T], Max) -> max(T, Max);
        max([], Max) -> Max.

    The Prolog equivalents would be

        min([H|T], Min) :- min(T, H, Min).

        min([], Min, Min).
        min([H|T], Min0, Min) :-
            ( H < Min0 -> Min1 = H ; Min1 = Min0 ),
            min(T, Min1, Min).

        max([H|T], Max) :- max(T, H, Max).

        max([], Max, Max).
        max([H|T], Max0, Max) :-
            ( H < Max0 -> Max1 = H ; Max1 = Max0 ),
            max(T, Max1, Max).

    All things considered, then, "least surprise" is an unreliable guide:
    - if we are afraid of surprising C programmers, we should do floating
      point contagion, whereas
    - if we are afraid of surprising programmers used to Prolog-like
      languages, we should NOT do floating point contagion.

{Completeness}
    If a computation has a result which can be represented, it should
    deliver that result.

    This is what Scheme violates.

    (let ((X (+ (expt 2 55) 1))
          (Y (expt 2.0 55)))
      (max X Y))

    The larger of the two numbers is obviously X. It cannot be converted
    to an (IEEE double) inexact number without loss of accuracy. So although
    there *is* an answer, and the system *knows* the answer, and the system
    *could* represent the answer, it is none-the-less allowed to sneer at
    you and say "WON'T".

    What's really going on here is that the rule of floating-point contagion
    goes back to Fortran, where the assumption was that integers and floats
    were the same size. In C89, converting a 'long' (typically 32 bits) to
    a 'double' (typically 64 bits) is completely safe on all the familiar
    platforms. So nobody realises that floating-point contagion doesn't
    really make sense in a world where integers may not only have higher
    precision that floats but even wider range.

    Someone, I forget who, has already pointed out that this can cause
    problems in SICStus.

    It really is a shocking surprise if X is a representable number and
    Y is a representable number and max(X, Y) is not; we can understand
    it for + and * which can deliver bigger outputs than inputs, but that
    is the one thing max/2 cannot do. So I think we have to say that
    loss of Completeness also counts as violating Least Surprise.

{Consistency}

    Life gets simpler for everyone thinking about code if the operations
    satisfy appropriate axioms, especially when there already ARE well
    known axioms for the operations. In particular, maximum and minimum
    are very well known lattice operations which are associative,
    commutative, and idempotent. I shall examine max/2 only; the results
    carry over mutatis mutandis to min/2.

    {max is Idempotent}

        For all numbers X : X is max(X, X).

    {max is Commutative}

        For all numbers X, Y : if A is max(X, Y) and B is max(Y, X)
        then A == B.

        To my mind this is the strongest argument for floating point
        contagion, because it does ensure this. It is not true in
        Erlang, Python, Smalltalk, or CMU Common Lisp, but then in
        none of those languages is 'max' a binary function. But
        for Prolog, it is clear that a rule "return the first argument
        when there is a tie" or "return the second argument when there
        is a tie" won't do.

        However, there is a better rule which preserves the "pure selection"
        reading of max/2 and IS commutative:

        max(X, Y) = if X > Y then X else
                    if X < Y then Y else
                    if X is more precise than Y then X else Y

        where (exact) integers and rational numbers are maximally
        precise, and the precision of floating-point numbers is
        determined by the number of digits in the significand.

    {max is Associative}

        For all numbers X, Y, Z : if A is max(max(X, Y), Z) and
        B is max(X, max(Y, Z)) then A == B.

        I _believe_ that this does not hold in the presence of
        floating-point contagion, but I just thought of this argument
        and have to leave in a few minutes. A proof that I am wrong
        would be an excellent defence of floating-point contagion here.

    {max is Consistent with Ordering}

        Unfortunately, the ISO Prolog standard gets mixed mode
        comparison disastrously wrong. One likely reason for this
        is that the right definition requires acting AS IF floats
        were converted to rationals, and ISO Prolog omits rationals.
        The other likely reason is simple failure to think it through.
        Common Lisp very nearly applied floating-point contagion to
        comparison. I was one of the people who pointed out that this
        meant that Common Lisp's < would not be transitive, and indeed,
        ISO Prolog's =:= IS NOT TRANSITIVE. That is, it is not only
        *possible*, it is *easy* to find numbers X Y Z such that
            X =:= Y and Y =:= Z and X < Z.
        This is not the stuff out of which reliable programs can be made,
        and it is all due to blindly transferring the Fortran/C model
        of floating-point contagion inappropriately. (And yes, this does
        mean that you can find three numbers X Y Z such that
        X .EQ. Y .AND. Y .EQ. Z .AND X .LT. Z in Fortran and
        X == Y && Y == Z && X < Z in C. Just because they got it wrong
        is no reason why we should have.)

        I was going to explain how applying floating-point contagion to
        max breaks the combination of max and numeric comparison, but
        with numeric comparison so badly broken already, it hardly makes
        any difference, does it?

        For another example of how ISO Prolog numeric comparison is broken,
        consider this. It is possible to find Prolog numbers X Y such that
            X < Y,
            X-1 < X,
        but
            X-1 < Y
        raises an exception. Is there any possibility at all of fixing
        that? It is difficult to imagine anyone relying on Prolog getting
        numeric comparison *wrong*, after all.

Joachim asked:

        I suppose by "select" you mean that the result must be _identical_
        to one of the inputs, rather than only _numerically_equal_?
        
Actually, in the presence of floating-point contagion, it is possible
that the result is *NOT* numerically equal to one of the inputs, or it
would be if Prolog's numeric comparisons weren't broken.

One reason is IEEE arithmetic, as discussed above, where numeric equality
simply won't do as a criterion even when both arguments are floats.

        Why is this desirable, when it leads to the problems indicated?

What problems?

        After all, this is an arithmetic function,

That is what we call "Begging the question".
*IS* max/2 just another "arithmetic function" in the sense required?
To my mind, it clearly isn't. The result is one of the arguments,
not a *combination* of information from the separate arguments,
UNLESS you decide to MAKE it so.

        so there is no reason
        why it should not use the same floating-point contagion rule
        like all other symmetric binary operations,

What does symmetry have to do with it? (-)/2 is not symmetric, but
does apply contagion. I say that max/2 is so different from the arithmetic
operations (and it clearly ISN'T an arithmetic function in Python or
Smalltalk or Erlang) that there is no apparent reason why it SHOULD,
and good reasons why it should not.

In typed programming languages like Fortran and C, floating-point
contagion is applied to max for the sole reason that the result must
have SOME definite compile-time type. That reason does not apply to
Prolog.

        On the other hand, if you want select semantics, you have to decide
        what to return if the arguments are numerically equal. I don't
        see any reason why standard term order should play any role here,

Neither do I.

        or why the arguments order should be relevant (as seems to be the
        case in an old SWI 5.6.9 I have here).
        
In fact I have just given an argument why it should NOT be relevant.

In the case of equality, the result should be the more precise of the
arguments, and they can only be equally precise if they are the same
term. The ordering required has to be a little bit more sensitive than
numeric < because -0.0 must be distinguished from +0.0.
        
_______________________________________________
prolog-standard mailing list
prolog-standard_at_neve.di.ubi.pt
http://neve.di.ubi.pt/mailman/listinfo/prolog-standard
Received on Mon Jul 14 2008 - 13:01:02 EST

This archive was generated by hypermail 2.2.0 : Wed Sep 08 2010 - 23:28:17 EST