6.7 Writing Efficient Code

Even with a declarative language, there are certain constructs which can be compiled more efficiently than others. It is, however, not recommended to write unreadable code with the aim of achieving faster execution: intuition is often wrong about which particular construct will execute more efficiently in the end. The advice is therefore:

Try the simple and straightforward solution first!

This will keep the code maintainable, and it will often be as fast as, or only marginally slower than, elaborate tricks. The second rule is to keep the original program even if you try to optimise it: you may find out that the optimisation was not worth the effort. ECLiPSe provides some support for finding those program parts that are worth optimising.

To achieve the maximum speed of your programs, choose the following compiler options:

debug:off;
opt_level:1 (the default);
expand:on (the default).
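As a minimal sketch, the options listed above can be passed to compile/2 as an option list. The helper name compile_fast/1 and the file name in the sample query are assumptions made for illustration only.

```prolog
% Hypothetical helper: compile a source file with the speed-oriented options.
% Only debug:off differs from the defaults named in the text; the other two
% options are spelled out to make the intent explicit.
compile_fast(File) :-
    compile(File, [debug:off, opt_level:1, expand:on]).

% Sample query ("my_program.ecl" is an assumed file name):
% ?- compile_fast("my_program.ecl").
```

Remember that code compiled with debug:off cannot be traced, so this setting suits finished code better than code under development.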

Some programs spend a lot of time in garbage collection, collecting the stacks and/or the dictionary. If the space is known to be deallocated anyway, e.g., on failure, such programs can often be sped up considerably by switching the garbage collector off or by increasing the gc_interval flag. As the global stack expands automatically, this does not cause a stack overflow, but it may of course exhaust the machine's memory.
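The following sketch shows one way to apply this advice with the gc and gc_interval flags. The helper without_gc/1 is a hypothetical name, not a library predicate, and the interval value is an arbitrary illustration.

```prolog
% Hypothetical helper: run Goal once with the garbage collector switched off,
% then restore the previous setting of the gc flag.
% Limitation of this sketch: if Goal raises an exception, the flag is not
% restored.
without_gc(Goal) :-
    get_flag(gc, Old),
    set_flag(gc, off),
    ( call(Goal) -> Succeeded = true ; Succeeded = false ),
    set_flag(gc, Old),
    Succeeded == true.

% Alternative: keep the collector on, but let it run less often.
% The interval is given in bytes; 64 MB is an arbitrary illustrative value.
% ?- set_flag(gc_interval, 67108864).
```

Restricting the change to a single goal keeps the rest of the program under the normal collection regime, which limits the risk of exhausting memory.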

When the program is working and its speed is still not satisfactory, use the profiling tools. The profiler can tell you which predicates are the most expensive ones, and the statistics tool tells you why. A program may spend its time in a predicate because the predicate itself is very time consuming, or because it is executed frequently. The port profiling tool gives you this information. It can also tell whether the predicate was slow because it created a choice point or because there was too much backtracking due to bad indexing.
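A minimal sketch of a profiling session follows, assuming that the timing profiler in lib(profile) is available on your platform. The predicates nrev/2 and run/0 are a made-up workload, chosen only because naive reversal spends a predictable share of its time in append/3.

```prolog
:- lib(lists).      % for append/3
:- lib(profile).    % timing profiler, assumed to be available

% Hypothetical workload: naive list reversal, quadratic in the list length.
nrev([], []).
nrev([H|T], R) :-
    nrev(T, RT),
    append(RT, [H], R).

% Build the list 1..500 and reverse it.
run :-
    ( for(I, 1, 500), foreach(I, L) do true ),
    nrev(L, _).

% Sample query: run the workload under the profiler.
% ?- profile(run).
```

The profiler reports how the run time is distributed over the predicates, which indicates where optimisation effort is likely to pay off.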

One of the most important points is the selection of the clause that matches the current call. If there is only one clause that can potentially match, the compiler is expected to recognise this and generate code that will directly execute the right clause instead of trying several subsequent clauses until the matching one is found. Unlike most current Prolog compilers, the ECLiPSe compiler tries to base this selection (indexing) on the most suitable argument of the predicate.¹ It is therefore not necessary to reorder the predicate arguments so that the first one is the crucial argument for indexing. For example, in a predicate like

    p(a, a) :- a.
    p(b, a) :- b.
    p(a, b) :- c.
    p(d, b) :- d.
    p(b, c) :- e.

calls where the first argument is instantiated, like p(d,Y), will be indexed on the first argument, while calls where the second argument is instantiated, like p(X,b), will be indexed on the second.

However, the decision is still based on only one argument at a time: a call like p(d,b) will be indexed on the first argument only (not because it is the first, but because it is more discriminating than the second). If it is crucial that such a procedure is executed as fast as possible with such a calling pattern, it can help to define an auxiliary procedure which will be indexed on the other argument:

    p(X, a) :- pa(X).
    p(X, b) :- pb(X).
    p(b, c) :- e.

    pa(a) :- a.
    pa(b) :- b.

    pb(a) :- c.
    pb(d) :- d.
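The transformation must not change the set of solutions. The sketch below checks this on a self-contained variant of the example: the undefined body goals are replaced by a hypothetical third argument that names the clause (r1 to r5), and the predicate names p1/3, p2/3 and same_solutions/0 are likewise assumptions made for illustration.

```prolog
% Original formulation: one flat predicate.
p1(a, a, r1).
p1(b, a, r2).
p1(a, b, r3).
p1(d, b, r4).
p1(b, c, r5).

% Split formulation: p2/3 dispatches on the second argument,
% the auxiliary predicates dispatch on the first.
p2(X, a, R) :- pa(X, R).
p2(X, b, R) :- pb(X, R).
p2(b, c, r5).

pa(a, r1).
pa(b, r2).

pb(a, r3).
pb(d, r4).

% Succeeds if both formulations have exactly the same set of solutions.
same_solutions :-
    findall(X-Y-R, p1(X, Y, R), L1),
    findall(X-Y-R, p2(X, Y, R), L2),
    sort(L1, S),
    sort(L2, S).
```

A check of this kind is cheap to keep next to the original program, in line with the advice to retain the straightforward version when optimising.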

The compiler also tries to use for indexing all type-testing information that appears at the beginning of the clause body (or beginning of a disjunction):

Type testing predicates, i.e., free/1, var/1, meta/1, atom/1, integer/1, rational/1, float/1, breal/1, real/1, number/1, string/1, atomic/1, compound/1, nonvar/1 and nonground/1.
Explicit unification and value testing =/2, ==/2, \==/2 and \=/2.
Combinations of tests with ,/2, ;/2, not/1, ->/2.
A cut after the type tests.

If the compiler can decide on the clause selection at compile time, the type tests are never executed and thus incur no overhead. When the clauses are not disjoint because of the type tests, either a cut can be added after the test or more tests can be added to the other clauses. For example, the following procedure will be recognised as deterministic and all tests are optimised away:

    % a procedure without cuts
    p(X) :- var(X), ...
    p(X) :- (atom(X); integer(X)), X \= [], ...
    p(X) :- nonvar(X), X = [_|_], ...
    p(X) :- nonvar(X), X = [], ...

Another example:

    % A procedure with cuts
    p(X{_}) ?- !, ...
    p(X) :- var(X), !, ...
    p(X) :- integer(X), ...
    p(X) :- real(X), ...
    p([H|T]) :- ...
    p([]) :- ...
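To make the first procedure above (the one without cuts) concrete, here is a runnable sketch that follows the same pattern of mutually exclusive type tests. The predicate name kind/2 and the result atoms are assumptions made for illustration.

```prolog
% Classify a term using disjoint type tests and no cuts.
% In ECLiPSe [] is an atom, so the second clause has to exclude it explicitly.
kind(X, K) :- var(X), K = variable.
kind(X, K) :- (atom(X) ; integer(X)), X \= [], K = atomic_value.
kind(X, K) :- nonvar(X), X = [_|_], K = nonempty_list.
kind(X, K) :- nonvar(X), X = [], K = empty_list.
```

Because the tests are mutually exclusive, each call has at most one solution, which is what allows the clause to be selected by indexing instead of by trying the clauses in turn.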

Here are some more hints for efficient coding with ECLiPSe:

Arguments which are repeated in the clause head and in the first regular goal in the body do not require any data moving and thus they do not cost anything. For example,
```
p(X, Y, Z, T, U) :- q(X, Y, Z, T, U).
```
is just as cheap as
```
p :- q.
```
On the other hand, switching arguments requires data moves and so
```
p(A, B, C) :- q(B, C, A).
```
is somewhat more expensive.
When accessing an argument of a structure whose functor is known, unification and arg/3 are both similarly efficient, so the question of whether to write Struct = emp(_, X, _) or arg(2, Struct, X) is just a matter of taste and style.
We recommend that the structure notation (see section 5.1) be used, as it improves readability without adding any overhead. So, for example, use Struct = emp{salary:X} or arg(salary of emp, Struct, X).
Tests are generally rather slow unless they can be compiled away (see indexing).
Waking is more expensive (due to the priority mechanism) than metacalling, which in turn is more expensive than compiled calls. Metacalls, however, do not carry as heavy a penalty as in some other Prolog systems.
Sorting using sort/2 is very efficient and it does not use much space. Using setof/3, findall/3 etc. is also efficient enough to be used every time a list of all solutions is needed.
=/2 and ==/2 are faster than =:=/2.
:/2 is optimised away by the compiler if both arguments are known.
Starting from ECLiPSe 6.0, there is no performance difference between using multiple clauses or using disjunction or if-then-else cascades. In fact, the compiler normalises multiple clause predicates into a single-clause representation with inline disjunctions. Disjunctions are indexed.
Conditionals (i.e., …->…;…) are compiled more efficiently if the condition is an indexable built-in test.
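Two of the hints above can be sketched together: accessing a structure field by name, and writing a conditional cascade whose conditions are indexable type tests. The structure declaration, its field names, and the predicate names are all assumptions made for illustration.

```prolog
% Hypothetical structure declaration; salary is the second field,
% matching the emp(_, X, _) pattern used in the text.
:- local struct(emp(name, salary, dept)).

% Two equivalent ways of reading the salary field by name.
salary_by_unification(Emp, Salary) :-
    Emp = emp{salary:Salary}.

salary_by_arg(Emp, Salary) :-
    arg(salary of emp, Emp, Salary).

% A conditional cascade in which every condition is a type test.
describe(X, Kind) :-
    ( var(X)     -> Kind = unbound
    ; integer(X) -> Kind = integer
    ; atom(X)    -> Kind = atom
    ; Kind = other
    ).
```

With named fields, adding or reordering fields in the declaration does not require touching the accessors, which is the readability benefit the hint refers to.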

1: The standard approach is to index only on the first argument.