Recent Changes - Search:

PmWiki

pmwiki.org

edit SideBar

DCG

Definite Clause Grammars Harmonization Proposal

Last edit jschimpf? October 20, 2016, at 12:42 AM

We propose to de-facto-standardize Definite Clause Grammars in Prolog based on a reference implementation.

Abbreviations used in the following:

  • Systems/definitions: ISOD (ISO DCG draft), SP (SICStus Prolog 4.3), ECL (ECLiPSe 6.1), SWI (SWI Prolog 7.2).
  • People: JS (Joachim Schimpf), PM (Per Mildner), JW (Jan Wielemaker).

Suggested reference implementation (=specification):

Testing:

Other references:

Background: Discussion of Specific Issues


List as grammar head

SP and ISOD treat this as nonterminal mapping to ./4:

912: [t]-->b
 '.'(t,[],A,B):-b(A,B)

SWI makes an error.

SUGGESTED RESOLUTION: error.

COMMENT JS: I think the nonterminal interpretation is a mis-feature that exists by accident. In the context of DCGs a programmer expects a list to stand for a terminal sequence, and it is very unlikely that interpretation as a nonterminal is what the programmer wanted in the example. Even worse, when a pushback list is involved:

[a,b], [c,d] --> [e,f].

even the two lists in the head mean different things: [a,b] is a nonterminal mapping to ./4, while [c,d] is a terminal sequence (like [e,f] in the body).

COMMENT PM: Here SP treats ‘.’/2 as nothing special. I do not think we would object to changing SP to give an error. On the other hand, once we accept that —> is just a different syntax for ordinary clauses, it makes some sense that one would be allowed to define ‘.’/4 in this way, even though it can not be called using the DCG syntax ‘.’(…,…).


Partial lists in body

SWI allows partial lists:

109: p-->[a,b|T],q(T)
 p(A,B):-'$append'([a,b|T],D,A),q(T,D,B)

SP, ECL, ISOD don't support this.

SUGGESTED RESOLUTION: do not allow.

COMMENT JS: This strikes me as a mis-feature. The semantics isn't particularly intuitive, at least not to me. And it can be more clearly be expressed as

p--> [a,b], terminals(T), q(T).
terminals([]) --> [].
terminals([C|Cs]) --> [C], terminals(Cs).

Partial list as pushback

SWI also allows a partial list as pushback:

 p,[t|T]-->b(T)
 p(A,B):-b(T,A,C),'$append'([t|T],C,B)

SP, ECL, ISOD don't support this.

SUGGESTED RESOLUTION: do not allow

COMMENT JS: Not sure about this, would like to hear opinions. Same effect can be achieved with:

p --> b(T), pb([t|T]).
pb([]) --> [].
pb([C|Cs]), [C] --> pb(Cs).

Treatment of {} in body

SWI recognizes {} as equivalent to {true}, rather than as nonterminal mapping to {}/2. Not done in ISOD, SP, ECL, where it maps to {}/2.

SUGGESTED RESOLUTION: adopt SWI behaviour.

COMMENT JS: I think this is a good idea, corresponding to programmer intuition.

COMMENT PM: SICStus does not treat {}/0 specially in DCGs, and there is no built-in {}/2, so this will crash at runtime. I do not think {}/0 is traditional, but it certainly makes sense even though it may not be very useful.


Vertical bar syntax

Unfortunately, this is a bit of a mess.

  • C-Prolog tradition maps a|b to a;b, and this is still done by some systems (e.g. SP) under certain circumstances. In these systems the bar behaves syntactically like an xfy operator with precedence 1100.
  • other systems (e.g. ECL) have always had op(1100,xfy.'|')
  • some systems (e.g. SWI) have adopted op(1105,xfy,'|')

SUGGESTED RESOLUTION: Vertical bar must behave syntactically like an xfy infix operator of precedence between 1100 and 1109, we do not prescribe more than that. As a control construct on the right hand side of grammar rules, '|'(A,B) behaves exactly like ';'(A,B).

COMMENT JS: I have in version 0.1-0.2 treated (a->b|c) as NOT equivalent to (a->b;c). The intention was not to accidentally introduce an if-then-else where none was intended. But as it is desirable to accommodate systems that map a|b to a;b, treating them as equivalent is preferable.


Control constructs and meta-predicates

Background on which constructs are recognized by existing systems/definitions:

 _,_ _;_ ! _->_;__->__|_\+_once(_) _^_ if/3do/2
OKeefe 1988yesyes;nonono
QPyesyes;nonono
SPyesyes;yesyesyes
SWIyesyesyesyesnono
ECLyesyesyesnonoyes
ISOD 2015yesimpdefyesimpdefnono

Nobody seems to specially treat catch/3, findall/3, true/0.

It must be possible for implementations to support additional constructs, the question is just what minimum to require from a standard.

SUGGESTED RESOLUTION: Require only the traditionally undisputed control constructs: ,/2, ;/2, if-then-else, |/2, ->/2, !/0. Systems can implement more, but a grammar using anything else is not portable.

COMMENT JS: I was inclined to include \+/1, but Attach:okeefe20140605.txt indicates that this is too controversial.


Restrictions on the left-hand side

ExampleSPSWIISOD
3 --> ...undisputed error
(p,q) --> ...error because conflict with pushback notation
[] --> ...okerrorok
[a] --> ...okerrorok
{} --> ...okerrorok
{a} --> ...okerrorok
(p;q) --> ...okerrorok
! --> ...okerrorok
arg(X) --> ...not caught by dcg expansion, but leads to built-in redefinition error

SUGGESTED RESOLUTION: adopt the SWI behaviour.

COMMENT JS: I think it is reasonable to make an error for trying to define a nonterminal that cannot be used on a right-hand-side, as it is most likely a programmer error. Leaving the behaviour implementation-define is a bit pointless because it still means that grammars using such nonterminals are not portable.


Module qualification

Module qualification m:p as a normal Prolog goal has different semantics in different module systems, e.g.

  • m is lookup module for the definition of p AND calling context of p (Quintus-style systems such as SP and SWI)
  • m is only lookup module for p, does not affect calling context (ECL and ISO-modules with colon_sets_calling_context=false)
  • in ECL, a goal m:once(p) is usually nonsensical, since it would mean that an alternative definition of once/1 is looked up in module m.
 In particular, it is very different from once(m:p), i.e. the module must not be propagated inward.

If there is a single translation that works with both semantics, it should be preferred.

DCGConceivable TranslationQuintus etcECLiPSe etccompatible
m:pm:p()okok+
m:(p,q)m:(p(),q())okmeaningless+
m:(p,q)(m:p(),m:q())okwrong-
m:Xm:phrase(X,...)okwrong-
m:Xphrase(m:X,...)okok+

Module-qualifying a terminal-list makes no sense. This should either make an error, or the module silently ignored:

DCGConceivable TranslationQuintus etcECLiPSe etccompatible
m:[a]S0=[a|S]okok+
m:[a]errorokok+
m:[]S0=Sokok+
m:[]errorokok+

The meaning of module-qualifying the curly braces must be defined, it does not naturally follow from anything else. In an ECLiPSe-style system it is meaningless. In a Quintus-style system it is natural to propagate the module inward.

DCGConceivable TranslationQuintus etcECLiPSe etccompatible
m:{p}m:p()expectedunexpected-
m:{p}p()unexpectedpointless-
m:{p}errorokok+
m:{}m:trueokunexpected-
m:{}trueokpointless+
m:{}errorokok+

SUGGESTED RESOLUTION:

DCGPortable translation
m:pm:p()
m:(p,q)m:(p(),q())
m:Xphrase(m:X,...)
m:[a]error
m:[]error
m:{p}error
m:{}error

COMMENT PM: (...) it would be nice to mention modules in a way that does not conflict with ISO modules but also not with real module systems.

Edit - History - Print - Recent Changes - Search
Page last modified on August 10, 2018, at 04:41 PM