Re: [eclipse-clp-users] Question about parsing

From: Joachim Schimpf <jschimpf_at_users.sf.net>
Date: Wed, 07 Jan 2009 00:32:33 +1100
Huy Pham wrote:
> Hello,
> 
> I am writing a Prolog parser in ECLiPSe Prolog to parse a mini language.
>
> The difficulty I am having is that because Prolog expects term-like
> expressions, my parser cannot parse non-term-like expressions.

There are two main alternatives:

1. Reuse the built-in ECLiPSe parser.  This means using one of the builtins
of the read/2 family, and configuring the syntax setting such that it accepts
your new language.  The ECLiPSe parser is extremely configurable (on a
per-module basis) via

- operator declarations, including binary prefix operators, see
   http://eclipse-clp.org/doc/bips/kernel/syntax/op-3.html

- syntax options, see
   http://eclipse-clp.org/doc/userman/umsroot144.html

- character class settings, see
   http://eclipse-clp.org/doc/bips/kernel/syntax/set_chtab-2.html

Obviously this approach works best when your target language is already
quite close to Prolog. The disadvantage is that the resulting parser will
probably not be very strict, i.e. it will also accept some illegal syntax,
and it will not give very helpful error messages.  Whether this matters
depends on your intended application.


2. Write a parser using grammar rules (DCGs), see
http://eclipse-clp.org/doc/userman/umsroot069.html
In this case, you would typically provide a tokenizer which turns your
input into a list of tokens (you can use the read_token/3 builtin to
help with that), and then write grammar rules to process this list.
This is a clean, flexible solution with no restrictions on the language
you can parse, and the grammar rules should nicely reflect the formal
definition of your language.
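
As a rough illustration of approach (2), here is a minimal sketch of a
tokenizer loop plus a few grammar rules for an if-statement. The predicate
names (tokens/2, statement//1, etc.), the AST shapes (if/2, cmp/3, call/1)
and the token format (plain atoms and numbers) are made up for this example,
and I am assuming that read_token/3 reports the class end_of_file at the end
of the input. Depending on your character class settings, read_token/3 may
deliver tokens that differ from the hand-written list used in the query at
the bottom.

```prolog
% Read all tokens from Stream into a list (hypothetical helper).
% Assumes read_token/3 yields Class == end_of_file at end of input.
tokens(Stream, Tokens) :-
    read_token(Stream, Token, Class),
    ( Class == end_of_file ->
        Tokens = []
    ;
        Tokens = [Token|Rest],
        tokens(Stream, Rest)
    ).

% Grammar for:   if ( Operand Op Operand ) { Statement ... }
statement(if(Cond, Body)) -->
    [if, '('], condition(Cond), [')'],
    ['{'], statements(Body), ['}'].
% A trivial second statement form:   name ;
statement(call(Name)) -->
    [Name], { atom(Name) }, [';'].

% Zero or more statements, collected into a list.
statements([S|Ss]) --> statement(S), statements(Ss).
statements([]) --> [].

condition(cmp(Op, L, R)) -->
    operand(L), comparison(Op), operand(R).

% Map comparison tokens to illustrative operator names.
comparison(lt) --> ['<'].
comparison(gt) --> ['>'].
comparison(eq) --> ['='].

% Operands are any atomic token (atom or number) in this sketch.
operand(X) --> [X], { atomic(X) }.

% Example query on a hand-written token list:
% ?- phrase(statement(S),
%           [if, '(', x, '>', 0, ')', '{', hello, ';', '}']).
% S = if(cmp(gt, x, 0), [call(hello)])
```

A real grammar would distinguish identifiers from punctuation and keywords
in the tokenizer, so that rules like call(Name) cannot pick up a keyword.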


I have used both alternatives to implement parsers for FlatZinc
(http://www.g12.csse.unimelb.edu.au/minizinc/specifications.html).
You can find approach (1) in the file flatzinc_syntax.ecl,
and approach (2) in the file flatzinc_parser.ecl in the lib_public
directory of your ECLiPSe installation, or at the source repository
http://eclipse-clp.cvs.sourceforge.net/viewvc/eclipse-clp/Eclipse/ZincInterface/


 > For example, the syntax for my language's IF
 > statement needs to be of the form:
 >
 >      if(Condition, ThenClause)
 >
 > Instead of
 >
 >      if (Condition) then ThenClause endif
 >
 > or better yet:
 >
 >      if (Condition) { ThenClause }

This particular case can be solved using a binary prefix
operator declaration for if/2:

?- op(1175, fxy, if).
Yes (0.00s cpu)

?- read(Term), display(Term).
if X>0 { writeln(hello) }.

if(>(X, 0), {}(writeln(hello)))


Such syntax settings should always be encapsulated in a separate module,
so they don't interfere with parsing your ECLiPSe code. See the flatzinc
example for how it is done.
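
A minimal sketch of such an encapsulation might look as follows. The module
name mini_syntax and the predicate read_statement/2 are hypothetical, and
the sketch assumes that read/2 uses the syntax (operators etc.) of the
module it is called from, so the local declaration only affects reads done
inside this module.

```prolog
:- module(mini_syntax).          % hypothetical module name
:- export read_statement/2.      % hypothetical interface predicate

% Local operator: visible only inside this module, so the
% syntax of other modules is left unchanged.
:- local op(1175, fxy, if).

% Read one term from Stream using this module's syntax settings.
read_statement(Stream, Term) :-
    read(Stream, Term).
```

Other modules would then import mini_syntax and call read_statement/2
instead of read/2 when reading input in the mini language.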


 >
 > I know it's a naive question, but thought I should still ask in case
 > someone has some experience and could
 > give me some hints.


Hope this helps,
Joachim
Received on Tue Jan 06 2009 - 14:32:42 CET
