
A.2  Notation

The following notation is used in the syntax specification in this chapter:

A.2.1  Character Classes

The following character classes exist:

Character Class   Notation Used   Default Members

upper_case        UC              all upper case letters
underline         UL              _
lower_case        LC              all lower case letters
digit             N               digits
blank_space       BS              space, tab and nonprintable ASCII characters
end_of_line       NL              line feed
atom_quote        AQ              '
string_quote      SQ              "
list_quote        LQ              `
chars_quote       CQ
radix             RA
ascii             AS
solo              SL              ! ;
special           DS              ( [ { ) ] } , |
line_comment      CM              %
escape            ES              \
first_comment     CM1             /
second_comment    CM2             *
symbol            SY              # + - . : < = > ? @ ^ ~ $ &
terminator        TS

The character class of any character can be modified by a chtab-declaration.
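The table above can be read as a lookup from characters to class names. The following is a minimal Python sketch of that lookup, not ECLiPSe code: the names `DEFAULT_CLASSES` and `char_class` are hypothetical, and the `overrides` argument only imitates the effect of a chtab-declaration.

```python
# Hypothetical model of the default character-class table (illustration only).
DEFAULT_CLASSES = {
    "_": "underline",
    "\n": "end_of_line",
    "'": "atom_quote",
    '"': "string_quote",
    "`": "list_quote",
    "!": "solo",
    ";": "solo",
    "%": "line_comment",
    "\\": "escape",
    "/": "first_comment",
    "*": "second_comment",
}
DEFAULT_CLASSES.update({ch: "special" for ch in "([{)]},|"})
DEFAULT_CLASSES.update({ch: "symbol" for ch in "#+-.:<=>?@^~$&"})


def char_class(ch, overrides=None):
    """Return the class name of a single character, or None if unclassified.

    `overrides` is an assumed stand-in for chtab-declarations: a dict that
    maps characters to class names and takes precedence over the defaults.
    """
    if overrides and ch in overrides:
        return overrides[ch]
    if ch in DEFAULT_CLASSES:
        return DEFAULT_CLASSES[ch]
    if ch.isupper():
        return "upper_case"
    if ch.islower():
        return "lower_case"
    if ch.isdigit():
        return "digit"
    # Space, tab and the remaining nonprintable ASCII characters.
    if ch in " \t" or (ord(ch) < 128 and not ch.isprintable()):
        return "blank_space"
    return None
```

For example, `char_class('"')` gives `"string_quote"`, while `char_class('"', {'"': "list_quote"})` models a session in which the double quote has been reassigned to list_quote.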

A.2.2  Groups of characters

 
Group Type       Notation   Valid Characters

alphanumerical   ALP        UC UL LC N
non escape       NES        any character except escape
sign             SGN        + -

A.2.3  Valid Tokens

Terms are defined in terms of tokens, and tokens are defined in terms of characters and character classes. Individual tokens can be read with the predicates read_token/2 and read_token/3. The description of the valid tokens follows.

Atoms

ATOM    = (LC ALP*)
        | (SY | CM1 | CM2 | ES)+
        | (AQ (NES | ESCSEQ)* AQ)
        | SL
        | []
        | {}
        | |

If the syntax option doubled_quote_is_quote is enabled, two immediately consecutive AQ characters may occur inside an AQ-quoted sequence, and will be interpreted as a single occurrence of the quote within the name. If the syntax option bar_is_no_atom is active, the vertical bar cannot be used as an atom, unless quoted.
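As an illustration, the ATOM production can be approximated by a regular expression over the default character classes. This is a Python sketch under stated assumptions, not the ECLiPSe lexer: it covers ASCII letters only, looks at a candidate token in isolation, does not validate escape sequences inside quotes, and ignores the doubled_quote_is_quote and bar_is_no_atom options. The names `ATOM_RE` and `is_atom` are hypothetical.

```python
import re

# One alternative per line of the ATOM production, default classes assumed.
ATOM_RE = re.compile(
    r"""
      [a-z][A-Za-z0-9_]*          # LC ALP*
    | [\#+\-.:<=>?@^~$&/*\\]+     # (SY | CM1 | CM2 | ES)+
    | '(?:[^'\\]|\\.)*'           # AQ (NES | ESCSEQ)* AQ, escapes not validated
    | [!;]                        # SL
    | \[\] | \{\} | \|            # [], {} and the vertical bar
    """,
    re.VERBOSE | re.DOTALL,
)


def is_atom(text):
    """True if the whole of `text` has the shape of an ATOM token."""
    return ATOM_RE.fullmatch(text) is not None
```

Under these assumptions, `is_atom("foo_Bar1")`, `is_atom("=..")` and `is_atom("'hello world'")` hold, while `is_atom("Foo")` and `is_atom("_x")` do not, because those start with UC and UL respectively.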

Numbers

  1. integers
    INT = [SGN] N+
    
  2. based integers
    INTBAS = [SGN] N+ (AQ | RA) (N | LC | UC)+
    
    The base must be an integer between 1 and 36 inclusive, and the digits of the value must be valid for that base.

    If the syntax option iso_base_prefix is active, the syntax for based integers is instead

    INTBAS = [SGN] 0 (b | o | x) (N | LC | UC)+
    

    which allows binary, octal and hexadecimal numbers respectively.

  3. character codes
    INTCHAR = [SGN] (0 (AQ|RA)|AS) CHARCONST
    
    For all plain characters, CHARCONST is just that character, and the value of the integer is the character code of that character. For special characters, see the detailed definition of CHARCONST under "Character Constants" later in section A.2.3.
  4. rationals
    RAT = [SGN] N+ UL N+
    
  5. floats
    FLOAT = [SGN] N+ . N+ [ (e | E) [SGN] N+ | Inf | NaN ]
          | [SGN] N+        (e | E) [SGN] N+
    
    If the syntax option float_needs_point is active, then only the first alternative (with a decimal point) is valid syntax.
  6. bounded reals
    BREAL = FLOAT UL UL FLOAT
    
    where the first float must be less than or equal to the second.

If the syntax option blanks_after_sign is active, then blank space (BS*) is allowed between the sign and the following digits.
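The number productions above can be sketched as regular expressions, together with a small evaluator for based integers. This is a hedged Python illustration, not the actual reader: it assumes the default atom_quote `'` as the base separator, treats digits above 9 case-insensitively (an assumption), leaves out character codes (item 3) and the options iso_base_prefix, float_needs_point and blanks_after_sign, and does not check the ordering of bounded reals. The names `NUMBER_PATTERNS`, `classify_number` and `based_value` are hypothetical.

```python
import re

SGN = r"[+-]?"
FLOAT = SGN + r"(?:\d+\.\d+(?:[eE][+-]?\d+|Inf|NaN)?|\d+[eE][+-]?\d+)"

NUMBER_PATTERNS = {
    "INT": SGN + r"\d+",
    # A nonzero base is required here so that 0'c character codes are excluded.
    "INTBAS": SGN + r"0*[1-9]\d*'[0-9A-Za-z]+",
    "RAT": SGN + r"\d+_\d+",
    "FLOAT": FLOAT,
    "BREAL": FLOAT + "__" + FLOAT,
}


def classify_number(token):
    """Return the name of the production whose shape `token` has, else None."""
    for name, pattern in NUMBER_PATTERNS.items():
        if re.fullmatch(pattern, token):
            return name
    return None


def based_value(token):
    """Evaluate a based integer such as 16'ff, checking base and digits."""
    match = re.fullmatch(r"([+-]?)(\d+)'([0-9A-Za-z]+)", token)
    if not match:
        raise ValueError("not a based integer: %r" % token)
    sign, base_text, digits = match.groups()
    base = int(base_text)
    if not 1 <= base <= 36:
        raise ValueError("base must be between 1 and 36")
    value = 0
    for ch in digits.lower():
        digit = int(ch, 36)          # '0'..'9' -> 0..9, 'a'..'z' -> 10..35
        if digit >= base:
            raise ValueError("digit %r is not valid in base %d" % (ch, base))
        value = value * base + digit
    return -value if sign == "-" else value
```

With these definitions, `classify_number("0.9__1.1")` yields `"BREAL"` and `based_value("16'ff")` yields 255, while `based_value("2'102")` raises `ValueError` because 2 is not a binary digit.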

Strings

STRING = SQ (NES | ESCSEQ | SQ BS* SQ)* SQ

Text enclosed in SQ (string_quote) characters is parsed as a constant of type string. By default, the double quote " is the SQ character.

By default, consecutive strings are concatenated into a single string literal. This behaviour can be disabled by the syntax option no_string_concatenation. If the strings are consecutive without intervening blank space, the syntax option doubled_quote_is_quote causes the doubled quotes to be interpreted as a single occurrence of the quote within the string.
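The interaction between concatenation and doubled quotes can be sketched as a small scanner. This is a Python illustration under stated assumptions, not the ECLiPSe reader: the double quote is taken as SQ, only space and tab are treated as BS, escape sequences are copied through undecoded, and the function name `string_value` and its keyword arguments are hypothetical stand-ins for the syntax options.

```python
def string_value(token, doubled_quote_is_quote=False,
                 no_string_concatenation=False):
    """Return the text of a STRING token, e.g. '"abc" "def"' -> 'abcdef'."""
    if len(token) < 2 or token[0] != '"':
        raise ValueError("not a string token")
    out = []
    i = 1
    while i < len(token):
        ch = token[i]
        if ch == "\\":
            # Copy the escape sequence through without decoding it.
            if i + 1 >= len(token):
                raise ValueError("dangling escape")
            out.append(token[i:i + 2])
            i += 2
        elif ch == '"':
            if i == len(token) - 1:
                return "".join(out)           # final closing quote
            j = i + 1
            while j < len(token) and token[j] in " \t":
                j += 1                        # skip BS* between the quotes
            if j >= len(token) or token[j] != '"':
                raise ValueError("unexpected text after closing quote")
            if j == i + 1 and doubled_quote_is_quote:
                out.append('"')               # SQ SQ stands for one quote
            elif no_string_concatenation:
                raise ValueError("consecutive strings are not concatenated")
            i = j + 1                         # continue inside the next string
        else:
            out.append(ch)
            i += 1
    raise ValueError("unterminated string")
```

Under these assumptions, `string_value('"abc" "def"')` gives `abcdef`; `string_value('"ab""cd"')` gives `abcd` by default and `ab"cd` when `doubled_quote_is_quote=True`.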

Lists of numeric character codes

LIST = LQ (NES | ESCSEQ | LQ BS* LQ)* LQ

Text enclosed in LQ (list_quote) characters is parsed as a list of numeric character codes. For example, if the double quote " is defined as list_quote, then "abc" is parsed as [97,98,99].

Concatenation and doubled quotes are handled as for SQ-quoted strings.

Lists of single-character atoms

LIST = CQ (NES | ESCSEQ | CQ BS* CQ)* CQ

Text enclosed in CQ (chars_quote) characters is parsed as a list of single-character atoms. For example, if the double quote " is defined as chars_quote, then "abc" is parsed as ['a','b','c'].

Concatenation and doubled quotes are handled as for SQ-quoted strings.
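The difference between the two list forms can be shown with a short Python sketch. It only models the resulting data, with Python integers standing in for character codes and one-character Python strings standing in for single-character atoms; the function names are hypothetical.

```python
def as_code_list(text):
    """Model of LQ-quoted text: a list of numeric character codes."""
    return [ord(ch) for ch in text]


def as_char_list(text):
    """Model of CQ-quoted text: a list of single-character atoms."""
    return list(text)
```

Here `as_code_list("abc")` gives `[97, 98, 99]` and `as_char_list("abc")` gives `['a', 'b', 'c']`, matching the two examples in the text.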

Variables

VAR = (UC | UL) ALP*
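A minimal Python sketch of the VAR production, assuming the default character classes and ASCII letters only (the names `VAR_RE` and `is_variable` are hypothetical):

```python
import re

# (UC | UL) ALP*, with ALP = UC UL LC N
VAR_RE = re.compile(r"[A-Z_][A-Za-z0-9_]*")


def is_variable(text):
    """True if the whole of `text` has the shape of a VAR token."""
    return VAR_RE.fullmatch(text) is not None
```

So `X`, `_` and `_foo1` have the shape of variables, while `foo` starts with LC and is therefore an atom.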

End of clause

EOCL = . (BS | NL | <end of file>) | TS | <end of file>

If the syntax option eof_is_no_fullstop is active, then end-of-file alone does not act as EOCL.
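The EOCL rule can be sketched as a position test on the input text. This is a Python illustration only: space and tab stand in for BS, line feed for NL, the `terminators` argument is an assumed stand-in for characters assigned to the terminator class (which has no default members), and the function name is hypothetical.

```python
def is_end_of_clause(text, pos, eof_is_no_fullstop=False, terminators=""):
    """True if an EOCL token starts at index `pos` of `text`."""
    if pos >= len(text):
        # <end of file> alone ends the clause unless the option forbids it.
        return not eof_is_no_fullstop
    ch = text[pos]
    if ch in terminators:
        return True                          # TS
    if ch == ".":
        following = text[pos + 1:pos + 2]    # empty string at end of file
        return following == "" or following in " \t\n"
    return False
```

For instance, the dot in `"a. b"` ends a clause, whereas the dot in `"a.b"` does not, since it is followed by a lower case letter.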

Escape Sequences within Quotes

Within quoted constants (atoms, strings, character lists), the following escape sequences ESCSEQ may occur, and lead to the corresponding special character being inserted into the quoted item.

ESCSEQ =                  Result                                              Syntax option

ES a                      ASCII alert (7)
ES b                      ASCII backspace (8)
ES f                      ASCII form feed (12)
ES n                      ASCII newline (10)
ES r                      ASCII carriage return (13)
ES t                      ASCII tabulation (9)
ES v                      ASCII vertical tab (11)
ES e                      ASCII escape (27)                                   not iso_restrictions
ES d                      ASCII delete (127)                                  not iso_restrictions
ES s                      ASCII space (32)                                    not iso_restrictions
ES (ES|AQ|SQ|LQ|CQ)       the ES, AQ, SQ, LQ or CQ character
ES NL                     ignored
ES c (BS|NL)*             ignored                                             not iso_restrictions
ES three octal digits     character with given octal character code           not iso_escapes
ES octal digits ES        character with given octal character code           iso_escapes
ES x hex digits ES        character with given hexadecimal character code

It is illegal for any other character to follow the ES. If the syntax option iso_escapes is active, the octal escape sequence can be of any length and must be terminated with an ES character. Some sequences are disabled by the iso_restrictions option.
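To make the table concrete, here is a Python sketch of a decoder for the default setting, that is with neither iso_escapes nor iso_restrictions active. It is an illustration rather than the actual reader: backslash is assumed as ES, the default quote characters are assumed, only space, tab and line feed are skipped after `\c`, and the name `decode_escapes` is hypothetical.

```python
import string

SIMPLE = {"a": 7, "b": 8, "f": 12, "n": 10, "r": 13, "t": 9, "v": 11,
          "e": 27, "d": 127, "s": 32}
QUOTABLE = "\\'\"`"          # ES, AQ, SQ and LQ under the default classes


def decode_escapes(body):
    """Decode the escape sequences in the body of a quoted item."""
    out = []
    i, n = 0, len(body)
    while i < n:
        ch = body[i]
        if ch != "\\":
            out.append(ch)
            i += 1
            continue
        i += 1                                   # move past ES
        if i >= n:
            raise ValueError("dangling escape")
        ch = body[i]
        if ch in SIMPLE:
            out.append(chr(SIMPLE[ch]))
            i += 1
        elif ch in QUOTABLE:
            out.append(ch)
            i += 1
        elif ch == "\n":
            i += 1                               # ES NL is ignored
        elif ch == "c":
            i += 1
            while i < n and body[i] in " \t\n":
                i += 1                           # ES c (BS|NL)* is ignored
        elif ch == "x":
            end = body.find("\\", i + 1)         # hex digits end with ES
            digits = body[i + 1:end] if end != -1 else ""
            if not digits or any(c not in string.hexdigits for c in digits):
                raise ValueError("malformed hexadecimal escape")
            out.append(chr(int(digits, 16)))
            i = end + 1
        elif ch in "01234567":
            digits = body[i:i + 3]               # exactly three octal digits
            if len(digits) != 3 or any(c not in "01234567" for c in digits):
                raise ValueError("malformed octal escape")
            out.append(chr(int(digits, 8)))
            i += 3
        else:
            raise ValueError("illegal character after escape: %r" % ch)
    return "".join(out)
```

Both `decode_escapes("\\101")` and `decode_escapes("\\x41\\")` give `A`, while `decode_escapes("\\q")` raises `ValueError`, mirroring the rule that no other character may follow ES.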

Character Constants

An integer character constant (see item 3, character codes, under Numbers) is by default introduced by the sequence 0' and followed by CHARCONST, which is defined as one of the following:

CHARCONST =                     Represents                                          Syntax option

(ALP|SL|DS|CM|CM1|CM2|SY|TS)    that character
<SPACE>                         ASCII space (32)
(SQ|LQ|CQ)                      the SQ, LQ or CQ character
ES (ES|AQ|SQ|LQ|CQ)             the ES, AQ, SQ, LQ or CQ character
ES a                            ASCII alert (7)
ES b                            ASCII backspace (8)
ES f                            ASCII form feed (12)
ES n                            ASCII newline (10)
ES r                            ASCII carriage return (13)
ES t                            ASCII tabulation (9)
ES v                            ASCII vertical tab (11)
AQ AQ                           the AQ character itself                             iso_escapes and doubled_quote_is_quote
ES octal digits ES              character with given octal character code           iso_escapes
ES x hex digits ES              character with given hexadecimal character code
<TAB>                           ASCII tabulation (9)                                not iso_escapes
NL                              ASCII newline (10)                                  not iso_escapes
AQ                              the AQ character itself                             not iso_escapes
ES                              the ES character itself                             not iso_escapes
ES e                            ASCII escape (27)                                   not iso_restrictions
ES d                            ASCII delete (127)                                  not iso_restrictions
ES s                            ASCII space (32)                                    not iso_restrictions

It is recommended to use only those sequences that are recognised universally, i.e. independent of syntax option settings. The other sequences are present for compatibility with various Prolog dialects. The syntax options iso_escapes and iso_restrictions disable several of those. The AQ AQ sequence is of dubious value – it is recommended to write 0'\' instead of 0'''.
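Following that recommendation, the sketch below evaluates only the character constants whose table rows carry no syntax option, so its results do not depend on option settings. It is a Python illustration, not ECLiPSe code: it assumes the default 0' prefix, backslash as ES and the default quote characters, and the name `char_code` is hypothetical.

```python
import re

UNIVERSAL_ESCAPES = {"a": 7, "b": 8, "f": 12, "n": 10, "r": 13, "t": 9, "v": 11}


def char_code(token):
    """Return the character code of a 0'CHARCONST token (portable forms only)."""
    if not token.startswith("0'"):
        raise ValueError("not a character constant")
    const = token[2:]
    # A single printable character other than ES or AQ stands for itself.
    if len(const) == 1 and const.isprintable() and const not in "\\'":
        return ord(const)
    if len(const) == 2 and const[0] == "\\":
        if const[1] in UNIVERSAL_ESCAPES:
            return UNIVERSAL_ESCAPES[const[1]]
        if const[1] in "\\'\"`":
            return ord(const[1])
    hex_match = re.fullmatch(r"\\x([0-9a-fA-F]+)\\", const)
    if hex_match:
        return int(hex_match.group(1), 16)
    raise ValueError("not a universally recognised character constant")
```

For example, `char_code("0'a")` is 97 and `char_code("0'\\'")` is 39, whereas `char_code("0'''")` is rejected because the AQ AQ form depends on syntax options.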

