Strings in ECLiPSe 6.2, SWI-7 and YAP

Joachim Schimpf, 2013-11-28, 2013-12-07, 2013-12-26, 2014-07-11


ECLiPSe (and its predecessor Sepia) has always had a string data type (which was part of early BSI standard drafts) with double-quote syntax. SWI also had strings, but up to version 6 not with double-quote syntax. With SWI-7 and ECLiPSe 6.2, string support has been harmonized, and YAP is expected to agree as well. The following is a summary of the common functionality, and a record of the related discussion.

Agreed Common Functionality


  • strings are double-quoted by default

Term order

Strings fall between numbers and atoms:

?- sort([1,1.2,a,"a",X,f(a)], S).
S = [X, 1.2, 1, "a", a, f(a)]

Intuition: strings have a "more compound" flavour than numbers, but atoms and compounds must remain consecutive because atoms may be considered as compound terms with arity 0.

String-related builtins

string(?Term) is semidet

    succeeds iff Term is a string

string_length(+String, -Length) is det

    Length is the number of characters in String, where String is of type string.
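
For illustration, a toplevel sketch of the type test and length builtins. It assumes double-quoted text is read as a string (the agreed default), which in SWI-7 means the double_quotes flag has not been changed.

```prolog
?- string("abc").          % succeeds
?- string(abc).            % fails: abc is an atom
?- string_length("hello", N).
N = 5
```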

string_code(?Index, +String, ?Code) is nondet

    Index from 1 to length of String.
    Domain errors on Index and Code if negative.
    Character codes like ISO char_code/2.

get_string_code(+Index, +String, -Code) is det

    like string_code/3, but deterministic and
    strict 1..N domain checking on Index.
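
A small sketch of 1-based indexing with both predicates, shown only for in-range indices. How each system reports an out-of-range Index is not shown here.

```prolog
?- string_code(1, "abc", C).
C = 97
?- get_string_code(3, "abc", C).
C = 99
```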

string_char(?Index, +String, ?Char) is nondet

    analogous to string_code/3.

string_codes(?String, ?Codes)

    analogous to ISO atom_codes/2.

string_chars(?String, ?Chars)

    analogous to ISO atom_chars/2.
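
As a sketch, the two conversions used in each direction:

```prolog
?- string_codes("abc", Codes).
Codes = [97, 98, 99]
?- string_chars(S, [a, b, c]).
S = "abc"
```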

string_lower(+String, -Lower) is det
string_upper(+String, -Upper) is det

    Convert String to all lower or all upper case.
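
A minimal example with ASCII letters. Behaviour on non-ASCII characters may depend on the system and locale and is not covered by this sketch.

```prolog
?- string_upper("Hello", U).
U = "HELLO"
?- string_lower("Hello", L).
L = "hello"
```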

atom_string(+Atom, -String) is det
atom_string(-Atom, +String) is det

    where Atom is of type atom, and String of type string.
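
Both modes, sketched at the toplevel:

```prolog
?- atom_string(hello, S).
S = "hello"
?- atom_string(A, "hello").
A = hello
```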

number_string(+Number, -String) is det
number_string(-Number, +String) is semidet

    Conversion between any type of number and a string.
    Fails if String can't be parsed as a number. The number syntax does not allow
    for leading or trailing spaces, nor for spaces between sign and digits.
    Both + and - are allowed as signs.  Comments etc are not allowed.
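
The following sketch shows the conversion in both directions and the failure case for unparsable text. Inputs with embedded spaces are deliberately left out, since individual systems may be more lenient than the rules stated above.

```prolog
?- number_string(42, S).
S = "42"
?- number_string(N, "-3.5").
N = -3.5
?- number_string(N, "abc").    % fails: not a number
```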

string_concat(?String1, ?String2, ?String3) is nondet

    analogous to ISO atom_concat/3 and previous ECLiPSe append_strings/3.
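
A short sketch of the deterministic and nondeterministic uses. The second query enumerates every way of splitting a two-character string, and findall/3 is used only to collect the solutions.

```prolog
?- string_concat("abc", "def", S).
S = "abcdef"
?- findall(X-Y, string_concat(X, Y, "ab"), Splits), length(Splits, N).
N = 3          % ""-"ab", "a"-"b", "ab"-""
```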

sub_string(+String, ?Before, ?Length, ?After, ?Sub) is nondet

    analogous to ISO sub_atom/5, and identical to ECLiPSe substring/5.
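
As with sub_atom/5, Before and After count the characters preceding and following the substring. A sketch:

```prolog
?- sub_string("hello world", 6, 5, After, Sub).
After = 0
Sub = "world"
?- sub_string("hello world", Before, _, 0, "world").
Before = 6
```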

atomics_to_string(+Atomics, -String) is det

    Concatenates a list of atomic terms. Identical to previous ECLiPSe concat_string/2.

atomics_to_string(+Atomics, +Glue, -String) is det

    Concatenates a list of atomic terms, with Glue between them. Identical to previous ECLiPSe join_string/3.
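
A sketch of both arities as described above, mixing atoms, numbers and strings in the input list. Whether a given system release actually provides the three-argument form should be checked against its manual.

```prolog
?- atomics_to_string([abc, 1, "def"], S).
S = "abc1def"
?- atomics_to_string([a, b, c], "-", S).
S = "a-b-c"
```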

split_string(+String, +SepChars, +PadChars, -SubStrings) is det

    as in ECLiPSe
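
To make the roles of SepChars and PadChars concrete, here is a brief sketch: separators cut the string into fields, and padding characters are then stripped from both ends of each field.

```prolog
?- split_string("a.b.c", ".", "", L).
L = ["a", "b", "c"]
?- split_string("a, b ,c", ",", " ", L).
L = ["a", "b", "c"]
?- split_string("  a word ", "", " ", L).
L = ["a word"]
```

With an empty SepChars, the call only strips padding from the whole string, which gives a simple way to trim whitespace.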

term_string(+Term, -String) is det
term_string(-Term, +String) is det

    If String was uninstantiated, it is bound to a string representation of
    Term as produced by writeq/2.  If String was instantiated, it is parsed
    as with read/2 and the resulting term unified with Term.
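
A sketch of both directions. The examples avoid operators and variables, whose exact printed layout can differ between systems.

```prolog
?- term_string(foo(bar), S).
S = "foo(bar)"
?- term_string('hello world', S).
S = "'hello world'"        % quoted, as writeq would print it
?- term_string(T, "foo(bar)").
T = foo(bar)
```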

term_string(+Term, -String, +Options) is det
term_string(-Term, +String, +Options) is det

    If String was uninstantiated, it is bound to a string representation of
    Term as produced by write_term/3 (with options corresponding to writeq/2,
    and in addition, and potentially overridden by, the given options).
    If String was instantiated, it is parsed as with read_term/3 with the given options,
    and the resulting term unified with Term.  Inapplicable options are ignored.

text_to_string(+Text, -String) is det

    Converts different textual representations into a string.
    Text is either an atom, string, list of character codes (codes), or
    list of single-character atoms (chars).  Text==[] gives String="".
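
Each accepted representation, sketched in turn. All four queries are expected to give the same string.

```prolog
?- text_to_string(abc, S).
S = "abc"
?- text_to_string("abc", S).
S = "abc"
?- text_to_string([97, 98, 99], S).
S = "abc"
?- text_to_string([a, b, c], S).
S = "abc"
```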

read_string(+Stream, +Length, -String)
read_string(+Stream, -Length, -String)

    If Length is given, read Length characters from Stream into String.
    Otherwise, read until end of stream, and bind Length to the number
    of characters read.

read_string(+Stream, +SepChars, +PadChars, -Sep, -String)

    Read a string from Stream, providing functionality similar to split_string/4.
    The predicate performs the following steps:
     * Skip all characters that match PadChars
     * Read up to a character that matches SepChars or end of file
     * Discard trailing characters that match PadChars from the collected input
     * Unify String with a string created from the input and Sep with the
       separator character read. If input was terminated by the end of the
       input, Sep is unified with -1.
    The predicate read_string/5 called repeatedly on an input until Sep is -1
    (end of file) is equivalent to reading the entire file into a string and
    calling split_string/4, provided that SepChars and PadChars are not
    partially overlapping (which would require lookahead and could cause
    unexpected blocking read).
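
The repeated-call pattern described above can be sketched as follows. The predicate name read_fields/2 is hypothetical and not part of either system. It uses "," as separator and " " as padding, and collects fields until Sep is -1.

```prolog
% read_fields(+Stream, -Fields): hypothetical helper, for illustration only.
% Collects comma-separated, space-padded fields from Stream until end of input.
read_fields(Stream, Fields) :-
    read_string(Stream, ",", " ", Sep, Field),
    (   Sep == -1                      % input ended: this was the last field
    ->  Fields = [Field]
    ;   Fields = [Field|Rest],
        read_fields(Stream, Rest)
    ).
```

On a stream containing `a, b ,c` this should yield the same list as split_string("a, b ,c", ",", " ", L). How such a stream is opened differs between systems and is left out of the sketch.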

Note regarding mode notation: where mode '-' is specified, mode '+' is also allowed and affects the determinism class accordingly.

Situation before December 2013

ECL Syntax

  • strings are double-quoted, but back-quoted when using iso-syntax (or, more precisely, when the character classes are set accordingly)
  • ECL supports string token concatenation, i.e. "a" "b"=="ab"
  • by default in ECL also "a""b"=="ab" rather than "a""b"=="a\"b" (negotiable)

Builtins previously in both ECL and SWI


    Term is a string

atom_string(?Atom, ?String)

    but SWI allows numbers as Atom in (+,-) mode, and numbers as String in (-,+) mode

string_length(+String, -Length)

    but SWI allows atoms and numbers as String

string_code(+String, +Index, ?Code) ECL
string_code(?Index, +String, ?Code) SWI

    This is very unfortunate!
    Different argument order, 0-based in SWI vs. 1-based Index in ECL,
    and nondeterministic reverse mode in SWI...
    In ECL, this is supposed to be a very fast primitive (like arg/3),
    it could even be implemented as an abstract machine instruction.
    What about renaming the nondet version string_member/3 or the like?
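
    As a rough sketch of that suggestion, a nondeterministic enumerator could be
    layered on top of a fast deterministic primitive. The name string_member/3 is
    only the tentative one floated above, and the sketch assumes that between/3,
    string_length/2 and a deterministic 1-based get_string_code/3 are available.

```prolog
% string_member(?Index, +String, ?Code): tentative name, illustration only.
% Enumerates the 1-based positions of String and the code at each position.
string_member(Index, String, Code) :-
    string_length(String, Length),
    between(1, Length, Index),
    get_string_code(Index, String, Code).
```

    Because between/3 also accepts a bound Index, the same definition serves as
    a semidet lookup that simply fails for out-of-range positions.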

append_strings(?String1, ?String2, ?String3) ECL
string_concat(?String1, ?String2, ?String3) SWI

    Name is historical in ECL, could add alias.

substring(+String, ?Before, ?Length, ?After, ?Sub) ECL
sub_string(+String, ?Before, ?Length, ?After, ?Sub) SWI

    ECL ready to add underscore variant, in analogy to sub_atom/5.
    However, Quintus precedent is without underscore (and different
    argument order...)

number_string(?Number, ?String)

    Conversion between any number and a string.
    Fails if String can't be parsed as a number.

Builtins previously in SWI only

string_codes(?String, ?Codes)

    ECL could add this, but subsumed by string_list/3.

string_chars(?String, ?Chars)

    ECL could add this, but subsumed by string_list/3.

Builtins previously in ECL only (ignoring deprecated ones)

concat_strings(+String1, +String2, ?String3)

    Deterministic version of concatenation.

concat_string(++List, -Dest) [redundant]

    Succeeds if Dest is the concatenation of the atomic terms
    contained in List. 

join_string(++List, +Glue, -String)

    String is the string formed by concatenating the elements of List with
    an instance of Glue between each of them (subsumes concat_string/2).

split_string(+String, +SepChars, +PadChars, -SubStrings)

    Decompose String into SubStrings according to separators SepChars
    and padding characters PadChars. 

string_list(?String, ?List, +Format)

    Conversion between string in different encodings and a list
    (subsumes string_codes, string_chars, string_list).
    Format is bytes, codes, chars, utf8.

string_list(?String, ?List) [redundant]

    same as string_list(String,List,bytes)

substring(+String1, +String2, ?Position) [redundant]

    Quick semidet check for substring presence. 1-based position.

term_string(?Term, ?String)

    In the (+,-) direction, String is like the output of writeq.
    In the (?,+) direction, String is parsed with read.

sprintf(-String, +Format, ?ArgList)

    printf with output to string.

Other Suggestions

Further inspiration can be found in Quintus's lib(string), in particular the span family (but cf. split_string/4 above).

Suggestions by Richard O'Keefe

The following is based on as of 2013-12-27.


    with Quintus argument order, enabling omission of arguments
    for substring/2,3,4, and no underscore.


    like substring, but Sub is of type codes.


    like substring, but Sub is of type chars.

integer_string(Integer,String,Base,Zero) and /2,3

    taking an integer Base 2..36 and a character to be treated as zero.
    This allows alternative isomorphic 0-9 Unicode sequences to be used.

float_codes(Float,String,Format,Zero,Decimal,Exponent) and /3,4,5

    taking a format descriptor term, Zero character, Decimal point character,
    exponent marker character (characters all as string).

number_string(Number,String,Zero) and /2

    combination of the preceding two.


    well, string_concat/3 was chosen because of ISO.


    as above


    determines maximal leading sequence of characters out of the Set,
    followed by maximal sequence of characters in the Set.


    same from other end.

Tentative suggestion for


    skip characters in Set


    reads all available characters in Set up to a limit of Bound.
    Also reads any diacriticals that may follow the characters.
    Bound and Base are counting base characters only.
Page last modified on July 11, 2014, at 02:27 PM