[eclipse-clp-users] UTF8 support for String

From: Edgaonkar, Shrirang <Shrirang.Edgaonkar_at_nttdata.com>
Date: Fri, 3 Jul 2015 03:44:44 +0000
Dear CLP users,



   The following predicate returns the Length variable as 12  since the unicode character length is counted as 3 instead of 1. Since there are 3 characters it gets 9 plus 3 Ascii characters equals 12.

string_length("ABC$B%?!<%G(B", Length),

Whereas the following clauses would return N as 6 for the same string since it supports utf8.





string_list("ABC$B%?!<%G(B", List, utf8),

length(List, N),



I have written a list of predicates for string manipulation. They use the existing predicates from library Strings and Atoms like append_strings(?String1, ?String2, ?String3) etc. If I have to support utf8 such that string_length("ABC$B%?!<%G(B", Length, utf8), gives me 6, I have to write my own version for example:-





string_length(STR, Length, utf8) :-
    string_list(STR, List, utf8),
    length(List, Length).



This is just a prototype for illustration. Please let me know if my understanding is right. Replacing all the Strings and Atoms predicates with utf8-aware versions would be quite a task for me, given that they come from sepia_kernel.
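
To illustrate what such replacements might look like, here is a minimal sketch that follows the same pattern as the prototype: decode the string to a list of character codes with string_list/3, work on the list, and encode the result again. The names utf8_length/2 and utf8_prefix/3 are hypothetical and not part of any ECLiPSe library; only string_list/3, length/2 and append/3 are existing predicates. The sketch assumes the input strings hold valid UTF-8.

```prolog
:- lib(lists).   % for append/3

% utf8_length(+String, -Length)
% Hypothetical helper: counts characters rather than bytes by
% decoding the UTF-8 string into a list of character codes.
utf8_length(String, Length) :-
    string_list(String, Codes, utf8),
    length(Codes, Length).

% utf8_prefix(+String, +N, -Prefix)
% Hypothetical helper: Prefix is the string made of the first N
% characters of String. It fails if String has fewer than N characters.
utf8_prefix(String, N, Prefix) :-
    string_list(String, Codes, utf8),
    length(Front, N),                   % Front is a list of N fresh variables
    append(Front, _, Codes),            % bind Front to the first N codes
    string_list(Prefix, Front, utf8).   % encode the codes back to UTF-8
```

With these definitions, utf8_length/2 applied to the string from the example above would be expected to give 6, as with the string_list/length goals. Each wrapper decodes the whole string, so the cost is linear in the string length even for operations that are constant-time on bytes.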



Thanks and Regards,

Shrirang Edgaonkar



Received on Fri Jul 03 2015 - 04:04:22 CEST
