Re: [eclipse-clp-users] Is referenced_record/2 related to accessing asserted clauses?

From: Thorsten Winterer <thorsten_winterer_at_...126...> Date: Fri, 23 Nov 2012 11:39:40 +0100 · This archive was generated by hypermail 2.3.0 : Wed Sep 25 2024 - 15:13:20 CEST

Hi Edwin,

using dynamic clauses is very inefficient when you have a large number 
of them, since they are not indexed, unlike compiled clauses.
I would consider using hash tables or stores instead. Compare these two 
queries:

?- N = 100000,
    (for(I, 1, N) do J is I mod 100, assert(t(I, J))),
    (for(I, 1, N), fromto(0, In, Out, Sum) do t(I, J), Out is In + J).
N = 100000
I = I
J = J
In = In
Out = Out
Sum = 4950000
Yes (198.75s cpu)

?- N = 100000,
    store_create(SH),
    (for(I, 1, N), param(SH) do J is I mod 100, store_set(SH, I, J)),
    (for(I, 1, N), fromto(0, In, Out, Sum), param(SH) do store_get(SH, 
I, J), Out is In + J).
N = 100000
SH = 'STORE'(16'2c5fcc50)
I = I
J = J
In = In
Out = Out
Sum = 4950000
Yes (0.03s cpu)

The asserts in the first case take about 0.1 seconds. The rest is spent 
on retrieving the clauses again.

You can use your edgeN as key for the hash table (or store).

Cheers,
Thorsten

Am 23.11.2012 09:31, schrieb Edwin Wirawan:
> Hi,
>
> I tried profiling a run of my code and the result showed that the 
> majority of the time (~80%) is spent in referenced_record/2.
> My code asserts a lot of dynamic clauses at runtime and refers to 
> these dynamic clauses regularly.
> Does referenced_record/2 spend most of its runtime accessing asserted 
> clauses, or doing something else?
>
> FYI, each asserted clause is of the form: edge(edgeN, ...more 
> content...), where edgeN is unique for each asserted edge clause 
> (e.g., edge1, edge2,edge3,...).
> ~200K such clauses are asserted during a typical run.
>
> A slightly related question: I also called statistics/1. There were 
> 200K+ dictionary_entries, dict_hash_usage showed that the dictionary 
> was full (8192/8192) and the dict_has_collisions had a value of 
> (196801/8192).
> My guess is that the majority of 200K+ dictionary entries correspond 
> to our asserted 'edge' clauses, which leads me to wonder if the large 
> number of dictionary collisions means that the dictionary is using the 
> functor 'edge' or some other conflicted value as a key instead of my 
> unique edge ID values. Does that seem right? Would the dictionary hash 
> function have any difficulty distinguishing IDs that share a common 
> substring such as 'edge', and thus cause the seemingly high number of 
> collisions?
> Finally, assuming the time spent in referenced_record/2 is spent on 
> accessing asserted clauses, could that long time be somehow related to 
> the large number of collisions in the dictionary? Could it be that the 
> same keys are used for hashing both in the dictionary and the database 
> of asserted clauses? This would mean that the large number of 
> collisions in the dictionary would imply that there is also a large 
> number of collisions in the database of asserted clauses, thus slowing 
> down access to asserted clauses and resulting in referenced_record/2 
> taking a large chunk of the total time spent.
>
> I am also wondering whether there is a way to inspect the structure 
> and content of the dictionary and the database of asserted clauses, 
> and also whether there's a memory profiler for ECLiPSe.
>
> Thanks,
> Edwin
>
>
> ------------------------------------------------------------------------------
> Monitor your physical, virtual and cloud infrastructure from a single
> web console. Get in-depth insight into apps, servers, databases, vmware,
> SAP, cloud infrastructure, etc. Download 30-day Free Trial.
> Pricing starts from $795 for 25 servers or applications!
> http://p.sf.net/sfu/zoho_dev2dev_nov
>
>
> _______________________________________________
> ECLiPSe-CLP-Users mailing list
> ECLiPSe-CLP-Users_at_lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/eclipse-clp-users