Re: [eclipse-clp-users] Performance-related issues in heuristic optimization with ECLiPSe

From: kot23 <kot23_at_otenet.gr>
Date: Tue, 17 Jun 2008 22:55:06 +0300
I beleive that scheduling is an NP-complete problem, and for such problems it should be obvious from the beginning that complete search is out of the question: it would take ages. What you need to do is find which (incomplete) search stragery is better suited for your problem. 

I just finished my dissertation (CLP for timetabling) and spent many hours testing the various heuristics available from ic:search/6. So I can assure you that your points (1,2) are correct: variable selection and value ordering are both important.

About your third point: I think that the ordering of constraints is not really important, at least in most situations (i.e. when you have not tampered with the ic_kernel itself, priorities, etc). During the search procedure the constraint propagation takes care of selecting which constraints are applicable at each state. The order they were posted is not relevant: they all go into a constraint pool. Perhaps you are using redundant constraints that increase the time needed by the solver to process them.

By all means you should use search/6 and leave labeling/1 for simpler tasks. In your case you would probably be needing smaller start times, so indomain_min might be a good choice for value ordering, as it starts by giving each variable the smaller value from its domain.

You are also correct about the bb_min/3, you might lose some solutions in favor of finding one faster. 

>From my own experience, adding a lot of precedence constraints really affects the time required for good solutions. It is a delicate balance between speed and quality of solution. I don't know if there are any rules of thumb here, but what I understood for sure is that the modeling of the problem is also very important. Try to keep the values of the variables quantitive: i.e. smaller (or larger) values would mean something to the problem. In this way you could use value ordering heuristics like indomain_min (or _max) and enjoy a really fast solution. 

For variable selection, I found "most_constrained" and "first_fail" to be the most efficient. They label in such a way that the search tree does not get very wide. This results in reaching a solution faster (fewer backtracks).

For value ordering, it really depends on the problem. indomain_min would be suitable in your case I beleive, since smaller values mean smaller start times, and better costs.

Finally for the search strategy: If your heuristics are suitable and efficient, good solutions will be found near the left-hand of the search tree. In this case, i would avoid credit-based search because its searching moves quickly towards the middle and right-hand side of the tree (i.e. away from the good solutions). bbs stays to the left-hand side, but who knows how many backtracks would be required for a good solution? So, what i used myself is a combination of dbs and lds: search is limited to the left-side (where the good solutions are found) and many alternatives are considered while searching for better solutions.

I might sound too general in my response but the topics you are concerned with are the major points of CLP! 
I am sure that other responses will follow, more detailed and elaborate, but I tried to give you some insight from my own experiences with ECLiPSe. 

Good Luck with it!
Christoforidis Nikos

On Tue, 17 Jun 2008 17:43:43 +0200
Soheil Samii <sohsa_at_ida.liu.se> wrote:

> Hi!
> 
> I have some general questions related to optimization with ECLiPSe (or CLP in 
> general). The problem that I try to solve with ECLiPSe is a task scheduling 
> problem for multiprocessor system with a cost function that is a sum of a 
> linear function in the optimization variables and the square-root of a 
> quadratic function.
> 
> The CLP code is basically organized as follows:
> * Declaring the domains of the variables (integer intervals).
> * Linear constraints (precedence constraints, etc.).
> * Resource constraints (using cumulative/4, which have three different 
> versions in the libraries ic_cumulative, ic_edge_finder, and 
> ic_edge_finder3).
> * Declaring the cost function 'Cost'.
> * Minimization using bb_min/3 in the library branch_and_bound, i.e., 
> bb_min(labeling(StartTimes), Cost, Options).
> 
> For a reasonably average-size problem (40 variables) it takes incredibly long 
> time to solve (after many hours it still hasn't found the optimum). I guess 
> that it is because I use complete search and because I use the labeling/1 
> predicate for the variable and value ordering. Is this true?
> 
> In my case, I want to use ECLiPSe as a heuristic to find a suboptimal 
> schedule, because the ECLiPSe solver will be used inside another optimization 
> loop. Therefore, I want the CLP solving process to be fast, and I have been 
> reading and thinking about different ways to use ECLiPSe as a heuristic. 
> These are the things that I have thought of:
> 
> 1. The variable ordering is important.
> 2. The value ordering is important (as we use _incomplete_ search).
> These two issues can be solved by using search/6 instead of labeling/1 above. 
> There you can choose different orderings and different types of incomplete 
> search.
> 
> 3. The order of the constraints is important.
> 
> 4. bb_min/3 has options that, for example, lets the branch-and-bound cut away 
> some solutions from the search tree. For example, there is an 
> option 'factor': if 'factor'=0.9 it means that when the search engine tries 
> to find a better solution than the current one, it tries to find a 10% better 
> solution; thus, the obtained solution is guaranteed to be at most 10% from 
> the optimum.
> 
> What are your experiences with heuristic optimization in ECLiPSe? Which of the 
> abovementioned points affect the runtime the most, and is any of them not 
> important for the runtime? Are there any other issues that I have forgot to 
> look at to obtain an efficient heuristic with ECLiPSe?
> 
> Regarding the cumulative/4 constraint: Is it better to use the version in 
> ic_cumulative or is it better to use the one in ic_edge_finder3? Which one is 
> best from the point of view of runtime?
> 
> Is it in general good for the performance to add constraints?
> 
> How much does the cost function affect the performance? I'm a little bit 
> worried that my cost function (which I described in the beginning of this 
> email) makes the runtime to grow, but I don't know how to check that.
> 
> Has anyone used ECLiPSe for heuristic optimization to obtain results within 
> fractions of a second, or is it totally unrealistic to achieve this?
> 
> Thankful for comments,
> 
> Regards,
> /Soheil
> 
> -------------------------------------------------------------------------
> Check out the new SourceForge.net Marketplace.
> It's the best place to buy or sell services for
> just about anything Open Source.
> http://sourceforge.net/services/buy/index.php
> _______________________________________________
> ECLiPSe-CLP-Users mailing list
> ECLiPSe-CLP-Users_at_lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/eclipse-clp-users
> 


-- 
kot23_at_otenet.gr
Received on Tue Jun 17 2008 - 12:54:47 CEST

This archive was generated by hypermail 2.2.0 : Thu Feb 02 2012 - 02:31:58 CET