COST Action 15140: ImAppNIO - Working Group 3: Benchmarks - Open Discussion Wiki

Aims

Working Group 3 of the COST Action 15140: ImAppNIO is tasked with the development of practice-driven, theory-affine benchmarks. Bringing together the perspectives that respectively dominate the efforts of WG1 and WG2, its task is to develop useful benchmarks for nature-inspired search and optimisation heuristics, with a strong focus on discrete search spaces and discrete optimisation problems, making sure that the developed benchmarks are relevant from a practical perspective and accessible from a theoretical perspective.

It will pursue the same goals for discrete nature-inspired search and optimisation heuristics as the COmparing Continuous Optimisers (COCO) initiative does for continuous optimisers. It will establish a workshop at the main international conferences in the field, such as the Genetic and Evolutionary Computation Conference (GECCO) and the International Conference on Parallel Problem Solving from Nature (PPSN), with a focus on competitions based on these workshops. This will serve as a test bed for the results of WG1, a source of inspiration for WG2, and an important and highly visible focal point for anyone interested in the application of nature-inspired search and optimisation heuristics. The benchmarks that are developed will be made publicly available as part of the COST Action's Internet website, and WG3 is responsible for providing and maintaining this.

Working Group 3 uses this Wiki to develop, discuss, and present contents in an open way.

PART A: Identifying the benchmark functions

Characteristics of the Sought Benchmark Problems

Following the spirit of COCO, the challenge is that the benchmark functions should capture the difficulties of combinatorial problems in practice, but at the same time be comprehensible enough that algorithm behaviours can be understood or interpreted according to their performance on a given benchmark problem.

Ideally (though not necessarily for all of them), we would like the benchmark functions:

1) to be scalable with the problem size;

2) to be non-trivial in the black-box optimisation sense (the function may be shifted such that the global optimum may be any point).

While the challenge may be significant, especially for classical combinatorial optimisation problems (not so much for toy problems), achieving this goal would help greatly in bridging the gap between theoreticians and experimentalists.
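As a minimal illustration of both properties (the choice of OneMax and all names here are purely for demonstration, not a proposal), even a toy function can be made scalable and non-trivial in the black-box sense by parameterising it with the dimension n and a hidden shift z that moves the optimum to an arbitrary bit string:

<code python>
import random

def shifted_onemax(x, z):
    """OneMax with its optimum XOR-shifted to a hidden bit string z.

    x, z: lists of 0/1 bits of equal length n.  The maximum value n is
    reached only at x == z, so a black-box solver cannot exploit the
    all-ones optimum of plain OneMax.
    """
    return sum(1 for xi, zi in zip(x, z) if xi == zi)

# Usage: the instance is scalable via n and non-trivial via a random shift z.
n = 100
z = [random.randint(0, 1) for _ in range(n)]
x = [random.randint(0, 1) for _ in range(n)]
print(shifted_onemax(x, z))  # value in 0..n, optimum n at x == z
</code>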

Suggestions by Thomas Weise

I think there can be several approaches.

1. One could use some of the hardest instances of several classical problems and try to somehow standardize them.

2. One could try to develop new real-world problems (such as the Travelling Thief Problem by Markus Wagner).

3. One could try to develop a problem that can represent different hard aspects (such as NK or NKp landscapes).

Concerning point 3, I have also done some work in this direction: characterising what features make optimisation problems hard and developing a model in which these features can be tuned separately. I would suggest including some settings of this function in the final test bed, as they are easy to understand, easy to implement, and fast to compute.
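For readers unfamiliar with point 3, the sketch below shows the classical NK landscape (with cyclic neighbourhoods for simplicity); it is only an illustration of how a tunable-difficulty benchmark can be parameterised, not the model referred to above.

<code python>
import random

def make_nk_instance(n, k, seed=0):
    """Random NK instance: each bit interacts with its k cyclic neighbours."""
    rng = random.Random(seed)
    neighbours = [[(i + j) % n for j in range(1, k + 1)] for i in range(n)]
    # one random lookup table per position, over the 2^(k+1) local bit patterns
    tables = [[rng.random() for _ in range(2 ** (k + 1))] for _ in range(n)]
    return neighbours, tables

def nk_fitness(x, neighbours, tables):
    """Average of the n local fitness contributions for bit string x."""
    total = 0.0
    for i in range(len(x)):
        bits = [x[i]] + [x[j] for j in neighbours[i]]
        index = int("".join(map(str, bits)), 2)
        total += tables[i][index]
    return total / len(x)

# Usage: difficulty is tuned via k (k = 0 is separable, larger k is more rugged).
neighbours, tables = make_nk_instance(n=20, k=3)
x = [random.randint(0, 1) for _ in range(20)]
print(nk_fitness(x, neighbours, tables))
</code>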

Suggestions by Markus Wagner

We in Adelaide have a bit of experience with interconnected problems, in particular with an academic problem called the Travelling Thief Problem (TTP). This is a combination of two well-known combinatorial problems: the Knapsack Problem (KP) and the Travelling Salesperson Problem (TSP). The two are combined (among other things) via a so-called rent rate R that needs to be paid for the time travelled; also, the speed reduces with increasing load, so packing maximally results in a great slowdown and thus a reduced overall objective score:

objective score = profit of items - R * travel time
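As a rough sketch of this objective (variable names are illustrative, capacity feasibility checks are omitted, and the official TTP instance format is not reproduced here), the computation could look like this:

<code python>
import math

def ttp_objective(tour, packing, coords, weights, profits,
                  capacity, R, v_min, v_max):
    """Illustrative TTP objective: item profit minus rent for travel time.

    tour     : list of city indices (the thief returns to tour[0] at the end)
    packing  : dict mapping a city to the item indices picked up there
    The speed drops linearly from v_max (empty knapsack) to v_min (full),
    so heavier packing increases travel time and hence the rent R * time.
    """
    profit = sum(profits[j] for city in tour for j in packing.get(city, ()))
    time, load = 0.0, 0.0
    for k in range(len(tour)):
        city, nxt = tour[k], tour[(k + 1) % len(tour)]
        load += sum(weights[j] for j in packing.get(city, ()))
        speed = v_max - (v_max - v_min) * load / capacity
        time += math.dist(coords[city], coords[nxt]) / speed
    return profit - R * time
</code>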

Advantages regarding scalability and adjustability:

1. By changing R, we can shift the influence from one component to the other component: R=0 leaves us with the KP and R=large leaves us with the TSP.

2. If the tour is fixed and one is left with the knapsack problem, then Neumann and Polyakovskiy called this “Packing while Travelling” (PWT), for which a DP and an FPTAS exist. Thus, the search space is just the KP space.

3. There are 9720 instances online, with 51 to 85k cities and with 50 to 850k items (https://cs.adelaide.edu.au/~optlog/research/combinatorial.php).

4. The way the instances (including R) are defined gives us at least one solution with objective score 0 (by considering the “optimal TSP tour” and the “maximum KP profit”). There are no guarantees beyond this.

However, the overall setting is most likely sitting at the “quite complex” end of the spectrum of functions that the WG3 might be after.

A recent article with an overview of relevant approaches: https://cs.adelaide.edu.au/~markus/pub/2017joh-ttp.pdf (Springer: http://link.springer.com/article/10.1007%2Fs10732-017-9328-y)

PART B: Evaluating Algorithm performance

Apart from defining an initial set of benchmark functions, appropriate ways to compare algorithm performance will have to be agreed upon.

Suggestions by Thomas Weise

Traditionally, one would let the algorithms run for a fixed budget (which may be a number of fitness function evaluations) and then look at the end result. It is well understood that this may not be the best method. Instead, several measurements should be collected to document the progress of each run. In our work on the TSP Suite, we built software along the lines of COCO, but for the TSP.
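As a sketch of what such progress measurements could look like (the algorithm and all names here are only placeholders, not the TSP Suite's actual interface), a run could record one (evaluations, best-so-far) pair at every improvement:

<code python>
import random

def run_one_plus_one_ea(fitness, n, budget):
    """Run a simple (1+1) EA on bit strings and record its progress.

    Instead of only the final result, the returned trajectory holds one
    (evaluations, best-so-far) pair per improvement, ready to be stored
    and compared across algorithms.
    """
    x = [random.randint(0, 1) for _ in range(n)]
    best = fitness(x)
    trajectory = [(1, best)]
    for evals in range(2, budget + 1):
        y = [1 - b if random.random() < 1.0 / n else b for b in x]  # bit-flip mutation
        fy = fitness(y)
        if fy >= best:
            if fy > best:
                trajectory.append((evals, fy))
            x, best = y, fy
    return trajectory

# Usage with the shifted OneMax sketch from Part A (assumed to be defined):
# trajectory = run_one_plus_one_ea(lambda x: shifted_onemax(x, z), n=100, budget=10000)
</code>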

A “complete” benchmark should also prescribe how the measurements are to be stored (e.g., CSV text files in a specific directory structure). If that can be done, then all researchers will immediately have comparable results. COCO/BBOB/TSP Suite do this implicitly by providing the experiment execution and evaluation as a single software package. As a result, we avoid ending up with essentially incomparable experimental results.
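A hypothetical layout (purely illustrative; not the format actually used by COCO, BBOB, or the TSP Suite) could be one CSV file per run, grouped by problem and algorithm:

<code python>
import csv
import os

def store_run(trajectory, root, problem, algorithm, run_id):
    """Write one run's (evaluations, best_fitness) trajectory to
    <root>/<problem>/<algorithm>/run_<run_id>.csv (hypothetical layout)."""
    folder = os.path.join(root, problem, algorithm)
    os.makedirs(folder, exist_ok=True)
    path = os.path.join(folder, "run_{}.csv".format(run_id))
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["evaluations", "best_fitness"])
        writer.writerows(trajectory)

# Usage: store_run(trajectory, "results", "shifted_onemax_n100", "one_plus_one_ea", 1)
</code>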

APPENDIX: Wiki material of previous WG3 Leaders

Call for Benchmark Problems - How to Contribute

In our initial phase we are seeking suggestions for benchmark problems. You are invited to contribute by copying this Template Benchmark Problem page and linking it in the index below. Furthermore, feel free to augment or comment on any ideas provided so far. Do not forget to subscribe to at least your own pages (under the menu item “manage subscriptions”) in order to receive email notifications whenever colleagues augment, annotate, or change something. We hope that a fruitful discussion will emerge!

List of Suggested Benchmark Problems

Further Suggestions, Issues, or Aspects to Discuss

Please create a new page for any other ideas, suggestions, issues etc. that you want to discuss here. Add new entries at the top!

* First Suggestion (to be replaced)

Contacts

WG3 Leader and Vice-Leaders: Pietro S. Oliveto, Borys Wrobel, Aleš Zamuda

Recent New Members (Since October 2017)

Jörg Lässig jlaessig@googlemail.com, Thomas Weise tweise@ustc.edu.cn.

~~DISCUSSION~~
