# 最適化問題に対する超高速＆安定計算

## クラスタ計算機やスーパーコンピュータ上での大規模最適化問題やグラフ探索などの研究のお話が中心

### TSUBAME 2.0 と SDPA, SDPARA　その６

１：SDPA 7.4.0 + GotoBLAS2 1.09

◯ GotoBLAS2 1.09
Makefile.rule で変更する箇所
TARGET = NEHALEM
CC = icc
FC = ifort
USE_OPENMP = 1
NO_CBLAS = 1
INTERFACE64 = 1
NO_WARMUP = 1

make clean; make
でライブラリの作成を行う
libgoto2.a は ${SDPA_HOME}/lib にコピーする（以下を参照) ◯ SDPA 7.4.0 export SDPA_HOME=$(HOME)/sdpa7.intel
export INTEL_LIB=/usr/apps/isv/intel/11.1.072/mkl/lib/em64t
export INTEL_LIB2=/usr/apps/isv/intel/11.1.072/lib/intel64
export CC=icc
export CXX=icpc
export F77=ifort
export CFLAGS="-O2 -openmp -DDIMACS_PRINT"
export CXXFLAGS="-O2 -openmp -DDIMACS_PRINT"
export FFLAGS="-O2 -openmp -i8"
./configure --with-blas="-L${SDPA_HOME}/lib -lgoto2 -L${INTEL_LIB} -liomp5" --with-lapack="-L${SDPA_HOME}/lib -lgoto2 -L${INTEL_LIB} -lpthread -lgoto2 -lstdc++ -lifcore -limf -lifport"
make

これで実行ができる（はず）。

２： SDPA 7.4.0 + Intel MKL 11.1.072

export INTEL_LIB=/usr/apps/isv/intel/11.1.072/mkl/lib/em64t
export INTEL_LIB2=/usr/apps/isv/intel/11.1.072/lib/intel64
export CC=icc
export CXX=icpc
export F77=ifort
export CFLAGS="-O2 -openmp"
export CXXFLAGS="-O2 -openmp"
export FFLAGS="-O2 -i8 -openmp"
./configure --with-blas="-L${INTEL_LIB} -L${INTEL_LIB2} -lmkl_intel_ilp64 -lmkl_intel_thread -lmkl_core -liomp5" --with-lapack="-L${INTEL_LIB} -L${INTEL_LIB2} -lmkl_lapack95_ilp64 -liomp5 -lpthread"

これで実行ができる（はず）。
コメント

### SDPA-DDとGPUとTSUBAME　その３

TSUBAME 2.0 上での SDPA-DD の make 方法について

1: QD ライブラリの dd_real.h と qd_config.h の変更
dd_real.h に以下を追加
#define isfinite(x) __builtin_isfinite(x)

diff qd_config.h ~/src/sdpa-dd-7.1.8/external/i/QD/include/qd/qd_config.h
< #define QD_ISFINITE(x) __extension__ ({ __typeof (x) __x = (x); __builtin_expect (!std::isnan(__x - __x), 1); })
---
> #define QD_ISFINITE(x) std::isfinite(x)

2: ./configure
make
を行う

3: make の最後で -L/opt/cuda/3.2/lib64 を追加する。

g++ -g -I/home/usr1/11ITA100/src/sdpa-dd-7.1.8/external/spooles/work/internal -DUseMETIS=0 -I/home/usr1/11ITA100/src/sdpa-dd-7.1.8/external/i/QD/include -I./mpack -g -O2 -D_REENTRANT -O2 -funroll-all-loops -o sdpa_dd sdpa_dd-sdpa_chordal.o sdpa_dd-sdpa_dataset.o sdpa_dd-sdpa_io.o sdpa_dd-sdpa_jordan.o sdpa_dd-sdpa_linear.o sdpa_dd-sdpa_main.o sdpa_dd-sdpa_newton.o sdpa_dd-sdpa_parts.o sdpa_dd-sdpa_struct.o sdpa_dd-sdpa_tool.o cuda/Rgemm.o sdpa_dd-Rpotrf.o sdpa_dd-Rsyrk.o sdpa_dd-Rtrsm.o sdpa_dd-Rdot.o sdpa_dd-Raxpy.o sdpa_dd-Rtrmm.o sdpa_dd-Rtrsv.o sdpa_dd-iMlaenv.o sdpa_dd-Rlamch.o sdpa_dd-Rlascl.o sdpa_dd-Rsytrd.o sdpa_dd-Rsterf.o sdpa_dd-Rorgtr.o sdpa_dd-Rlatrd.o sdpa_dd-Rsyr2k.o sdpa_dd-Rsytd2.o sdpa_dd-Rlanst.o sdpa_dd-Rlae2.o sdpa_dd-Rlapy2.o sdpa_dd-Rlasrt.o sdpa_dd-Rorgql.o sdpa_dd-Rorgqr.o sdpa_dd-Rsymv.o sdpa_dd-Rlarfg.o sdpa_dd-Rsyr2.o sdpa_dd-Rlassq.o sdpa_dd-Rorg2l.o sdpa_dd-Rlarft.o sdpa_dd-Rlarfb.o sdpa_dd-Rorg2r.o sdpa_dd-Rnrm2.o sdpa_dd-Rlarf.o sdpa_dd-Rger.o sdpa_dd-Mxerbla.o sdpa_dd-Rpotf2.o sdpa_dd-Mlsame.o sdpa_dd-Rscal.o sdpa_dd-Rcopy.o sdpa_dd-Rgemv.o sdpa_dd-Rtrmv.o sdpa_dd-Rsteqr.o sdpa_dd-Rlaset.o sdpa_dd-Rlaev2.o sdpa_dd-Rlasr.o sdpa_dd-Rlartg.o sdpa_dd-Rswap.o sdpa_dd-Rsyev.o sdpa_dd-Rlansy.o sdpa_dd-Mutils.o -lpthread -L/home/usr1/11ITA100/src/sdpa-dd-7.1.8/external/i/SPOOLES/lib -lspooles -L/home/usr1/11ITA100/src/sdpa-dd-7.1.8/external/i/QD/lib -lqd -L/opt/cuda/3.2/lib64 -L/usr/local/cuda/lib64 -L/usr/local/cuda/lib -lcudart -lm -lpthread

4: 以下のように実行ができる（はず）。

./sdpa_dd ~/src/sdplib/theta2.dat-s out
SDPA-DD start at Mon Jan 30 01:22:07 2012
data is /home/usr1/11ITA100/src/sdplib/theta2.dat-s : sparse
parameter is ./param.sdpa
out is out

DENSE computations
0 1.0e+04 1.0e+00 1.0e+00 -0.00e+00 +1.00e+04 1.0e+00 8.7e-01 2.00e-01
1 1.8e+03 0.0e+00 1.3e-01 +1.42e+02 +5.90e+03 7.6e-01 7.6e-01 2.00e-01
2 5.6e+02 1.2e-34 3.1e-02 +1.82e+02 +5.95e+02 8.1e-01 8.1e-01 2.00e-01
3 1.4e+02 1.8e-34 5.7e-03 +2.39e+02 +5.96e+01 9.1e-01 9.1e-01 2.00e-01
4 1.9e+01 2.1e-34 4.9e-04 +3.19e+02 +6.19e+00 1.0e+00 1.0e+00 2.00e-01
5 3.4e+00 2.7e-34 4.7e-36 +3.37e+02 +1.35e+00 7.6e-01 1.3e+01 1.00e-01
6 9.3e-01 2.6e-34 2.2e-35 +1.07e+02 +1.38e+01 7.9e-01 1.4e+00 1.00e-01
7 2.3e-01 2.0e-33 3.3e-35 +4.56e+01 +2.30e+01 7.6e-01 8.3e-01 1.00e-01
8 6.7e-02 3.9e-33 4.2e-36 +3.56e+01 +2.89e+01 8.1e-01 7.9e-01 1.00e-01
9 1.9e-02 3.9e-33 1.2e-35 +3.35e+01 +3.16e+01 8.5e-01 8.6e-01 1.00e-01
10 4.2e-03 3.9e-33 6.0e-36 +3.30e+01 +3.26e+01 9.0e-01 9.1e-01 1.00e-01
11 7.8e-04 3.9e-33 2.1e-35 +3.29e+01 +3.28e+01 9.4e-01 9.3e-01 1.00e-01
12 1.3e-04 3.9e-33 4.9e-36 +3.29e+01 +3.29e+01 9.6e-01 9.4e-01 1.00e-01
13 1.9e-05 7.8e-33 9.7e-36 +3.29e+01 +3.29e+01 9.9e-01 9.7e-01 1.00e-01
14 2.3e-06 7.8e-33 4.1e-35 +3.29e+01 +3.29e+01 1.0e+00 9.4e-01 1.00e-01
15 3.4e-07 7.8e-33 3.7e-35 +3.29e+01 +3.29e+01 1.1e+00 9.6e-01 1.00e-01
16 4.4e-08 7.8e-33 1.1e-34 +3.29e+01 +3.29e+01 1.1e+00 9.8e-01 1.00e-01
17 4.8e-09 7.8e-33 8.3e-36 +3.29e+01 +3.29e+01 1.1e+00 9.9e-01 1.00e-01
18 4.9e-10 7.8e-33 1.3e-34 +3.29e+01 +3.29e+01 1.1e+00 9.9e-01 1.00e-01
19 4.9e-11 7.8e-33 3.3e-33 +3.29e+01 +3.29e+01 1.1e+00 9.9e-01 1.00e-01
20 4.9e-12 7.8e-33 3.7e-34 +3.29e+01 +3.29e+01 1.1e+00 9.9e-01 1.00e-01

phase.value = pdOPT
Iteration = 20
mu = 4.9185714155822500e-12
relative gap = 1.4959536882616738e-11
gap = 4.9185714155822492e-10
digits = 1.0825081851153193e+01
objValPrimal = 3.28791690158170027189966087390464e+01
objValDual = 3.28791690153251455774383837785678e+01
p.feas.error = 7.8886090522101181e-31
d.feas.error = 3.7191633156061912e-30
relative eps = 4.9303806576313200e-32
total time = 6.480
main loop time = 6.460000
total time = 6.480000
コメント (2)

### シンポジウム「数理最適化の40年と今後の展開」

シンポジウム「数理最適化の40年と今後の展開」

シンポジウム終了後に懇親会 (18:00-20:00) を予定しております。

プログラム
10:30-15:30

Martin Mevissen (IBM research, Dublin) Semidefinite Programming Relaxations for Nonconvex Problems
16:00-17:00

「研究生活４０年を超えて」
18:00-20:00

コメント

### TSUBAME 2.0 と SDPA, SDPARA　その５

◯OPTクラスタ
39 1.7e-09 6.2e-10 1.3e-11 +7.37e-01 +7.37e-01 1.3e-01 7.9e-02 1.00e-01
40 1.6e-09 6.2e-10 1.2e-11 +7.37e-01 +7.37e-01 1.3e-01 7.9e-02 1.00e-01

phase.value = pdOPT
Iteration = 40
mu = +1.5865462569448133e-09
relative gap = +5.5511151231257827e-16
gap = -5.5511151231257827e-16
digits = +1.5123174708309257e+01
objValPrimal = +7.3714842584791418e-01
objValDual = +7.3714842584791473e-01
p.feas.error = +6.2254989472350633e-08
d.feas.error = +1.1738616190193341e-09
total time = 6.396788

◯TSUBAME 2.0
29 4.4e-10 5.2e-10 5.8e-13 +7.37e-01 +7.37e-01 1.9e-01 2.5e-01 1.00e-01
30 3.5e-10 5.2e-10 4.4e-13 +7.37e-01 +7.37e-01 1.9e-01 2.5e-01 1.00e-01

phase.value = pdOPT
Iteration = 30
mu = +3.4630401716388027e-10
relative gap = +0.0000000000000000e+00
gap = +0.0000000000000000e+00
digits = +inf
objValPrimal = +7.3714895105620593e-01
objValDual = +7.3714895105620593e-01
p.feas.error = +5.2877355561715889e-08
d.feas.error = +4.3517454961163469e-11
total time = 7.379908

◯東工大 TSUBAME 2.0
HP Proliant SL390s G7 1408台
HP Proliant SL390s G7
CPU: Intel Xeon 2.93GHz 6コア×2ソケット = 12コア(Hyperthreading時 = 24コア)
GPU: NVIDIA Tesla M2050 3GPU
Memory: 54GB (一部は96GB)
SSD: 120GB (一部は240GB)
ネットワーク: QDR InfiniBand x 2 = 80Gbps

○ OPT クラスタ
１：PowerEdge M1000e(ブレードエンクロージャー) x 1台
２：PowerEdge M710HD(ブレードサーバ) x 16台
ブレードサーバの仕様：
CPU : インテル(R) Xeon(R) プロセッサー X5670(2.93GHz、12MB キャッシュ、6.4 GT/s QPI) x 2個
メモリ： 128GB (16X8GB/2R/1333MHz/DDR3 RDIMM/CPUx2)
Disk : 73GB x 2(1台のみ 300GB x 2)
NIC : GbE x 1 & Inifiniband QDR(40Gbps) x 1
コメント

### 研究室とサーバ室の分離

よって騒音で学生に迷惑がかかりますので、休日や夜間しか実験が出来ません。

というわけで（本当の理由は CREST ですが）、現研究室を実質サーバ室として利用して、人間達は別の部屋に引っ越すことにしました。サーバ室ではもう遠慮の必要がなくなったので、今後、さらにサーバ達が増えていきます。

コメント

### 第１回 Graph CREST シンポジウム：情報更新５

これまでの情報をまとめました。

-----------------------------------------------------------------------------------------------------------------------------------------------------------

MIP, LP, SDP などの数理計画問題（ソフトウェアベンチマークを含む）及び Graph500 や QAP 等の大規模最適化問題に対する最新の研究結果に関する講演を行います。
また、午後には LP と MIP に関するチュートリアルセッションも企画しています。

◯9:45 - 10:00 Katsuki Fujisawa (Chuo University)
Title: Introduction : CREST project and Workshops on Computational Aspects for Solving Large Scale Optimization Problems

◯10:00 - 10:30 Toyotaro Suzumura (Tokyo Institute of Technology)
Title: Performance Evaluation of Graph500 on Large-Scale Distributed Environment

◯10:30 - 11:00 Hans Mittelmann (Arizona State University, USA)
Title: Benchmarks for Optimization Software

Abstract: Under the title "Benchmarks for Optimization Software" we maintain extensive
evaluations of a large selection of optimization software. In this talk we
will report on the benchmarks in discrete optimization including those
based on MIPLIB 2010 as well as selected ones in continuous optimization.

[1] Benchmarks for Optimization Software, http://plato.asu.edu/bench.html
[2] MIPLIB 2010, http://miplib.zib.de, Math Prog Comp 3, 103-163 (2011)

◯11:00 - 11:15 Break

◯11:15 - 11:45 Toh Kim Chuan (National University of Singapore)
Title: Computational experience in solving large scale semidefinite programming.

Abstract: In this talk, we will report our computational experience
in solving large scale semidefinite programming based on two classes of methods.
The first class consists of inexact primal-dual interior-point methods for which
the linear system in each iteration is solved by a preconditioned Krylov subspace method.
The second class are nonlinear programming based
methods such as proximal-point method, alternating direction method of multiplier,
We will discuss the merits of the two classes of methods and present some
computational results.

◯11:45 - 12:15 Yasuaki Matsukawa and *Akiko Yoshise (University of Tsukuba)
Title: A Primal Barrier Function Phase I Algorithm for Nonsymmetric Conic Optimization Problems

Abstract: We call a positive semidefinite matrix whose elements are nonnegative a
{\em doubly nonnegative matrix}, and the set of those matrices the {\em
doubly nonnegative cone} (DNN cone).
The DNN cone is not symmetric but can be represented as the projection of
a symmetric cone embedded in a higher dimension.
In a previous paper, the authors demonstrated the efficiency of the DNN
relaxation using the symmetric cone representation of the DNN cone.
They showed that the DNN relaxation gives significantly tight bounds for a
class of quadratic assignment problems, but the computational time is not
affordable as long as we employ the symmetric cone representation.
They then suggested a primal barrier function approach for solving the DNN
optimization problem directly, instead of using the symmetric cone
representation.
However, most of existing studies on the primal barrier function approach
assume the availability of a feasible interior point.
This fact means that those studies are not inextricably tied to the
practical usage.
Motivated by these observations, we propose a primal barrier function
Phase I algorithm for solving conic optimization problems over the closed
convex cone $K$ having the following properties:
(a) $K$ is not necessarily symmetric, (b) a self-concordant function $f$
is defined over $\inter K$, and (c) its dual cone $K^*$ is not explicit or
is intractable, all of which are observed when $K$ is the DNN cone.
We analyze the algorithm and provide a sufficient condition for finite
termination.

◯12:15 - 13:30 Lunch

◯13:30 - 15:00 Tutorial Session 1
◎Timo Berthold (Zuse Institute Berlin, Germany)
Title: What is Linear and Mixed Integer Programming(LP/MIP)

Abstract: Linear programming (LP) means the optimization of a linear function
subject to a set of linear constraints. In mixed-integer programming
(MIP), additionally, some of the variables are required to take integer
values. Linear and mixed-integer programming are two of the most essential
techniques in theory and practice of mathematical optimization.
Nowadays, linear programs with hundreds of thousands of variables and
constraints can be solved efficiently. Although this is not generally true
for mixed-integer programming - which is NP-hard -, state-of-the-art
software often is able to solve large, practically relevant problems.
This talk is supposed to be an introduction to the computational aspects
of LP and MIP. We present three commonly used algorithms to solve
large-scale, practically relevant problems of these types: the simplex algorithm for
LPs, the general cutting plane method and the branch-and-bound algorithm
for IPs. We discuss some of their pitfalls and ruses and showcase a few
algorithmic enhancements the today's MIP solvers are equipped with.

◎Ambros Gleixner and Stefan Heinz (Zuse Institute Berlin, Germany)
Title: First steps with Zimpl and SCIP

Abstract: In this tutorial, we demonstrate the usage of the modeling language Zimpl
and the mixed integer programming solver SCIP. We show how to model two
different formulations for the well-known binpacking problem and discuss
their limitations. Feeding them into SCIP we compare their computational
performance within a standalone branch-and-cut solver.
Zimpl is an algebraic modeling language featuring exact arithmetic. SCIP is
branch-cut-and-price framework targeted towards the need of researchers. It
allows total control of the solution process and the access of detailed
information down to the guts of the solver. Both tools are part of the ZIB
Optimization Suite (http://zibopt.zib.de), which is free for academic use
and available in source code.

◎Stefan Heinz (Zuse Institute Berlin, Germany)
Title: Using SCIP as a branch-and-price framework

Abstract: Column generation is a technique to handle large-scale linear programs
efficiently. In practice, it is widely used to solve real world problems
within a branch-and-price approach. Examples are rolling stock roster
planning, duty scheduling, and vehicle scheduling.
In this talk we are focusing on the branch-and-price feature of SCIP. We
illustrate using the steel mill slab problem, how the framework SCIP is
easily customized for all needs of branch-and-price approach.
We first present the basic idea of column generation and
branch-and-price. Second, we show step-by-step, how the framework SCIP can
be applied as a branch-and-price framework. Finally, we discuss some
pitfalls of the branch-and-price method which are easily avoidable within
the SCIP framework.

◯15:30 - 15:45 Break

◯15:45 - 16:45 Tutorial Session 2
◎ Ambros Gleixner (Zuse Institute Berlin, Germany)
Title: Improving the accuracy of LP solvers

Abstract: We describe an iterative refinement procedure for computing extended precision or exact solutions
to linear programming problems (LPs). Arbitrarily precise solutions can be computed by solving a
sequence of closely related LPs with limited precision arithmetic. The LPs solved at iterations of
this algorithm share the same constraint matrix as the original problem instance and are transformed
only by modification of the objective function, right-hand side, and variable bounds.
Exact computation is used to compute and store the exact representation of the transformed
problems, while numeric computation is used for computing approximate LP solutions and applying
iterations of the simplex algorithm. At all steps of the algorithm the LP bases encountered in the
transformed problems correspond directly to LP bases in the original problem description.
We demonstrate that this algorithm is effective in practice for computing extended precision
solutions and that this leads to direct improvement of the best known methods for solving LPs
exactly over the rational numbers. A proof-of-concept implementation is done within the SoPlex LP
solver.

◎ Timo Berthold (Zuse Institute Berlin, Germany)
Title: Primal Heuristics for MIP

Abstract: In modern MIP-solvers like the state-of-the-art
branch-cut-and-price-framework SCIP, primal heuristics play a major role
in finding and improving feasible solutions at the early steps of the
solution process.
Primal heuristics in SCIP can be categorized in three groups:
* rounding and propagation heuristics
* diving and objective diving heuristics
* large neighborhood search heuristics
We give examples for each of these classes, concentrating on recently
propsed algorithms. Further, we discuss the question of how the quality of
a primal heuristic should be evaluated and introduce a new a new
performance measure, the "primal integral". It assess the impact of these
primal heuristics on the ability to find feasible solutions, in
particular early during search. Finally, we discuss some computational results.

◯16:45 - 17:00 Break

◯17:00 - 17:30 Domenico Salvagnin (University of Padova, Italy)
Title: Three ideas for the Quadratic Assignment Problem

Abstract: We address the exact solution of the famous esc instances of the quadratic assignment problem. These
are extremely hard instances that remained unsolved―even allowing for a tremendous computing
power―by using all previous techniques from the literature. During this challenging task we found
that three ideas were particularly useful, and qualified as a breakthrough for our approach. The
present talk is about describing these ideas and their impact in solving esc instances. Our method
was able to solve, in a matter of seconds or minutes on a single PC, all easy cases (all esc16* plus
esc32e and esc32g). The three very hard instances esc32c, esc32d and esc64a were solved in less than
half an hour, in total, on a single PC. We also report the solution, in about 5 hours, of tai64c. By
using a facility-flow splitting procedure, we were also able to solve to proven optimality, for the
first time, esc32h (in about 2 hours) as well as “the big fish” esc128 (to our great surprise, the
solution of the latter required just a few seconds on a single PC).

◯17:30 - 18:00 *Shunji Umetani (Osaka University), Masanao Arakawa (Fujitsu Limited) and Mutsunori Yagiura (Nagoya University)
Title: A heuristic algorithm for the set multicover problem with generalized upper bound constraints

Abstract: The set covering problem (SCP) is one of representative combinatorial
optimization problem, which has many practical applications, e.g.,
crew scheduling, vehicle routing, facility location and data analysis.
In this talk, we consider an extension of SCP introducing (i)
multicover and (ii) generalized upper bound (GUB) constraints, which
substantially extend the variety of its applications.
For the conventional SCP, it has been known that relaxed problems
give us a good device called the pricing method to reduce the number of
variables, and several efficient heuristic algorithms utilizing this
idea have been developed to solve very large-scale instances with up
to 5000 constraints and 1,000,000 variables.
However, GUB constraints often make the standard pricing method less
effective, because they prevent solutions from having highly evaluated
variables simultaneously.
To overcome this, we develop a hybrid approach of metaheuristics and
the pricing method, in which we propose an evaluation scheme of
variables based on penalty weights that are adaptively controlled
during the search of metaheuristic algorithm.
Another feature of our algorithm is an efficient implementation of
local search with the 2-flip neighborhood.
According to computational comparison on benchmark instances with the
latest MIP solvers, our algorithm performs quite efficiently for
various types of problem instances, especially for very large-scale
instances.
コメント

### 第１回 Graph CREST シンポジウム：情報更新４

Title: Benchmarks for Optimization Software
H.D. Mittelmann
School of Mathematical and Statistical Sciences
Arizona State University, USA

Abstract:
Under the title "Benchmarks for Optimization Software" we maintain extensive
evaluations of a large selection of optimization software. In this talk we
will report on the benchmarks in discrete optimization including those
based on MIPLIB 2010 as well as selected ones in continuous optimization.

[1] Benchmarks for Optimization Software, http://plato.asu.edu/bench.html
[2] MIPLIB 2010, http://miplib.zib.de, Math Prog Comp 3, 103-163 (2011)

Title: What is Linear and Mixed Integer Programming(LP/MIP)
Timo Berthold
Zuse Institute Berlin, Germany

Abstract:
Linear programming (LP) means the optimization of a linear function
subject to a set of linear constraints. In mixed-integer programming
(MIP), additionally, some of the variables are required to take integer
values. Linear and mixed-integer programming are two of the most essential
techniques in theory and practice of mathematical optimization.
Nowadays, linear programs with hundreds of thousands of variables and
constraints can be solved efficiently. Although this is not generally true
for mixed-integer programming - which is NP-hard -, state-of-the-art
software often is able to solve large, practically relevant problems.
This talk is supposed to be an introduction to the computational aspects
of LP and MIP. We present three commonly used algorithms to solve
large-scale, practically relevant problems of these types: the simplex algorithm for
LPs, the general cutting plane method and the branch-and-bound algorithm
for IPs. We discuss some of their pitfalls and ruses and showcase a few
algorithmic enhancements the today's MIP solvers are equipped with.

Title: First steps with Zimpl and SCIP
Ambros Gleixner, Stefan Heinz
Zuse Institute Berlin, Germany

Abstract:
In this tutorial, we demonstrate the usage of the modeling language Zimpl
and the mixed integer programming solver SCIP. We show how to model two
different formulations for the well-known binpacking problem and discuss
their limitations. Feeding them into SCIP we compare their computational
performance within a standalone branch-and-cut solver.
Zimpl is an algebraic modeling language featuring exact arithmetic. SCIP is
branch-cut-and-price framework targeted towards the need of researchers. It
allows total control of the solution process and the access of detailed
information down to the guts of the solver. Both tools are part of the ZIB
Optimization Suite (http://zibopt.zib.de), which is free for academic use
and available in source code.

Title: Using SCIP as a branch-and-price framework
Stefan Heinz
Zuse Institute Berlin, Germany

Abstract:
Column generation is a technique to handle large-scale linear programs
efficiently. In practice, it is widely used to solve real world problems
within a branch-and-price approach. Examples are rolling stock roster
planning, duty scheduling, and vehicle scheduling.
In this talk we are focusing on the branch-and-price feature of SCIP. We
illustrate using the steel mill slab problem, how the framework SCIP is
easily customized for all needs of branch-and-price approach.
We first present the basic idea of column generation and
branch-and-price. Second, we show step-by-step, how the framework SCIP can
be applied as a branch-and-price framework. Finally, we discuss some
pitfalls of the branch-and-price method which are easily avoidable within
the SCIP framework.

Title: Improving the accuracy of LP solvers
Ambros Gleixner
Zuse Institute Berlin, Germany

Abstract:
We describe an iterative refinement procedure for computing extended precision or exact solutions
to linear programming problems (LPs). Arbitrarily precise solutions can be computed by solving a
sequence of closely related LPs with limited precision arithmetic. The LPs solved at iterations of
this algorithm share the same constraint matrix as the original problem instance and are transformed
only by modification of the objective function, right-hand side, and variable bounds.
Exact computation is used to compute and store the exact representation of the transformed
problems, while numeric computation is used for computing approximate LP solutions and applying
iterations of the simplex algorithm. At all steps of the algorithm the LP bases encountered in the
transformed problems correspond directly to LP bases in the original problem description.
We demonstrate that this algorithm is effective in practice for computing extended precision
solutions and that this leads to direct improvement of the best known methods for solving LPs
exactly over the rational numbers. A proof-of-concept implementation is done within the SoPlex LP
solver.

Title: Primal Heuristics for MIP
Timo Berthold
Zuse Institute Berlin, Germany

Abstract:
In modern MIP-solvers like the state-of-the-art
branch-cut-and-price-framework SCIP, primal heuristics play a major role
in finding and improving feasible solutions at the early steps of the
solution process.
Primal heuristics in SCIP can be categorized in three groups:
* rounding and propagation heuristics
* diving and objective diving heuristics
* large neighborhood search heuristics
We give examples for each of these classes, concentrating on recently
propsed algorithms. Further, we discuss the question of how the quality of
a primal heuristic should be evaluated and introduce a new a new
performance measure, the "primal integral". It assess the impact of these
primal heuristics on the ability to find feasible solutions, in
particular early during search. Finally, we discuss some computational results.

Title: Three ideas for the Quadratic Assignment Problem
Domenico Salvagnin

Abstract:
We address the exact solution of the famous esc instances of the quadratic assignment problem. These
are extremely hard instances that remained unsolved―even allowing for a tremendous computing
power―by using all previous techniques from the literature. During this challenging task we found
that three ideas were particularly useful, and qualified as a breakthrough for our approach. The
present talk is about describing these ideas and their impact in solving esc instances. Our method
was able to solve, in a matter of seconds or minutes on a single PC, all easy cases (all esc16* plus
esc32e and esc32g). The three very hard instances esc32c, esc32d and esc64a were solved in less than
half an hour, in total, on a single PC. We also report the solution, in about 5 hours, of tai64c. By
using a facility-flow splitting procedure, we were also able to solve to proven optimality, for the
first time, esc32h (in about 2 hours) as well as “the big fish” esc128 (to our great surprise, the
solution of the latter required just a few seconds on a single PC).

コメント

### TSUBAME 2.0 と SDPA, SDPARA　その４

TSUBAME 2.0 のキューが激込みですので以下の問題を解くための SDPARA の実行もいつになるのかわからない状況です。昨日報告しましたように新しい ScaLAPACK 2.0.1 と SDPARA 7.3.3 の組み合わせの動作確認も行いましたが、ScaLAPACK 1.8.0 と性能差が無さそうですので、これまでの動作実績から SDPARA 7.3.3 + ScaLAPACK 1.8.0 の組み合わせを TSUBAME 2.0 のキューに投入します。

これが解ければ現時点での SDP 世界記録の更新となる。
◯QAPLIB (tai30a.dat-s) の DNN 緩和問題
mDIM = 379350
nBLOCK = 2
bLOCKsTRUCT = -485758 842

◯東工大 TSUBAME 2.0
HP Proliant SL390s G7 1408台
HP Proliant SL390s G7
CPU: Intel Xeon 2.93GHz 6コア×2ソケット = 12コア(Hyperthreading時 = 24コア)
GPU: NVIDIA Tesla M2050 3GPU
Memory: 54GB (一部は96GB)
SSD: 120GB (一部は240GB)
ネットワーク: QDR InfiniBand x 2 = 80Gbps

◯１ノードあたりの性能（倍精度)
CPU 140GF(2.93GHz) + GPU 1545GF = 1685GF
CPU 153GF(3.2GHz : TB) + GPU 1545GF = 1698GF
コメント

### ScaLAPACK 2.0.1 と SDPARA

ScaLAPACK の 2.0.1 が公開されている。

http://www.netlib.org/scalapack/

◯問題 nug14_r2.dat-s
SDPARA 7.3.3 + ScaLAPACK 1.8.0 = 133.05秒
SDPARA 7.3.3 + ScaLAPACK 2.1.0 = 131.42秒

◯問題 Be.1S.SV.pqgt1t2p.dat-s
SDPARA 7.3.3 + ScaLAPACK 1.8.0 = 171.96秒
SDPARA 7.3.3 + ScaLAPACK 2.1.0 = 173.22秒

○ OPT クラスタ
１：PowerEdge M1000e(ブレードエンクロージャー) x 1台
２：PowerEdge M710HD(ブレードサーバ) x 16台
ブレードサーバの仕様：
CPU : インテル(R) Xeon(R) プロセッサー X5670(2.93GHz、12MB キャッシュ、6.4 GT/s QPI) x 2個
メモリ： 128GB (16X8GB/2R/1333MHz/DDR3 RDIMM/CPUx2)
Disk : 73GB x 2(1台のみ 300GB x 2)
NIC : GbE x 1 & Inifiniband QDR(40Gbps) x 1
OS : CentOS 5.7 for x86_64
コメント

### ScaLAPACK 2.0.1

ScaLAPACK の 2.0.1 が公開されている。ScaLAPACK では 2.0.0 以降では BLACS も含まれている

http://www.netlib.org/scalapack/

ちなみに SDPARA では ScaLAPACK 1.8.0 と 2.0.1 でも性能差は見られない。
コメント

### TSUBAME 2.0 上での SDPARA：Cholesky

◯東工大 TSUBAME 2.0
HP Proliant SL390s G7 1408台
HP Proliant SL390s G7
CPU: Intel Xeon 2.93GHz 6コア×2ソケット = 12コア(Hyperthreading時 = 24コア)
GPU: NVIDIA Tesla M2050 3GPU
Memory: 54GB (一部は96GB)
SSD: 120GB (一部は240GB)
ネットワーク: QDR InfiniBand x 2 = 80Gbps

◯１ノードあたりの性能（倍精度)
CPU 140GF(2.93GHz) + GPU 1545GF = 1685GF
CPU 153GF(3.2GHz : TB) + GPU 1545GF = 1698GF

ここでは現在の Cholesky (CPU+GPU) での性能を元に以下の超巨大 SDP を解くと仮定して、性能見積りを行ってみる。詳細については以前の記事を参照

◯予想：ノード数 420, Cholesky 分解(CPU + GPU) 611GFlops / ノード
420 * 611 * 0.9 = 230958 GFLOPS なので、6.0291e+17 / 2.30958e+14 * 40 = 約 104,420 秒 = 約 29 時間
コメント

### 第１回 Graph CREST シンポジウム：情報更新３

9:45 - 10:00 Katsuki Fujisawa
Introduction :
CREST project and Workshops on Computational Aspects for Solving Large Scale Optimization Problems
10:00 - 10:30 Toyotaro Suzumura
Tentative titel:
Performance Evaluation of Graph500 on Large-Scale Distributed Environment
10:30 - 11:00 Hans Mittelmann
Tentative title: Benchmarks for Optimization Software
11:00 - 11:15 Break
11:15 - 11:45 TOH Kim Chuan
Computational experience in solving large scale semidefinite programmin
11:45 - 12:15 Yasuaki Matsukawa
A Primal Barrier Function Phase I Algorithm for Nonsymmetric Conic Optimization Problems
12:15 - 13:30 Lunch
13:30 - 15:00 Tutorial Session 1
- Timo Berthold
What is Linear and Mixed Integer Programming(LP/MIP)
- Ambros Gleixner, Stefan Heinz
How to model and solve LPs and MIPs (SCIP, ZIMPL)
- Stefan Heinz
Tentative title: Column Generation using SCIP
15:30 - 15:45 Break
15:45 - 16:45 Tutorial Session 2
- Ambros Gleixner
Improving the Accuracy of LP Solvers
- Timo Berthold
Primal Heuristics for MIP
16:45 - 17:00 Break
17:00 - 17:30 Domenico Salvagnin
Three ideas for the Quadratic Assignment Problem
17:30 - 18:00 Shunji Umetani
A heuristic algorithm for the set multicover problem with
generalized upper bound constraints
コメント

### CREST研究 ： SDPとMIP

JST CREST : ポストペタスケールシステムにおける超大規模グラフ最適化基盤の来年度の重点研究対象について

-------------------------------------------------------------------------------------------------------------------------------------------------------------------

１：半正定値計画問題(SDP)

２：混合整数計画問題(MIP)

コメント

### TSUBAME 2.0 と SDPA, SDPARA　その４

TSUBAME上で SDPARA を作成する場合では、OpenMPI と MVAPICH の両者どちらでも用いることができる。以下のように環境変数を設定することで両者の MPI を切り替えることができる。性能的にはともかく、超大規模問題を解くときには MVAPICH を用いないと大変なことになる。

OpenMPI1.4.2+intel
export PATH=/usr/apps/openmpi/1.4.2/intel/bin:$PATH export LD_LIBRARY_PATH=/usr/apps/openmpi/1.4.2/intel/lib:$LD_LIBRARY_PATH

MVAPICH1.5.1+intel
export PATH=/usr/apps/mvapich2/1.5.1/intel/bin:$PATH export LD_LIBRARY_PATH=/usr/apps/mvapich2/1.5.1/intel/lib:$LD_LIBRARY_PATH
コメント

### TSUBAME 2.0 と SDPA, SDPARA　その３

そろそろ TSUBAME 2.0 に SDPARA のジョブを投入して超大規模問題の最適解を狙います。

◯QAPLIB (tai30a.dat-s) の DNN 緩和問題
mDIM = 379350
nBLOCK = 2
bLOCKsTRUCT = -485758 842

この問題では N = 379,350 なので、Cholesky 分解の FLOPS 値は = 1.8197e+16 となります。TSUBAME 2.0 300 ノード(CPU のみ)で Cholesky 分解の性能効率 40%, SDP は 40 反復必要と仮定しますと、300 * 140GF * 0.40 = 1.6800e+13 FLOPS ですので、1.8197e+16 / 1.6800e+13 * 40 = 43,326秒 = 約１２時間となります。

◯東工大 TSUBAME 2.0
HP Proliant SL390s G7 1408台
HP Proliant SL390s G7
CPU: Intel Xeon 2.93GHz 6コア×2ソケット = 12コア(Hyperthreading時 = 24コア)
GPU: NVIDIA Tesla M2050 3GPU
Memory: 54GB (一部は96GB)
SSD: 120GB (一部は240GB)
ネットワーク: QDR InfiniBand x 2 = 80Gbps

◯１ノードあたりの性能（倍精度)
CPU 140GF(2.93GHz) + GPU 1545GF = 1685GF
CPU 153GF(3.2GHz : TB) + GPU 1545GF = 1698GF
コメント