2012年1月のブログ記事一覧-最適化問題に対する超高速＆安定計算

TSUBAME 2.0 と SDPA, SDPARA　その６

2012年01月31日 01時22分08秒 | Weblog

東工大のスパコン TSUBAME 2.0 上で SDPA の最新版 7.4.0 が make できないという質問が多いので、以下に make 方法を詳しく掲載します。

１：SDPA 7.4.0 + GotoBLAS2 1.09

◯ GotoBLAS2 1.09
Makefile.rule で変更する箇所
TARGET = NEHALEM
CC = icc
FC = ifort
USE_OPENMP = 1
NO_CBLAS = 1
INTERFACE64 = 1
NO_WARMUP = 1

make clean; make
でライブラリの作成を行う
libgoto2.a は ${SDPA_HOME}/lib にコピーする（以下を参照)

◯ SDPA 7.4.0
export SDPA_HOME=$(HOME)/sdpa7.intel
export INTEL_LIB=/usr/apps/isv/intel/11.1.072/mkl/lib/em64t
export INTEL_LIB2=/usr/apps/isv/intel/11.1.072/lib/intel64
export CC=icc
export CXX=icpc
export F77=ifort
export CFLAGS="-O2 -openmp -DDIMACS_PRINT"
export CXXFLAGS="-O2 -openmp -DDIMACS_PRINT"
export FFLAGS="-O2 -openmp -i8"
./configure --with-blas="-L${SDPA_HOME}/lib -lgoto2 -L${INTEL_LIB} -liomp5" --with-lapack="-L${SDPA_HOME}/lib -lgoto2 -L${INTEL_LIB} -lpthread -lgoto2 -lstdc++ -lifcore -limf -lifport"
make

これで実行ができる（はず）。

２： SDPA 7.4.0 + Intel MKL 11.1.072

export INTEL_LIB=/usr/apps/isv/intel/11.1.072/mkl/lib/em64t
export INTEL_LIB2=/usr/apps/isv/intel/11.1.072/lib/intel64
export CC=icc
export CXX=icpc
export F77=ifort
export CFLAGS="-O2 -openmp"
export CXXFLAGS="-O2 -openmp"
export FFLAGS="-O2 -i8 -openmp"
./configure --with-blas="-L${INTEL_LIB} -L${INTEL_LIB2} -lmkl_intel_ilp64 -lmkl_intel_thread -lmkl_core -liomp5" --with-lapack="-L${INTEL_LIB} -L${INTEL_LIB2} -lmkl_lapack95_ilp64 -liomp5 -lpthread"

これで実行ができる（はず）。

SDPA-DDとGPUとTSUBAME　その３

2012年01月30日 01時22分49秒 | Weblog

TSUBAME 2.0 上での SDPA-DD の make 方法について

1: QD ライブラリの dd_real.h と qd_config.h の変更
dd_real.h に以下を追加
#define isfinite(x) __builtin_isfinite(x)

diff qd_config.h ~/src/sdpa-dd-7.1.8/external/i/QD/include/qd/qd_config.h
< #define QD_ISFINITE(x) __extension__ ({ __typeof (x) __x = (x); __builtin_expect (!std::isnan(__x - __x), 1); })
---
> #define QD_ISFINITE(x) std::isfinite(x)

2: ./configure
make
を行う

3: make の最後で -L/opt/cuda/3.2/lib64 を追加する。

g++ -g -I/home/usr1/11ITA100/src/sdpa-dd-7.1.8/external/spooles/work/internal -DUseMETIS=0 -I/home/usr1/11ITA100/src/sdpa-dd-7.1.8/external/i/QD/include -I./mpack -g -O2 -D_REENTRANT -O2 -funroll-all-loops -o sdpa_dd sdpa_dd-sdpa_chordal.o sdpa_dd-sdpa_dataset.o sdpa_dd-sdpa_io.o sdpa_dd-sdpa_jordan.o sdpa_dd-sdpa_linear.o sdpa_dd-sdpa_main.o sdpa_dd-sdpa_newton.o sdpa_dd-sdpa_parts.o sdpa_dd-sdpa_struct.o sdpa_dd-sdpa_tool.o cuda/Rgemm.o sdpa_dd-Rpotrf.o sdpa_dd-Rsyrk.o sdpa_dd-Rtrsm.o sdpa_dd-Rdot.o sdpa_dd-Raxpy.o sdpa_dd-Rtrmm.o sdpa_dd-Rtrsv.o sdpa_dd-iMlaenv.o sdpa_dd-Rlamch.o sdpa_dd-Rlascl.o sdpa_dd-Rsytrd.o sdpa_dd-Rsterf.o sdpa_dd-Rorgtr.o sdpa_dd-Rlatrd.o sdpa_dd-Rsyr2k.o sdpa_dd-Rsytd2.o sdpa_dd-Rlanst.o sdpa_dd-Rlae2.o sdpa_dd-Rlapy2.o sdpa_dd-Rlasrt.o sdpa_dd-Rorgql.o sdpa_dd-Rorgqr.o sdpa_dd-Rsymv.o sdpa_dd-Rlarfg.o sdpa_dd-Rsyr2.o sdpa_dd-Rlassq.o sdpa_dd-Rorg2l.o sdpa_dd-Rlarft.o sdpa_dd-Rlarfb.o sdpa_dd-Rorg2r.o sdpa_dd-Rnrm2.o sdpa_dd-Rlarf.o sdpa_dd-Rger.o sdpa_dd-Mxerbla.o sdpa_dd-Rpotf2.o sdpa_dd-Mlsame.o sdpa_dd-Rscal.o sdpa_dd-Rcopy.o sdpa_dd-Rgemv.o sdpa_dd-Rtrmv.o sdpa_dd-Rsteqr.o sdpa_dd-Rlaset.o sdpa_dd-Rlaev2.o sdpa_dd-Rlasr.o sdpa_dd-Rlartg.o sdpa_dd-Rswap.o sdpa_dd-Rsyev.o sdpa_dd-Rlansy.o sdpa_dd-Mutils.o -lpthread -L/home/usr1/11ITA100/src/sdpa-dd-7.1.8/external/i/SPOOLES/lib -lspooles -L/home/usr1/11ITA100/src/sdpa-dd-7.1.8/external/i/QD/lib -lqd -L/opt/cuda/3.2/lib64 -L/usr/local/cuda/lib64 -L/usr/local/cuda/lib -lcudart -lm -lpthread

4: 以下のように実行ができる（はず）。

./sdpa_dd ~/src/sdplib/theta2.dat-s out
SDPA-DD start at Mon Jan 30 01:22:07 2012
data is /home/usr1/11ITA100/src/sdplib/theta2.dat-s : sparse
parameter is ./param.sdpa
out is out

DENSE computations
mu thetaP thetaD objP objD alphaP alphaD beta
0 1.0e+04 1.0e+00 1.0e+00 -0.00e+00 +1.00e+04 1.0e+00 8.7e-01 2.00e-01
1 1.8e+03 0.0e+00 1.3e-01 +1.42e+02 +5.90e+03 7.6e-01 7.6e-01 2.00e-01
2 5.6e+02 1.2e-34 3.1e-02 +1.82e+02 +5.95e+02 8.1e-01 8.1e-01 2.00e-01
3 1.4e+02 1.8e-34 5.7e-03 +2.39e+02 +5.96e+01 9.1e-01 9.1e-01 2.00e-01
4 1.9e+01 2.1e-34 4.9e-04 +3.19e+02 +6.19e+00 1.0e+00 1.0e+00 2.00e-01
5 3.4e+00 2.7e-34 4.7e-36 +3.37e+02 +1.35e+00 7.6e-01 1.3e+01 1.00e-01
6 9.3e-01 2.6e-34 2.2e-35 +1.07e+02 +1.38e+01 7.9e-01 1.4e+00 1.00e-01
7 2.3e-01 2.0e-33 3.3e-35 +4.56e+01 +2.30e+01 7.6e-01 8.3e-01 1.00e-01
8 6.7e-02 3.9e-33 4.2e-36 +3.56e+01 +2.89e+01 8.1e-01 7.9e-01 1.00e-01
9 1.9e-02 3.9e-33 1.2e-35 +3.35e+01 +3.16e+01 8.5e-01 8.6e-01 1.00e-01
10 4.2e-03 3.9e-33 6.0e-36 +3.30e+01 +3.26e+01 9.0e-01 9.1e-01 1.00e-01
11 7.8e-04 3.9e-33 2.1e-35 +3.29e+01 +3.28e+01 9.4e-01 9.3e-01 1.00e-01
12 1.3e-04 3.9e-33 4.9e-36 +3.29e+01 +3.29e+01 9.6e-01 9.4e-01 1.00e-01
13 1.9e-05 7.8e-33 9.7e-36 +3.29e+01 +3.29e+01 9.9e-01 9.7e-01 1.00e-01
14 2.3e-06 7.8e-33 4.1e-35 +3.29e+01 +3.29e+01 1.0e+00 9.4e-01 1.00e-01
15 3.4e-07 7.8e-33 3.7e-35 +3.29e+01 +3.29e+01 1.1e+00 9.6e-01 1.00e-01
16 4.4e-08 7.8e-33 1.1e-34 +3.29e+01 +3.29e+01 1.1e+00 9.8e-01 1.00e-01
17 4.8e-09 7.8e-33 8.3e-36 +3.29e+01 +3.29e+01 1.1e+00 9.9e-01 1.00e-01
18 4.9e-10 7.8e-33 1.3e-34 +3.29e+01 +3.29e+01 1.1e+00 9.9e-01 1.00e-01
19 4.9e-11 7.8e-33 3.3e-33 +3.29e+01 +3.29e+01 1.1e+00 9.9e-01 1.00e-01
20 4.9e-12 7.8e-33 3.7e-34 +3.29e+01 +3.29e+01 1.1e+00 9.9e-01 1.00e-01

phase.value = pdOPT
Iteration = 20
mu = 4.9185714155822500e-12
relative gap = 1.4959536882616738e-11
gap = 4.9185714155822492e-10
digits = 1.0825081851153193e+01
objValPrimal = 3.28791690158170027189966087390464e+01
objValDual = 3.28791690153251455774383837785678e+01
p.feas.error = 7.8886090522101181e-31
d.feas.error = 3.7191633156061912e-30
relative eps = 4.9303806576313200e-32
total time = 6.480
main loop time = 6.460000
total time = 6.480000
file read time = 0.010000

コメント (2)

シンポジウム「数理最適化の40年と今後の展開」

2012年01月29日 02時06分27秒 | Weblog

シンポジウム「数理最適化の40年と今後の展開」
日時　2012年3月24日(土) 10:30-17:00
　　シンポジウム終了後に懇親会 (18:00-20:00) を予定しております。
場所　東京工業大学大岡山キャンパス

プログラム
10:30-15:30
水野眞治　（東京工業大学）　　「内点法と単体法について（仮）」
吉瀬章子　（筑波大学）　　　　「錐最適化1992/2012」
斎藤努　　（構造計画研究所）　「通信・物流業界における組合せ最適化の事例」
藤澤克樹　（中央大学）　　　　「最適化問題と大規模計算（仮）」
武田朗子　（慶應義塾大学）　　「太陽光発電システム導入に対するロバスト最適化の適用」
Martin Mevissen (IBM research, Dublin) Semidefinite Programming Relaxations for Nonconvex Problems
16:00-17:00
小島政和先生　（東京工業大学）
「研究生活４０年を超えて」
18:00-20:00
懇親会　会場：東京工業大学百周年記念館４Ｆ角笛　
懇親会については会費が必要となっております。会費は5000円の予定です。

TSUBAME 2.0 と SDPA, SDPARA　その５

2012年01月28日 01時40分12秒 | Weblog

同じ問題 tai10a.dat-s を OPT クラスタと TSUBAME スパコンで 16ノード x 12コアで実行してみたが、反復回数や実行時間が以下のようにかなり異なる。CPU も並列計算の規模も同じなのに、結果の値はかなり違っている。ただし Intel コンパイラのバージョン(前者 12.1.2 で後者は 11.1)とmvapich2 のバージョンが異なる（前者は 1.8で後者は 1.5.1)。

◯OPTクラスタ　
39 1.7e-09 6.2e-10 1.3e-11 +7.37e-01 +7.37e-01 1.3e-01 7.9e-02 1.00e-01
40 1.6e-09 6.2e-10 1.2e-11 +7.37e-01 +7.37e-01 1.3e-01 7.9e-02 1.00e-01

phase.value = pdOPT
Iteration = 40
mu = +1.5865462569448133e-09
relative gap = +5.5511151231257827e-16
gap = -5.5511151231257827e-16
digits = +1.5123174708309257e+01
objValPrimal = +7.3714842584791418e-01
objValDual = +7.3714842584791473e-01
p.feas.error = +6.2254989472350633e-08
d.feas.error = +1.1738616190193341e-09
total time = 6.396788

◯TSUBAME 2.0
29 4.4e-10 5.2e-10 5.8e-13 +7.37e-01 +7.37e-01 1.9e-01 2.5e-01 1.00e-01
30 3.5e-10 5.2e-10 4.4e-13 +7.37e-01 +7.37e-01 1.9e-01 2.5e-01 1.00e-01

phase.value = pdOPT
Iteration = 30
mu = +3.4630401716388027e-10
relative gap = +0.0000000000000000e+00
gap = +0.0000000000000000e+00
digits = +inf
objValPrimal = +7.3714895105620593e-01
objValDual = +7.3714895105620593e-01
p.feas.error = +5.2877355561715889e-08
d.feas.error = +4.3517454961163469e-11
total time = 7.379908

◯東工大 TSUBAME 2.0
HP Proliant SL390s G7 1408台
HP Proliant SL390s G7
CPU: Intel Xeon 2.93GHz 6コア×2ソケット = 12コア(Hyperthreading時 = 24コア)
GPU: NVIDIA Tesla M2050 3GPU
Memory: 54GB (一部は96GB)
SSD: 120GB (一部は240GB)
ネットワーク: QDR InfiniBand x 2 = 80Gbps

○ OPT クラスタ
１：PowerEdge M1000e(ブレードエンクロージャー) x 1台
２：PowerEdge M710HD(ブレードサーバ) x 16台
ブレードサーバの仕様：
CPU : インテル(R) Xeon(R) プロセッサー X5670(2.93GHz、12MB キャッシュ、6.4 GT/s QPI) x 2個
メモリ： 128GB (16X8GB/2R/1333MHz/DDR3 RDIMM/CPUx2)
Disk : 73GB x 2(1台のみ 300GB x 2)
NIC : GbE x 1 & Inifiniband QDR(40Gbps) x 1

研究室とサーバ室の分離

2012年01月27日 01時34分26秒 | Weblog

今は研究室の中に計算サーバ等がたくさん設置してあります。数えるのも面倒なので何台あるのか数えたこともありません。

特にこのクラスタ計算機は負荷を掛けていきますと超絶にうるさくなります。ブレードサーバは静かと言われて購入したので、正直騙された気がしないわけでもないです。
よって騒音で学生に迷惑がかかりますので、休日や夜間しか実験が出来ません。

というわけで（本当の理由は CREST ですが）、現研究室を実質サーバ室として利用して、人間達は別の部屋に引っ越すことにしました。サーバ室ではもう遠慮の必要がなくなったので、今後、さらにサーバ達が増えていきます。

第１回 Graph CREST シンポジウム：情報更新５

2012年01月26日 15時59分26秒 | Weblog

これまでの情報をまとめました。

-----------------------------------------------------------------------------------------------------------------------------------------------------------
第１回 Graph CREST シンポジウムを以下の日程で開催致します。

MIP, LP, SDP などの数理計画問題（ソフトウェアベンチマークを含む）及び Graph500 や QAP 等の大規模最適化問題に対する最新の研究結果に関する講演を行います。
また、午後には LP と MIP に関するチュートリアルセッションも企画しています。

開催日：2012年3月23日（金）9:45 ～ 18:00
場所：中央大学後楽園キャンパス(東京都文京区春日 1-13-27) ６号館３階６３０１室
参加費：無料
主催：JST CREST ポストペタスケールシステムにおける超大規模グラフ最適化基盤

◯9:45 - 10:00 Katsuki Fujisawa (Chuo University)
Title: Introduction : CREST project and Workshops on Computational Aspects for Solving Large Scale Optimization Problems

◯10:00 - 10:30 Toyotaro Suzumura (Tokyo Institute of Technology)
Title: Performance Evaluation of Graph500 on Large-Scale Distributed Environment

◯10:30 - 11:00 Hans Mittelmann (Arizona State University, USA)
Title: Benchmarks for Optimization Software

Abstract: Under the title "Benchmarks for Optimization Software" we maintain extensive
evaluations of a large selection of optimization software. In this talk we
will report on the benchmarks in discrete optimization including those
based on MIPLIB 2010 as well as selected ones in continuous optimization.

[1] Benchmarks for Optimization Software, http://plato.asu.edu/bench.html
[2] MIPLIB 2010, http://miplib.zib.de, Math Prog Comp 3, 103-163 (2011)

◯11:00 - 11:15 Break

◯11:15 - 11:45 Toh Kim Chuan (National University of Singapore)
Title: Computational experience in solving large scale semidefinite programming.

Abstract: In this talk, we will report our computational experience
in solving large scale semidefinite programming based on two classes of methods.
The first class consists of inexact primal-dual interior-point methods for which
the linear system in each iteration is solved by a preconditioned Krylov subspace method.
The second class are nonlinear programming based
methods such as proximal-point method, alternating direction method of multiplier,
inexact accelerated proximal gradient method.
We will discuss the merits of the two classes of methods and present some
computational results.

◯11:45 - 12:15 Yasuaki Matsukawa and *Akiko Yoshise (University of Tsukuba)
Title: A Primal Barrier Function Phase I Algorithm for Nonsymmetric Conic Optimization Problems

Abstract: We call a positive semidefinite matrix whose elements are nonnegative a
{\em doubly nonnegative matrix}, and the set of those matrices the {\em
doubly nonnegative cone} (DNN cone).
The DNN cone is not symmetric but can be represented as the projection of
a symmetric cone embedded in a higher dimension.
In a previous paper, the authors demonstrated the efficiency of the DNN
relaxation using the symmetric cone representation of the DNN cone.
They showed that the DNN relaxation gives significantly tight bounds for a
class of quadratic assignment problems, but the computational time is not
affordable as long as we employ the symmetric cone representation.
They then suggested a primal barrier function approach for solving the DNN
optimization problem directly, instead of using the symmetric cone
representation.
However, most of existing studies on the primal barrier function approach
assume the availability of a feasible interior point.
This fact means that those studies are not inextricably tied to the
practical usage.
Motivated by these observations, we propose a primal barrier function
Phase I algorithm for solving conic optimization problems over the closed
convex cone $K$ having the following properties:
(a) $K$ is not necessarily symmetric, (b) a self-concordant function $f$
is defined over $\inter K$, and (c) its dual cone $K^*$ is not explicit or
is intractable, all of which are observed when $K$ is the DNN cone.
We analyze the algorithm and provide a sufficient condition for finite
termination.

◯12:15 - 13:30 Lunch

◯13:30 - 15:00 Tutorial Session 1
◎Timo Berthold (Zuse Institute Berlin, Germany)
Title: What is Linear and Mixed Integer Programming(LP/MIP)

Abstract: Linear programming (LP) means the optimization of a linear function
subject to a set of linear constraints. In mixed-integer programming
(MIP), additionally, some of the variables are required to take integer
values. Linear and mixed-integer programming are two of the most essential
techniques in theory and practice of mathematical optimization.
Nowadays, linear programs with hundreds of thousands of variables and
constraints can be solved efficiently. Although this is not generally true
for mixed-integer programming - which is NP-hard -, state-of-the-art
software often is able to solve large, practically relevant problems.
This talk is supposed to be an introduction to the computational aspects
of LP and MIP. We present three commonly used algorithms to solve
large-scale, practically relevant problems of these types: the simplex algorithm for
LPs, the general cutting plane method and the branch-and-bound algorithm
for IPs. We discuss some of their pitfalls and ruses and showcase a few
algorithmic enhancements the today's MIP solvers are equipped with.

◎Ambros Gleixner and Stefan Heinz (Zuse Institute Berlin, Germany)
Title: First steps with Zimpl and SCIP

Abstract: In this tutorial, we demonstrate the usage of the modeling language Zimpl
and the mixed integer programming solver SCIP. We show how to model two
different formulations for the well-known binpacking problem and discuss
their limitations. Feeding them into SCIP we compare their computational
performance within a standalone branch-and-cut solver.
Zimpl is an algebraic modeling language featuring exact arithmetic. SCIP is
branch-cut-and-price framework targeted towards the need of researchers. It
allows total control of the solution process and the access of detailed
information down to the guts of the solver. Both tools are part of the ZIB
Optimization Suite (http://zibopt.zib.de), which is free for academic use
and available in source code.

◎Stefan Heinz (Zuse Institute Berlin, Germany)
Title: Using SCIP as a branch-and-price framework

Abstract: Column generation is a technique to handle large-scale linear programs
efficiently. In practice, it is widely used to solve real world problems
within a branch-and-price approach. Examples are rolling stock roster
planning, duty scheduling, and vehicle scheduling.
In this talk we are focusing on the branch-and-price feature of SCIP. We
illustrate using the steel mill slab problem, how the framework SCIP is
easily customized for all needs of branch-and-price approach.
We first present the basic idea of column generation and
branch-and-price. Second, we show step-by-step, how the framework SCIP can
be applied as a branch-and-price framework. Finally, we discuss some
pitfalls of the branch-and-price method which are easily avoidable within
the SCIP framework.

◯15:30 - 15:45 Break

◯15:45 - 16:45 Tutorial Session 2
◎ Ambros Gleixner (Zuse Institute Berlin, Germany)
Title: Improving the accuracy of LP solvers

Abstract: We describe an iterative refinement procedure for computing extended precision or exact solutions
to linear programming problems (LPs). Arbitrarily precise solutions can be computed by solving a
sequence of closely related LPs with limited precision arithmetic. The LPs solved at iterations of
this algorithm share the same constraint matrix as the original problem instance and are transformed
only by modification of the objective function, right-hand side, and variable bounds.
Exact computation is used to compute and store the exact representation of the transformed
problems, while numeric computation is used for computing approximate LP solutions and applying
iterations of the simplex algorithm. At all steps of the algorithm the LP bases encountered in the
transformed problems correspond directly to LP bases in the original problem description.
We demonstrate that this algorithm is effective in practice for computing extended precision
solutions and that this leads to direct improvement of the best known methods for solving LPs
exactly over the rational numbers. A proof-of-concept implementation is done within the SoPlex LP
solver.

◎ Timo Berthold (Zuse Institute Berlin, Germany)
Title: Primal Heuristics for MIP

Abstract: In modern MIP-solvers like the state-of-the-art
branch-cut-and-price-framework SCIP, primal heuristics play a major role
in finding and improving feasible solutions at the early steps of the
solution process.
Primal heuristics in SCIP can be categorized in three groups:
* rounding and propagation heuristics
* diving and objective diving heuristics
* large neighborhood search heuristics
We give examples for each of these classes, concentrating on recently
propsed algorithms. Further, we discuss the question of how the quality of
a primal heuristic should be evaluated and introduce a new a new
performance measure, the "primal integral". It assess the impact of these
primal heuristics on the ability to find feasible solutions, in
particular early during search. Finally, we discuss some computational results.

◯16:45 - 17:00 Break

◯17:00 - 17:30 Domenico Salvagnin (University of Padova, Italy)
Title: Three ideas for the Quadratic Assignment Problem

Abstract: We address the exact solution of the famous esc instances of the quadratic assignment problem. These
are extremely hard instances that remained unsolved―even allowing for a tremendous computing
power―by using all previous techniques from the literature. During this challenging task we found
that three ideas were particularly useful, and qualified as a breakthrough for our approach. The
present talk is about describing these ideas and their impact in solving esc instances. Our method
was able to solve, in a matter of seconds or minutes on a single PC, all easy cases (all esc16* plus
esc32e and esc32g). The three very hard instances esc32c, esc32d and esc64a were solved in less than
half an hour, in total, on a single PC. We also report the solution, in about 5 hours, of tai64c. By
using a facility-flow splitting procedure, we were also able to solve to proven optimality, for the
first time, esc32h (in about 2 hours) as well as “the big fish” esc128 (to our great surprise, the
solution of the latter required just a few seconds on a single PC).

◯17:30 - 18:00 *Shunji Umetani (Osaka University), Masanao Arakawa (Fujitsu Limited) and Mutsunori Yagiura (Nagoya University)
Title: A heuristic algorithm for the set multicover problem with generalized upper bound constraints

Abstract: The set covering problem (SCP) is one of representative combinatorial
optimization problem, which has many practical applications, e.g.,
crew scheduling, vehicle routing, facility location and data analysis.
In this talk, we consider an extension of SCP introducing (i)
multicover and (ii) generalized upper bound (GUB) constraints, which
substantially extend the variety of its applications.
For the conventional SCP, it has been known that relaxed problems
give us a good device called the pricing method to reduce the number of
variables, and several efficient heuristic algorithms utilizing this
idea have been developed to solve very large-scale instances with up
to 5000 constraints and 1,000,000 variables.
However, GUB constraints often make the standard pricing method less
effective, because they prevent solutions from having highly evaluated
variables simultaneously.
To overcome this, we develop a hybrid approach of metaheuristics and
the pricing method, in which we propose an evaluation scheme of
variables based on penalty weights that are adaptively controlled
during the search of metaheuristic algorithm.
Another feature of our algorithm is an efficient implementation of
local search with the 2-flip neighborhood.
According to computational comparison on benchmark instances with the
latest MIP solvers, our algorithm performs quite efficiently for
various types of problem instances, especially for very large-scale
instances.

第１回 Graph CREST シンポジウム：情報更新４

2012年01月25日 02時34分00秒 | Weblog

第１回 Graph CREST シンポジウムに関する情報の更新です。一部の講演の Abstract を記載します。

開催日：2012年3月23日（金）9:45 ～ 18:00
場所：中央大学後楽園キャンパス６号館３階６３０１室
参加費：無料
主催：JST CREST ポストペタスケールシステムにおける超大規模グラフ最適化基盤

Title: Benchmarks for Optimization Software
H.D. Mittelmann
School of Mathematical and Statistical Sciences
Arizona State University, USA

Abstract:
Under the title "Benchmarks for Optimization Software" we maintain extensive
evaluations of a large selection of optimization software. In this talk we
will report on the benchmarks in discrete optimization including those
based on MIPLIB 2010 as well as selected ones in continuous optimization.

[1] Benchmarks for Optimization Software, http://plato.asu.edu/bench.html
[2] MIPLIB 2010, http://miplib.zib.de, Math Prog Comp 3, 103-163 (2011)

Title: What is Linear and Mixed Integer Programming(LP/MIP)
Timo Berthold
Zuse Institute Berlin, Germany

Abstract:
Linear programming (LP) means the optimization of a linear function
subject to a set of linear constraints. In mixed-integer programming
(MIP), additionally, some of the variables are required to take integer
values. Linear and mixed-integer programming are two of the most essential
techniques in theory and practice of mathematical optimization.
Nowadays, linear programs with hundreds of thousands of variables and
constraints can be solved efficiently. Although this is not generally true
for mixed-integer programming - which is NP-hard -, state-of-the-art
software often is able to solve large, practically relevant problems.
This talk is supposed to be an introduction to the computational aspects
of LP and MIP. We present three commonly used algorithms to solve
large-scale, practically relevant problems of these types: the simplex algorithm for
LPs, the general cutting plane method and the branch-and-bound algorithm
for IPs. We discuss some of their pitfalls and ruses and showcase a few
algorithmic enhancements the today's MIP solvers are equipped with.

Title: First steps with Zimpl and SCIP
Ambros Gleixner, Stefan Heinz
Zuse Institute Berlin, Germany

Abstract:
In this tutorial, we demonstrate the usage of the modeling language Zimpl
and the mixed integer programming solver SCIP. We show how to model two
different formulations for the well-known binpacking problem and discuss
their limitations. Feeding them into SCIP we compare their computational
performance within a standalone branch-and-cut solver.
Zimpl is an algebraic modeling language featuring exact arithmetic. SCIP is
branch-cut-and-price framework targeted towards the need of researchers. It
allows total control of the solution process and the access of detailed
information down to the guts of the solver. Both tools are part of the ZIB
Optimization Suite (http://zibopt.zib.de), which is free for academic use
and available in source code.

Title: Using SCIP as a branch-and-price framework
Stefan Heinz
Zuse Institute Berlin, Germany

Abstract:
Column generation is a technique to handle large-scale linear programs
efficiently. In practice, it is widely used to solve real world problems
within a branch-and-price approach. Examples are rolling stock roster
planning, duty scheduling, and vehicle scheduling.
In this talk we are focusing on the branch-and-price feature of SCIP. We
illustrate using the steel mill slab problem, how the framework SCIP is
easily customized for all needs of branch-and-price approach.
We first present the basic idea of column generation and
branch-and-price. Second, we show step-by-step, how the framework SCIP can
be applied as a branch-and-price framework. Finally, we discuss some
pitfalls of the branch-and-price method which are easily avoidable within
the SCIP framework.

Title: Improving the accuracy of LP solvers
Ambros Gleixner
Zuse Institute Berlin, Germany

Abstract:
We describe an iterative refinement procedure for computing extended precision or exact solutions
to linear programming problems (LPs). Arbitrarily precise solutions can be computed by solving a
sequence of closely related LPs with limited precision arithmetic. The LPs solved at iterations of
this algorithm share the same constraint matrix as the original problem instance and are transformed
only by modification of the objective function, right-hand side, and variable bounds.
Exact computation is used to compute and store the exact representation of the transformed
problems, while numeric computation is used for computing approximate LP solutions and applying
iterations of the simplex algorithm. At all steps of the algorithm the LP bases encountered in the
transformed problems correspond directly to LP bases in the original problem description.
We demonstrate that this algorithm is effective in practice for computing extended precision
solutions and that this leads to direct improvement of the best known methods for solving LPs
exactly over the rational numbers. A proof-of-concept implementation is done within the SoPlex LP
solver.

Title: Primal Heuristics for MIP
Timo Berthold
Zuse Institute Berlin, Germany

Abstract:
In modern MIP-solvers like the state-of-the-art
branch-cut-and-price-framework SCIP, primal heuristics play a major role
in finding and improving feasible solutions at the early steps of the
solution process.
Primal heuristics in SCIP can be categorized in three groups:
* rounding and propagation heuristics
* diving and objective diving heuristics
* large neighborhood search heuristics
We give examples for each of these classes, concentrating on recently
propsed algorithms. Further, we discuss the question of how the quality of
a primal heuristic should be evaluated and introduce a new a new
performance measure, the "primal integral". It assess the impact of these
primal heuristics on the ability to find feasible solutions, in
particular early during search. Finally, we discuss some computational results.

Title: Three ideas for the Quadratic Assignment Problem
Domenico Salvagnin
University of Padova, Italy

Abstract:
We address the exact solution of the famous esc instances of the quadratic assignment problem. These
are extremely hard instances that remained unsolved―even allowing for a tremendous computing
power―by using all previous techniques from the literature. During this challenging task we found
that three ideas were particularly useful, and qualified as a breakthrough for our approach. The
present talk is about describing these ideas and their impact in solving esc instances. Our method
was able to solve, in a matter of seconds or minutes on a single PC, all easy cases (all esc16* plus
esc32e and esc32g). The three very hard instances esc32c, esc32d and esc64a were solved in less than
half an hour, in total, on a single PC. We also report the solution, in about 5 hours, of tai64c. By
using a facility-flow splitting procedure, we were also able to solve to proven optimality, for the
first time, esc32h (in about 2 hours) as well as “the big fish” esc128 (to our great surprise, the
solution of the latter required just a few seconds on a single PC).

TSUBAME 2.0 と SDPA, SDPARA　その４

2012年01月24日 02時58分30秒 | Weblog

TSUBAME 2.0 のキューが激込みですので以下の問題を解くための SDPARA の実行もいつになるのかわからない状況です。昨日報告しましたように新しい ScaLAPACK 2.0.1 と SDPARA 7.3.3 の組み合わせの動作確認も行いましたが、ScaLAPACK 1.8.0 と性能差が無さそうですので、これまでの動作実績から SDPARA 7.3.3 + ScaLAPACK 1.8.0 の組み合わせを TSUBAME 2.0 のキューに投入します。

これが解ければ現時点での SDP 世界記録の更新となる。
◯QAPLIB (tai30a.dat-s) の DNN 緩和問題
mDIM = 379350
nBLOCK = 2
bLOCKsTRUCT = -485758 842

◯東工大 TSUBAME 2.0
HP Proliant SL390s G7 1408台
HP Proliant SL390s G7
CPU: Intel Xeon 2.93GHz 6コア×2ソケット = 12コア(Hyperthreading時 = 24コア)
GPU: NVIDIA Tesla M2050 3GPU
Memory: 54GB (一部は96GB)
SSD: 120GB (一部は240GB)
ネットワーク: QDR InfiniBand x 2 = 80Gbps

◯１ノードあたりの性能（倍精度)
CPU 140GF(2.93GHz) + GPU 1545GF = 1685GF
CPU 153GF(3.2GHz : TB) + GPU 1545GF = 1698GF

ScaLAPACK 2.0.1 と SDPARA

2012年01月23日 03時14分14秒 | Weblog

ScaLAPACK の 2.0.1 が公開されている。

http://www.netlib.org/scalapack/

昨日の続きで ScaLAPACK 2.0.1 の ILP64 化を行った（これが結構大変）。ただし、内部に含まれる BLACS は使用しないので ILP64 化は行わない。

◯問題 nug14_r2.dat-s
SDPARA 7.3.3 + ScaLAPACK 1.8.0 = 133.05秒
SDPARA 7.3.3 + ScaLAPACK 2.1.0 = 131.42秒

◯問題 Be.1S.SV.pqgt1t2p.dat-s
SDPARA 7.3.3 + ScaLAPACK 1.8.0 = 171.96秒
SDPARA 7.3.3 + ScaLAPACK 2.1.0 = 173.22秒

今回は Intel コンパイラ(+ MKL) 12.1.2 を用いたが、両者(1.8.0 と 2.0.1)に性能差はほとんどない。

○ OPT クラスタ
１：PowerEdge M1000e(ブレードエンクロージャー) x 1台
２：PowerEdge M710HD(ブレードサーバ) x 16台
ブレードサーバの仕様：
CPU : インテル(R) Xeon(R) プロセッサー X5670(2.93GHz、12MB キャッシュ、6.4 GT/s QPI) x 2個
メモリ： 128GB (16X8GB/2R/1333MHz/DDR3 RDIMM/CPUx2)
Disk : 73GB x 2(1台のみ 300GB x 2)
NIC : GbE x 1 & Inifiniband QDR(40Gbps) x 1
OS : CentOS 5.7 for x86_64

ScaLAPACK 2.0.1

2012年01月22日 03時23分14秒 | Weblog

ScaLAPACK の 2.0.1 が公開されている。ScaLAPACK では 2.0.0 以降では BLACS も含まれている

http://www.netlib.org/scalapack/

ちなみに SDPARA では ScaLAPACK 1.8.0 と 2.0.1 でも性能差は見られない。

TSUBAME 2.0 上での SDPARA：Cholesky

2012年01月21日 10時13分42秒 | Weblog

昨年の11月からTSUBAME 2.0 上での CPU + GPU 構成のマシンで動作する SDPARA の開発を行っている。

◯東工大 TSUBAME 2.0
HP Proliant SL390s G7 1408台
HP Proliant SL390s G7
CPU: Intel Xeon 2.93GHz 6コア×2ソケット = 12コア(Hyperthreading時 = 24コア)
GPU: NVIDIA Tesla M2050 3GPU
Memory: 54GB (一部は96GB)
SSD: 120GB (一部は240GB)
ネットワーク: QDR InfiniBand x 2 = 80Gbps

◯１ノードあたりの性能（倍精度)
CPU 140GF(2.93GHz) + GPU 1545GF = 1685GF
CPU 153GF(3.2GHz : TB) + GPU 1545GF = 1698GF

ここでは現在の Cholesky (CPU+GPU) での性能を元に以下の超巨大 SDP を解くと仮定して、性能見積りを行ってみる。詳細については以前の記事を参照。

◯予想：ノード数 420, Cholesky 分解(CPU + GPU) 611GFlops / ノード
420 * 611 * 0.9 = 230958 GFLOPS なので、6.0291e+17 / 2.30958e+14 * 40 = 約 104,420 秒 = 約 29 時間

第１回 Graph CREST シンポジウム：情報更新３

2012年01月20日 01時43分49秒 | Weblog

第１回 Graph CREST シンポジウムに関する情報の更新です。２４年度も年２回（１０月と３月）の開催を予定しています。

開催日：2012年3月23日（金）9:45 ～ 18:00
場所：中央大学後楽園キャンパス６号館３階６３０１室
参加費：無料
主催：JST CREST ポストペタスケールシステムにおける超大規模グラフ最適化基盤

9:45 - 10:00 Katsuki Fujisawa
Introduction :
CREST project and Workshops on Computational Aspects for Solving Large Scale Optimization Problems
10:00 - 10:30 Toyotaro Suzumura
Tentative titel:
Performance Evaluation of Graph500 on Large-Scale Distributed Environment
10:30 - 11:00 Hans Mittelmann
Tentative title: Benchmarks for Optimization Software
11:00 - 11:15 Break
11:15 - 11:45 TOH Kim Chuan
Computational experience in solving large scale semidefinite programmin
11:45 - 12:15 Yasuaki Matsukawa
A Primal Barrier Function Phase I Algorithm for Nonsymmetric Conic Optimization Problems
12:15 - 13:30 Lunch
13:30 - 15:00 Tutorial Session 1
- Timo Berthold
What is Linear and Mixed Integer Programming(LP/MIP)
- Ambros Gleixner, Stefan Heinz
How to model and solve LPs and MIPs (SCIP, ZIMPL)
- Stefan Heinz
Tentative title: Column Generation using SCIP
15:30 - 15:45 Break
15:45 - 16:45 Tutorial Session 2
- Ambros Gleixner
Improving the Accuracy of LP Solvers
- Timo Berthold
Primal Heuristics for MIP
16:45 - 17:00 Break
17:00 - 17:30 Domenico Salvagnin
Three ideas for the Quadratic Assignment Problem
17:30 - 18:00 Shunji Umetani
A heuristic algorithm for the set multicover problem with
　　　　　　　generalized upper bound constraints

CREST研究： SDPとMIP

2012年01月19日 02時36分35秒 | Weblog

JST CREST : ポストペタスケールシステムにおける超大規模グラフ最適化基盤の来年度の重点研究対象について

-------------------------------------------------------------------------------------------------------------------------------------------------------------------
国際的な研究の潮流、適用範囲の広さや過去の研究実績を考慮して、以下の二つの最適化問題を対象とする。これらは現在最適化分野で最も注目を集めている数理計画問題である。

１：半正定値計画問題(SDP)
組合せ最適化, システムと制御, データ科学, 金融工学, 量子化学など非常に幅広い応用を持つ。また今後のエネルギー供給計画（スマートグリッド等）では非線形の複雑な最適化問題を扱う必要があり、これらの問題に対して強力な緩和値を算出できる SDP の高速計算技術の確立が急務とされている。
２：混合整数計画問題(MIP)
整数条件を扱うことができるため数理モデルの記述能力は非常に高く、最適化問題の中では最も多くの分野に適用されている。ただし問題の複雑度が変数の数に指数的に比例して増大するため、超大規模な並列性と効率的な探索能力を持ったソフトウェアの開発が急務とされている。

TSUBAME 2.0 と SDPA, SDPARA　その４

2012年01月18日 01時55分31秒 | Weblog

TSUBAME上で SDPARA を作成する場合では、OpenMPI と MVAPICH の両者どちらでも用いることができる。以下のように環境変数を設定することで両者の MPI を切り替えることができる。性能的にはともかく、超大規模問題を解くときには MVAPICH を用いないと大変なことになる。

OpenMPI1.4.2+intel
export PATH=/usr/apps/openmpi/1.4.2/intel/bin:$PATH
export LD_LIBRARY_PATH=/usr/apps/openmpi/1.4.2/intel/lib:$LD_LIBRARY_PATH

MVAPICH1.5.1+intel
export PATH=/usr/apps/mvapich2/1.5.1/intel/bin:$PATH
export LD_LIBRARY_PATH=/usr/apps/mvapich2/1.5.1/intel/lib:$LD_LIBRARY_PATH

TSUBAME 2.0 と SDPA, SDPARA　その３

2012年01月17日 02時22分34秒 | Weblog

そろそろ TSUBAME 2.0 に SDPARA のジョブを投入して超大規模問題の最適解を狙います。

◯QAPLIB (tai30a.dat-s) の DNN 緩和問題
mDIM = 379350
nBLOCK = 2
bLOCKsTRUCT = -485758 842

この問題では N = 379,350 なので、Cholesky 分解の FLOPS 値は = 1.8197e+16 となります。TSUBAME 2.0 300 ノード(CPU のみ)で Cholesky 分解の性能効率 40%, SDP は 40 反復必要と仮定しますと、300 * 140GF * 0.40 = 1.6800e+13 FLOPS ですので、1.8197e+16 / 1.6800e+13 * 40 = 43,326秒 = 約１２時間となります。
現在、開発中の SDPARA が完成するとこの４倍程度高速化されるはずですが、こちらはもう少し時間がかかりそうです。

◯東工大 TSUBAME 2.0
HP Proliant SL390s G7 1408台
HP Proliant SL390s G7
CPU: Intel Xeon 2.93GHz 6コア×2ソケット = 12コア(Hyperthreading時 = 24コア)
GPU: NVIDIA Tesla M2050 3GPU
Memory: 54GB (一部は96GB)
SSD: 120GB (一部は240GB)
ネットワーク: QDR InfiniBand x 2 = 80Gbps

◯１ノードあたりの性能（倍精度)
CPU 140GF(2.93GHz) + GPU 1545GF = 1685GF
CPU 153GF(3.2GHz : TB) + GPU 1545GF = 1698GF

アクセス
閲覧	538	PV
訪問者	267	IP
トータル
閲覧	5,022,363	PV
訪問者	1,275,615	IP

	ブログを読むだけ。毎月の訪問日数に応じてポイント進呈
	gooブロガーの今日のひとこと
	訪問者数に応じてdポイント最大1,000pt当たる！
	goo blogは20周年を迎えました！

最適化問題に対する超高速＆安定計算

大規模最適化問題、グラフ探索、機械学習やデジタルツインなどの研究のお話が中心

TSUBAME 2.0 と SDPA, SDPARA　その６

SDPA-DDとGPUとTSUBAME　その３

シンポジウム「数理最適化の40年と今後の展開」

TSUBAME 2.0 と SDPA, SDPARA　その５

研究室とサーバ室の分離

第１回 Graph CREST シンポジウム：情報更新５

第１回 Graph CREST シンポジウム：情報更新４

TSUBAME 2.0 と SDPA, SDPARA　その４

ScaLAPACK 2.0.1 と SDPARA

ScaLAPACK 2.0.1

TSUBAME 2.0 上での SDPARA：Cholesky

第１回 Graph CREST シンポジウム：情報更新３

CREST研究： SDPとMIP

TSUBAME 2.0 と SDPA, SDPARA　その４

TSUBAME 2.0 と SDPA, SDPARA　その３

カレンダー

Twitter

最新記事

検索

バックナンバー

ブックマーク

文字サイズ変更

アクセス状況

goo blog おすすめ

goo blog お知らせ

2012年1月
日	月	火	水	木	金	土
1	2	3	4	5	6	7
8	9	10	11	12	13	14
15	16	17	18	19	20	21
22	23	24	25	26	27	28
29	30	31

大規模最適化問題、グラフ探索、機械学習やデジタルツインなどの研究のお話が中心

カレンダー

Twitter

最新記事

検索

ログイン

バックナンバー

ブックマーク

文字サイズ変更

アクセス状況

goo blog おすすめ

goo blog お知らせ