The 4th Graph500 List
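For this list, our CREST team developed and benchmarked Graph500 software on two machines: the Tokyo Tech supercomputer TSUBAME2.0 (through the TSUBAME Grand Challenge program, FY2012 spring term) and the University of Tokyo supercomputer FX10 (through the FY2012 FX10 Supercomputer System Large-scale HPC Challenge). Detailed results are listed below. The Graph500 program run at Tokyo Tech is the world's first GPU implementation for large-scale graphs.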
FX10
CPU-only implementation: Scale 38, TEPS 358.18 GE/s
TSUBAME2.0
1: CPU-only implementation: Scale 37, TEPS 202.68 GE/s
2: CPU + GPU: Scale 35, TEPS 317.09 GE/s
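To give a sense of what the Scale values mean: in the Graph500 benchmark a problem of scale S has 2^S vertices and edgefactor × 2^S generated edges. The following is a minimal Python sketch of that arithmetic, assuming the edgefactor of 16 reported in the logs in this post; the function name `graph500_size` is illustrative and not part of any benchmark code.

```python
# Problem size implied by a Graph500 SCALE value.
# Assumption: edgefactor = 16, the value reported in the logs in this post.

def graph500_size(scale, edgefactor=16):
    """Return (number of vertices, number of generated edges)."""
    vertices = 2 ** scale
    edges = edgefactor * vertices
    return vertices, edges

runs = [("FX10", 38), ("TSUBAME2.0 CPU", 37), ("TSUBAME2.0 CPU + GPU", 35)]
for name, scale in runs:
    vertices, edges = graph500_size(scale)
    print(f"{name}: SCALE {scale}, {vertices:.4e} vertices, {edges:.4e} edges")
```

The nedge values reported in the logs are slightly below these generated-edge counts (for example 4.3965e+12 against 4.3980e+12 at Scale 38).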
○ University of Tokyo FX10 measurement results
SCALE: 38
edgefactor: 16
NBFS: 64
graph_generation: 675.504
num_mpi_processes: 4096
construction_time: 1528.97
redistribution_time: 898.932
min_time: 11.5864
firstquartile_time: 11.9092
median_time: 12.2747
thirdquartile_time: 13.0804
max_time: 15.4257
mean_time: 12.5634
stddev_time: 0.851167
min_nedge: 4.3965448591e+12
firstquartile_nedge: 4.3965448591e+12
median_nedge: 4.3965448591e+12
thirdquartile_nedge: 4.3965448591e+12
max_nedge: 4.3965448591e+12
mean_nedge: 4.3965448591e+12
stddev_nedge: 0
min_TEPS: 2.85014e+11
firstquartile_TEPS: 3.36118e+11
median_TEPS: 3.58178e+11
thirdquartile_TEPS: 3.69171e+11
max_TEPS: 3.79459e+11
harmonic_mean_TEPS: 3.49947e+11
harmonic_stddev_TEPS: 2.98702e+09
min_validate: 318.125
firstquartile_validate: 636.232
median_validate: 647.2
thirdquartile_validate: 651.027
max_validate: 666.719
mean_validate: 627.201
stddev_validate: 57.6309
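Because every search in this run traverses the same number of edges (stddev_nedge is 0), the harmonic mean of the per-search TEPS values reduces to the edge count divided by the mean search time. The sketch below checks this against the FX10 log above. The function name is illustrative, and small differences are expected because the log prints rounded values.

```python
def harmonic_mean_teps(nedge, mean_time):
    """Harmonic mean of nedge / t_i when nedge is identical for all searches.

    For k searches: k / sum(t_i / nedge) = nedge / mean(t_i).
    """
    return nedge / mean_time

# Values copied from the FX10 log above.
mean_nedge = 4.3965448591e12
mean_time = 12.5634
reported = 3.49947e11  # harmonic_mean_TEPS

computed = harmonic_mean_teps(mean_nedge, mean_time)
rel_err = abs(computed - reported) / reported
print(f"computed {computed:.5e} TEPS, reported {reported:.5e} TEPS")
print(f"relative difference {rel_err:.1e}")
```

The same check can be applied to the two TSUBAME2.0 logs by substituting their mean_nedge and mean_time values.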
○ Tokyo Tech TSUBAME2.0 measurement results (CPU only)
SCALE: 37
edgefactor: 16
NBFS: 64
graph_generation: 129.33
num_mpi_processes: 4096
construction_time: 1213.19
redistribution_time: 358.358
min_time: 10.3144
firstquartile_time: 10.6738
median_time: 10.8494
thirdquartile_time: 11.3397
max_time: 21.2024
mean_time: 11.3639
stddev_time: 1.71612
min_nedge: 2.1990108574e+12
firstquartile_nedge: 2.1990108574e+12
median_nedge: 2.1990108574e+12
thirdquartile_nedge: 2.1990108574e+12
max_nedge: 2.1990108574e+12
mean_nedge: 2.1990108574e+12
stddev_nedge: 0
min_TEPS: 1.03715e+11
firstquartile_TEPS: 1.93922e+11
median_TEPS: 2.02684e+11
thirdquartile_TEPS: 2.0602e+11
max_TEPS: 2.13198e+11
harmonic_mean_TEPS: 1.93508e+11
harmonic_stddev_TEPS: 3.68169e+09
min_validate: 181.293
firstquartile_validate: 218.825
median_validate: 243.493
thirdquartile_validate: 270.473
max_validate: 337.786
mean_validate: 247.784
stddev_validate: 37.0499
○ Tokyo Tech TSUBAME2.0 measurement results (CPU + GPU)
SCALE: 35
edgefactor: 16
NBFS: 64
graph_generation: 31.6246
num_mpi_processes: 4096
construction_time: 536.357
redistribution_time: 221.01
min_time: 1.64677
firstquartile_time: 1.687
median_time: 1.73373
thirdquartile_time: 1.83146
max_time: 7.33673
mean_time: 1.84386
stddev_time: 0.702187
min_nedge: 5.4975230329e+11
firstquartile_nedge: 5.4975230329e+11
median_nedge: 5.4975230329e+11
thirdquartile_nedge: 5.4975230329e+11
max_nedge: 5.4975230329e+11
mean_nedge: 5.4975230329e+11
stddev_nedge: 0
min_TEPS: 7.49315e+10
firstquartile_TEPS: 3.00172e+11
median_TEPS: 3.17092e+11
thirdquartile_TEPS: 3.25875e+11
max_TEPS: 3.33836e+11
harmonic_mean_TEPS: 2.98153e+11
harmonic_stddev_TEPS: 1.43052e+10
min_validate: 31.4446
firstquartile_validate: 51.1974
median_validate: 61.0327
thirdquartile_validate: 74.4082
max_validate: 116.079
mean_validate: 62.5817
stddev_validate: 17.7331
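The logs above all use the same `key: value` layout, so they are easy to process with a script. The following is a minimal Python sketch, not the tooling actually used for these runs: `parse_graph500_output` is a hypothetical helper that reads such lines into a dictionary, and the sample is an excerpt of the CPU + GPU log.

```python
def parse_graph500_output(text):
    """Parse 'key: value' lines into a dict of floats, skipping other lines."""
    results = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # no colon on this line
        try:
            results[key.strip()] = float(value)
        except ValueError:
            continue  # value is not numeric
    return results

# Excerpt of the TSUBAME2.0 CPU + GPU log above.
sample = """\
SCALE: 35
edgefactor: 16
NBFS: 64
median_time: 1.73373
median_nedge: 5.4975230329e+11
median_TEPS: 3.17092e+11
"""

run = parse_graph500_output(sample)
# 1 GE/s = 1e9 traversed edges per second.
print(f"SCALE {int(run['SCALE'])}: median {run['median_TEPS'] / 1e9:.2f} GE/s")
# prints "SCALE 35: median 317.09 GE/s"
```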