I ran Graph500 (the program intended for large-scale runs) on TSUBAME-KFC using 128 GPUs. For now, performance is around 9.5 GTEPS.
◯ Graph500 & Scale 28
median_TEPS: 9.52748e+09
============= Result ==============
SCALE: 28
edgefactor: 16
NBFS: 64
graph_generation: 15.6297609806
num_mpi_processes: 128
construction_time: 22.2543370724
redistribution_time: 2.67794203758
min_time: 0.36158
firstquartile_time: 0.404948
median_time: 0.450794
thirdquartile_time: 0.484442
max_time: 0.674164
mean_time: 0.452209
stddev_time: 0.0611282
min_nedge: 4294927670
firstquartile_nedge: 4294927670
median_nedge: 4294927670
thirdquartile_nedge: 4294927670
max_nedge: 4294927670
mean_nedge: 4294927670
stddev_nedge: 0
min_TEPS: 6.37075e+09
firstquartile_TEPS: 8.86572e+09
median_TEPS: 9.52748e+09
thirdquartile_TEPS: 1.06061e+10
max_TEPS: 1.18782e+10
harmonic_mean_TEPS: 9.49766e+09
harmonic_stddev_TEPS: 1.61751e+08
min_validate: 4.47244
firstquartile_validate: 4.82371
median_validate: 4.94657
thirdquartile_validate: 5.06717
max_validate: 5.94372
mean_validate: 4.96742
stddev_validate: 0.250875
TSUBAME-KFC - LX 1U-4GPU/104Re-1G Cluster, Intel Xeon E5-2620v2 6C 2.100GHz, Infiniband FDR, NVIDIA K20x
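As a sanity check on the log above, the reported TEPS figures can be recomputed from the edge count and the BFS times, since TEPS is traversed edges divided by BFS time. The following is a minimal sketch, not part of the benchmark itself: `parse_result` and `SAMPLE` are hypothetical names introduced here, and the per-process figure assumes that each of the 128 MPI processes drives one GPU, as the description of the run suggests.

```python
def parse_result(text):
    """Parse 'key: value' lines into a dict of floats; skip everything else."""
    stats = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # banner or hardware lines have no colon
        try:
            stats[key.strip()] = float(value)
        except ValueError:
            continue  # value was not numeric
    return stats


# A subset of the fields copied from the result log above.
SAMPLE = """\
============= Result ==============
SCALE: 28
edgefactor: 16
num_mpi_processes: 128
min_time: 0.36158
median_time: 0.450794
max_time: 0.674164
median_nedge: 4294927670
"""

stats = parse_result(SAMPLE)
nedge = stats["median_nedge"]

# nedge is identical for every BFS in this run, so the fastest BFS
# gives the highest TEPS and the slowest BFS gives the lowest.
median_teps = nedge / stats["median_time"]
max_teps = nedge / stats["min_time"]
min_teps = nedge / stats["max_time"]

# Assumption: one MPI process per GPU.
per_process_teps = median_teps / stats["num_mpi_processes"]

# Number of edges the generator produces: edgefactor * 2^SCALE.
generated_edges = int(stats["edgefactor"]) * 2 ** int(stats["SCALE"])

print(f"median TEPS      : {median_teps:.5e}")
print(f"max TEPS         : {max_teps:.5e}")
print(f"min TEPS         : {min_teps:.5e}")
print(f"TEPS per process : {per_process_teps:.5e}")
print(f"generated edges  : {generated_edges}")
```

The recomputed median, maximum and minimum agree with `median_TEPS`, `max_TEPS` and `min_TEPS` in the log to the printed precision. The generated edge count (16 × 2^28) comes out slightly above the reported `nedge` of 4294927670.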