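With only a single node and a small Scale value this is not much of a comparison, but as shown below there is a considerable difference in the TEPS values: the median TEPS on TSUBAME 2.0 is roughly 1.28 times that of the OPT cluster. Incidentally, both machines have exactly the same CPU. It depends on the network, but if this were extended to multiple nodes as is, how many GTEPS would it reach?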
median_TEPS: 3.10677601976136351e+09 : TSUBAME 2.0
median_TEPS: 2.43110363832451296e+09 : OPT cluster
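
To put a number on the gap, here is a minimal sketch that computes the ratio of the two median TEPS values. The values are copied from the two summary lines above; the dictionary and variable names are just illustrative.

```python
# Median TEPS values copied from the two summary lines in this post.
median_teps = {
    "TSUBAME 2.0": 3.10677601976136351e+09,
    "OPT cluster": 2.43110363832451296e+09,
}

# Ratio of the faster run to the slower run.
ratio = median_teps["TSUBAME 2.0"] / median_teps["OPT cluster"]

# Express both values in GTEPS (1 GTEPS = 1e9 TEPS) for readability.
for name, teps in median_teps.items():
    print(f"{name}: {teps / 1e9:.2f} GTEPS")
print(f"ratio (TSUBAME 2.0 / OPT cluster): {ratio:.2f}")
```

This only compares two single-node runs at Scale 22, so it says nothing about how either machine would behave across multiple nodes.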
◯ TSUBAME 2.0
----------------------------------------------------------------------
CPU name is Intel(R) Xeon(R) CPU X5670 @ 2.93GHz
freq / RAM is 2933.381 MHz / 53.17 GB
#cpu, #nodes, #cores is 24 2 12
COMPILER is GCC (GNU C Compiler) version 4.3.4
----------------------------------------------------------------------
scale, edgefactor is 22 16
energy_loop is disable
#threads, #NUMAs is 12 2
mpol_bind is ON(mmap with mbind(MPOL_BIND))
mem_interleave is OFF
switching parameter is 0.000350 (n ~= 1.468006e+03)
queue buffer size is 16384
----------------------------------------------------------------------
SCALE: 22
nvtx: 4194304
edgefactor: 16
terasize: 1.07374182399999998e-03
A: 5.69999999999999951e-01
B: 1.90000000000000002e-01
C: 1.90000000000000002e-01
D: 5.00000000000000444e-02
generation_time: 4.63479804992675781e+00
construction_time: 3.45396018028259277e+00
nbfs: 64
min_time: 2.03080177307128906e-02
firstquartile_time: 2.10850834846496582e-02
median_time: 2.16900110244750977e-02
thirdquartile_time: 2.34054327011108398e-02
max_time: 2.85382270812988281e-02
mean_time: 2.24080123007297516e-02
stddev_time: 1.87665045017972585e-03
min_nedge: 6.71081140000000000e+07
firstquartile_nedge: 6.71081140000000000e+07
median_nedge: 6.71081140000000000e+07
thirdquartile_nedge: 6.71081140000000000e+07
max_nedge: 6.71081140000000000e+07
mean_nedge: 6.71081140000000000e+07
stddev_nedge: 0.00000000000000000e+00
min_TEPS: 2.35151657490230417e+09
firstquartile_TEPS: 2.89759915151431465e+09
median_TEPS: 3.10677601976136351e+09
thirdquartile_TEPS: 3.18990973665092087e+09
max_TEPS: 3.30451326613275719e+09
harmonic_mean_TEPS: 2.99482672087851906e+09
harmonic_stddev_TEPS: 3.15995921852265522e+07
◯ OPT cluster
----------------------------------------------------------------------
Parallel Breadth-First Search for Graph500 Benchmark version 3.52
----------------------------------------------------------------------
CPU name is Intel(R) Xeon(R) CPU X5670 @ 2.93GHz
freq / RAM is 2926.092 MHz / 125.97 GB
#cpu, #nodes, #cores is 24 2 12
COMPILER is GCC (GNU C Compiler) version 4.4.6
----------------------------------------------------------------------
scale, edgefactor is 22 16
energy_loop is disable
#threads, #NUMAs is 12 2
mpol_bind is ON(mmap with mbind(MPOL_BIND))
mem_interleave is OFF
switching parameter is 0.000350 (n ~= 1.468006e+03)
queue buffer size is 16384
----------------------------------------------------------------------
SCALE: 22
nvtx: 4194304
edgefactor: 16
terasize: 1.07374182399999998e-03
A: 5.69999999999999951e-01
B: 1.90000000000000002e-01
C: 1.90000000000000002e-01
D: 5.00000000000000444e-02
generation_time: 4.92600226402282715e+00
construction_time: 4.21614789962768555e+00
nbfs: 64
min_time: 2.52830982208251953e-02
firstquartile_time: 2.69842743873596191e-02
median_time: 2.77304649353027344e-02
thirdquartile_time: 3.07412147521972656e-02
max_time: 3.64937782287597656e-02
mean_time: 2.86859124898910522e-02
stddev_time: 2.71162063548880793e-03
min_nedge: 6.71081140000000000e+07
firstquartile_nedge: 6.71081140000000000e+07
median_nedge: 6.71081140000000000e+07
thirdquartile_nedge: 6.71081140000000000e+07
max_nedge: 6.71081140000000000e+07
mean_nedge: 6.71081140000000000e+07
stddev_nedge: 0.00000000000000000e+00
min_TEPS: 1.83889192232537603e+09
firstquartile_TEPS: 2.21152116616008377e+09
median_TEPS: 2.43110363832451296e+09
thirdquartile_TEPS: 2.48950771186180830e+09
max_TEPS: 2.65426782010142851e+09
harmonic_mean_TEPS: 2.33941012068725300e+09
harmonic_stddev_TEPS: 2.78609775751486160e+07
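
As a sanity check on the two logs, the sketch below re-derives a few of the reported figures from the others. It assumes that nvtx is 2^SCALE, that terasize corresponds to 16 bytes per generated edge (this matches the logged value numerically, but it is an inference, not something the logs state), and that, because nedge is identical for all 64 searches, harmonic_mean_TEPS reduces to nedge / mean_time. All variable names are illustrative.

```python
import math

# Problem parameters as printed in both logs.
scale, edgefactor = 22, 16

nvtx = 2 ** scale                      # number of vertices
generated_edges = edgefactor * nvtx    # edges produced by the generator
# Assumption: 16 bytes per edge (two 64-bit endpoints), in terabytes.
terasize = generated_edges * 16 / 1e12

# Per-run figures copied from the logs above.
runs = {
    "TSUBAME 2.0": {
        "nedge": 6.71081140000000000e+07,
        "mean_time": 2.24080123007297516e-02,
        "harmonic_mean_TEPS": 2.99482672087851906e+09,
    },
    "OPT cluster": {
        "nedge": 6.71081140000000000e+07,
        "mean_time": 2.86859124898910522e-02,
        "harmonic_mean_TEPS": 2.33941012068725300e+09,
    },
}

print(f"nvtx = {nvtx}, terasize = {terasize:.9e}")
for name, r in runs.items():
    # With a constant nedge, the harmonic mean of nedge/time_i
    # equals nedge divided by the arithmetic mean of time_i.
    derived = r["nedge"] / r["mean_time"]
    agrees = math.isclose(derived, r["harmonic_mean_TEPS"], rel_tol=1e-6)
    print(f"{name}: derived {derived:.6e} TEPS, matches log: {agrees}")
```

Note that the same shortcut does not apply to median_TEPS: with 64 searches, the median of the TEPS values is not simply nedge / median_time.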