$ ./testing_dgemm
device 0: Tesla C2050, 1147.0 MHz clock, 2687.2 MB memory
Usage:
testing_dgemm [-NN|NT|TN|TT] [-N 1024]
Testing transA = N transB = N
     M      N      K   MAGMA GFLop/s   CUBLAS GFlop/s   error
==================================================================
  1024   1024   1024          169.45           281.16   0.000000e+00
  1280   1280   1280          171.90           291.17   0.000000e+00
  1600   1600   1600          173.20           295.32   0.000000e+00
  2000   2000   2000          160.28           284.45   0.000000e+00
  2500   2500   2500          161.95           288.32   0.000000e+00
  3125   3125   3125          165.44           272.65   0.000000e+00
  3906   3906   3906          163.50           294.35   0.000000e+00
  4882   4882   4882          164.37           291.75   0.000000e+00
  6102   6102   6102          161.18           293.33   0.000000e+00
  7627   7627   7627          165.16           298.62   0.000000e+00
  9533   9533   9533          165.76           292.16   0.000000e+00
Impressive.
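To read the GFlop/s columns, it helps to remember that a DGEMM with dimensions M, N, K performs roughly 2*M*N*K floating-point operations. The sketch below is not part of testing_dgemm; it assumes that conventional flop count (the tester's exact formula is not shown in the output above), and the helper names are made up for illustration. It converts two rows of the table back into implied run times and a CUBLAS-to-MAGMA ratio.

```python
# Illustrative sketch only: helper names are hypothetical, and the
# 2*M*N*K flop count is the conventional DGEMM estimate, assumed here.

def dgemm_flops(m, n, k):
    """Approximate floating-point operations for C = alpha*A*B + beta*C."""
    return 2.0 * m * n * k

def implied_seconds(m, n, k, gflops):
    """Run time implied by a reported GFlop/s figure."""
    return dgemm_flops(m, n, k) / (gflops * 1e9)

# (size, MAGMA GFlop/s, CUBLAS GFlop/s), copied from the first and last rows.
rows = [
    (1024, 169.45, 281.16),
    (9533, 165.76, 292.16),
]

for size, magma, cublas in rows:
    t_magma = implied_seconds(size, size, size, magma)
    t_cublas = implied_seconds(size, size, size, cublas)
    print(f"N={size}: MAGMA ~{t_magma:.4f} s, CUBLAS ~{t_cublas:.4f} s, "
          f"CUBLAS/MAGMA rate ratio {cublas / magma:.2f}")
```

The ratio is simply the quotient of the two reported rates, so it does not depend on the flop-count assumption; only the implied times do.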