Blog posts for February 28, 2018 - Ultra-fast & stable computation for optimization problems

CUDA 9.1 Patch 1 (Released Jan 25, 2018), Part 2

2018-02-28 00:31:19 | Weblog

CUDA 9.1 Patch 1 (Released Jan 25, 2018)

Patch 1 (Released Jan 25, 2018) Download (112.9 MB)
cuBLAS Patch Update: This update to CUDA 9.1 includes new GEMM kernels optimized for the Volta architecture and improved heuristics to select GEMM kernels for given input sizes.

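Continuing from the previous post: matrixMulCUBLAS, for its part, got considerably faster with the patch, going from 3544.62 to 7441.86 GFlop/s (about 2.1x) on the same problem size.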

Before applying the patch
# ./matrixMulCUBLAS
[Matrix Multiply CUBLAS] - Starting...
GPU Device 0: "Tesla V100-PCIE-16GB" with compute capability 7.0

GPU Device 0: "Tesla V100-PCIE-16GB" with compute capability 7.0

MatrixA(640,480), MatrixB(480,320), MatrixC(640,320)
Computing result using CUBLAS...done.
Performance= 3544.62 GFlop/s, Time= 0.055 msec, Size= 196608000 Ops
Computing result using host CPU...done.
Comparing CUBLAS Matrix Multiply with CPU results: PASS
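
The Size and Performance fields in this output can be cross-checked by hand. The following is a minimal sketch, assuming the sample counts 2*M*N*K floating-point operations for one matrix product (which matches the reported 196608000 Ops) and that GFlop/s is simply operations divided by elapsed time. The function names and the mapping of 640, 320, 480 to M, N, K are illustrative assumptions, not taken from the sample's source.

```python
def gemm_flops(m, n, k):
    """Operation count for C(m x n) = A(m x k) * B(k x n):
    one multiply and one add per inner-product term (assumed convention)."""
    return 2 * m * n * k


def gflops(ops, time_msec):
    """Convert an operation count and a time in milliseconds to GFlop/s."""
    return ops * 1e-9 / (time_msec / 1000.0)


# Dimensions from the log: MatrixA(640,480), MatrixB(480,320), MatrixC(640,320)
ops = gemm_flops(640, 320, 480)
print(ops)  # 196608000, the "Size" field in the log

# Using the printed (rounded) time of 0.055 msec
print(gflops(ops, 0.055))
```

With the printed time of 0.055 msec this gives roughly 3575 GFlop/s rather than exactly 3544.62. The small gap is plausibly because the log shows the time rounded to three decimals while the sample computes the rate from the unrounded value.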

After applying the patch
# ./matrixMulCUBLAS
[Matrix Multiply CUBLAS] - Starting...
GPU Device 0: "Tesla V100-PCIE-16GB" with compute capability 7.0

GPU Device 0: "Tesla V100-PCIE-16GB" with compute capability 7.0

MatrixA(640,480), MatrixB(480,320), MatrixC(640,320)
Computing result using CUBLAS...done.
Performance= 7441.86 GFlop/s, Time= 0.026 msec, Size= 196608000 Ops
Computing result using host CPU...done.
Comparing CUBLAS Matrix Multiply with CPU results: PASS
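
To put a number on the improvement, the two Performance lines can be compared directly. This is a small sketch that uses only the values printed in the two runs. The helper `parse_gflops` is hypothetical and written for this log format only.

```python
import re


def parse_gflops(line):
    """Extract the GFlop/s value from a 'Performance= ...' line of the sample output."""
    match = re.search(r"Performance=\s*([\d.]+)\s*GFlop/s", line)
    if match is None:
        raise ValueError("no Performance field found")
    return float(match.group(1))


before = parse_gflops("Performance= 3544.62 GFlop/s, Time= 0.055 msec, Size= 196608000 Ops")
after = parse_gflops("Performance= 7441.86 GFlop/s, Time= 0.026 msec, Size= 196608000 Ops")

speedup = after / before
print(f"speedup: {speedup:.2f}x")  # speedup: 2.10x
```

This is a single run of one small problem (640x480 times 480x320), so the ratio describes this case only. Other matrix sizes may pick different GEMM kernels and see a different gain.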
