Ultra-Fast & Stable Computation for Optimization Problems

Mostly about research on large-scale optimization problems, graph search, machine learning, digital twins, and related topics

The Vanished GPU, Part 3

2016-09-22 00:31:18 | Weblog
As suspected, GPU 1 (Tesla K40m, 43:00.0) appears to be faulty.

[----------] 12 tests from NesterovSolverTest/3, where TypeParam = caffe::GPUDevice<double>
[ RUN ] NesterovSolverTest/3.TestNesterovLeastSquaresUpdateWithWeightDecay
F0922 00:20:58.611543 20598 common.cpp:162] Check failed: status == CUBLAS_STATUS_SUCCESS (1 vs. 0) CUBLAS_STATUS_NOT_INITIALIZED
*** Check failure stack trace: ***
@ 0x7f8f2d790e6d (unknown)
@ 0x7f8f2d792ced (unknown)
@ 0x7f8f2d790a5c (unknown)
@ 0x7f8f2d79363e (unknown)
@ 0x7f8f245bf8d3 caffe::Caffe::SetDevice()
@ 0x7f8f2455d98a caffe::P2PSync<>::InternalThreadEntry()
@ 0x7f8f245a57e0 caffe::InternalThread::entry()
@ 0x7f8f2b29824a (unknown)
@ 0x7f8f23c8adc5 start_thread
@ 0x7f8f239b7ced __clone
make: *** [runtest] Aborted (core dumped)
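The trace above fails inside caffe::Caffe::SetDevice() with CUBLAS_STATUS_NOT_INITIALIZED (status 1). One way to narrow this down to a single board, independent of Caffe, is to try creating a cuBLAS handle on each device in turn. The following is only a minimal sketch: the function names `probe_gpus` and `load_and_probe` are made up for illustration, and the library file names `libcudart.so` and `libcublas.so` are assumptions about the local CUDA installation.

```python
import ctypes


def probe_gpus(cudart, cublas):
    """Try to create a cuBLAS handle on every visible CUDA device.

    cudart and cublas are ctypes library objects for the CUDA runtime and
    cuBLAS. Returns a list of (device, cuda_status, cublas_status) tuples,
    where 0 means success and None means the step was skipped.
    """
    count = ctypes.c_int(0)
    status = cudart.cudaGetDeviceCount(ctypes.pointer(count))
    if status != 0:
        raise RuntimeError("cudaGetDeviceCount failed with status %d" % status)

    results = []
    for dev in range(count.value):
        cuda_status = cudart.cudaSetDevice(dev)
        cublas_status = None
        if cuda_status == 0:
            # cublasHandle_t is an opaque pointer, so c_void_p is enough here.
            handle = ctypes.c_void_p()
            cublas_status = cublas.cublasCreate_v2(ctypes.pointer(handle))
            if cublas_status == 0:
                cublas.cublasDestroy_v2(handle)
        results.append((dev, cuda_status, cublas_status))
    return results


def load_and_probe():
    """Load the real libraries and probe every device.

    Assumption: both shared libraries are on the loader path under these
    names; adjust them for the local CUDA installation.
    """
    cudart = ctypes.CDLL("libcudart.so")
    cublas = ctypes.CDLL("libcublas.so")
    return probe_gpus(cudart, cublas)
```

Calling `load_and_probe()` on the machine returns one tuple per device. A device whose third field is 1 failed with the same CUBLAS_STATUS_NOT_INITIALIZED status seen in the log. Device numbering here follows the CUDA runtime and may not match the nvidia-smi index.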


$ nvidia-smi
Thu Sep 22 00:05:41 2016
+------------------------------------------------------+
| NVIDIA-SMI 352.39     Driver Version: 352.39         |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Tesla K40m          Off  | 0000:42:00.0     Off |                    0 |
| N/A   53C    P0   143W / 235W |   4344MiB / 11519MiB |     99%      Default |
+-------------------------------+----------------------+----------------------+
|   1  Tesla K40m          Off  | 0000:43:00.0     Off |                    0 |
| N/A   26C    P8    19W / 235W |     22MiB / 11519MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   2  Tesla K40m          Off  | 0000:81:00.0     Off |                    0 |
| N/A   52C    P0   136W / 235W |   4220MiB / 11519MiB |     99%      Default |
+-------------------------------+----------------------+----------------------+
|   3  Tesla K40m          Off  | 0000:82:00.0     Off |                    0 |
| N/A   29C    P8    20W / 235W |     22MiB / 11519MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID  Type  Process name                               Usage      |
|=============================================================================|
|    0     47114    C   python2                                        122MiB |
|    0     54369    C   ...tmp/akirat/caffe.new2_5/build/tools/caffe  4193MiB |
|    2     53199    C   ...tmp/akirat/caffe.new2_5/build/tools/caffe  4193MiB |
+-----------------------------------------------------------------------------+
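For repeated checks, the same information can be read in machine-friendly form through nvidia-smi's `--query-gpu` option with CSV output. The sketch below is an illustration only: the helper names (`parse_gpu_csv`, `idle_gpus`, `query_gpus`) are hypothetical, and it assumes every queried field comes back as a plain number, which is not guaranteed on every board or driver.

```python
import csv
import io
import subprocess

# Fields requested from nvidia-smi, in the order the parser expects them.
QUERY_FIELDS = ["index", "pci.bus_id", "utilization.gpu", "memory.used"]


def parse_gpu_csv(text):
    """Parse `--format=csv,noheader,nounits` output into a list of dicts."""
    rows = []
    for rec in csv.reader(io.StringIO(text), skipinitialspace=True):
        if not rec:
            continue  # skip blank lines
        index, bus_id, util, mem = [field.strip() for field in rec]
        rows.append({
            "index": int(index),
            "bus_id": bus_id,
            "util_percent": int(util),
            "memory_used_mib": int(mem),
        })
    return rows


def idle_gpus(rows, max_util=0):
    """Return the indices of GPUs whose utilization is at most max_util."""
    return [row["index"] for row in rows if row["util_percent"] <= max_util]


def query_gpus():
    """Run nvidia-smi and parse its output (needs nvidia-smi on PATH)."""
    out = subprocess.check_output([
        "nvidia-smi",
        "--query-gpu=" + ",".join(QUERY_FIELDS),
        "--format=csv,noheader,nounits",
    ])
    return parse_gpu_csv(out.decode())
```

With the values from the table above, `idle_gpus` would report GPUs 1 and 3. Idleness alone does not single out the faulty board, since GPU 3 is idle as well, so this only narrows down which devices are worth probing further.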