I tried installing cuDNN 7.2.
What’s New in cuDNN 7.2?
Deep learning frameworks using cuDNN 7 and later can leverage new features and performance of the Volta architecture to deliver up to 6x faster training performance compared to Pascal GPUs. cuDNN 7.2 highlights include:
TensorCore acceleration with FP32 inputs and outputs (previously restricted to FP16 input)
RNN cells now support more use cases with options for cell clipping and padding masks
Automatically select the best RNN implementation with RNN search API
cuDNN Release Notes v7.2.1
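
To make the "cell clipping and padding masks" item concrete, here is a small conceptual sketch in plain Python. It is not the cuDNN API and does not show how cuDNN implements these options. The function names (`clip_cell_state`, `masked_step`, `run_sequence`) and the toy recurrence are hypothetical and chosen only for illustration. The idea it assumes is that clipping clamps the cell state to a fixed range, and a padding mask keeps the state unchanged at time steps beyond the real sequence length.

```python
def clip_cell_state(cell, lo, hi):
    """Clamp each element of a cell state vector into the range [lo, hi]."""
    return [max(lo, min(hi, c)) for c in cell]


def masked_step(prev_state, new_state, is_valid):
    """Take the new state at real time steps, keep the old one at padded steps."""
    return new_state if is_valid else prev_state


def run_sequence(inputs, length, lo=-1.0, hi=1.0):
    """Toy recurrence c_t = c_{t-1} + x_t with clipping and a padding mask.

    `inputs` is a padded list of scalars; only the first `length` entries
    are real data. This stands in for an RNN cell, it is not an LSTM.
    """
    cell = [0.0]
    for t, x in enumerate(inputs):
        candidate = clip_cell_state([cell[0] + x], lo, hi)
        cell = masked_step(cell, candidate, t < length)
    return cell


if __name__ == "__main__":
    # The third input is padding, so it should not affect the final state.
    print(run_sequence([0.6, 0.7, 5.0], length=2))
```

In this sketch the second step would push the state to 1.3, which clipping limits to 1.0, and the padded third step leaves that value untouched.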