What is the difference between CuDNNLSTM and LSTM in Keras?

失恋的感觉 2021-01-30 12:57

In Keras, the high-level deep learning library, there are multiple types of recurrent layers; these include LSTM (Long Short-Term Memory) and CuD

3 Answers
  • 2021-01-30 13:39

    GPUs are good at massively parallel computation. Most linear algebra operations can be parallelized to improve performance: operations such as the large matrix multiplications used in gradient descent can be split up and executed in parallel with GPU support. CUDA (Compute Unified Device Architecture) provides an interface that lets these operations take advantage of GPU parallelism. cuDNN is a library built on CUDA that implements GPU kernels for the large matrix operations used in deep learning.

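    Here, CuDNNLSTM is built on those cuDNN kernels and cannot run without a CUDA-capable GPU. The plain LSTM layer is the generic implementation: it runs on a normal CPU, and it can also run on a GPU, though without the cuDNN-optimized kernel. CuDNNLSTM's faster execution time comes from that GPU parallelism.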
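As a rough sketch of that distinction, the helper below picks the layer class depending on whether a GPU is visible. It assumes standalone Keras 2.x on the TensorFlow 1.x backend, where `keras.layers.CuDNNLSTM` exists. The names `pick_lstm_class_name` and `build_model` and all layer sizes are made up for illustration. The Keras imports are deferred into `build_model` so the selection logic can be loaded on a machine without Keras installed.

```python
def pick_lstm_class_name(gpu_available):
    """Return the name of the Keras layer class to use (hypothetical helper)."""
    # CuDNNLSTM only has a GPU kernel; LSTM is the portable fallback.
    return "CuDNNLSTM" if gpu_available else "LSTM"


def build_model(timesteps, features, units=64):
    """Build a tiny sequence classifier; all sizes are illustrative only."""
    # Deferred imports: assumes Keras 2.x with the TensorFlow 1.x backend.
    import tensorflow as tf
    from keras import layers
    from keras.models import Sequential

    gpu_available = tf.test.is_gpu_available()
    lstm_cls = getattr(layers, pick_lstm_class_name(gpu_available))

    model = Sequential()
    model.add(lstm_cls(units, input_shape=(timesteps, features)))
    model.add(layers.Dense(1, activation="sigmoid"))
    model.compile(loss="binary_crossentropy", optimizer="adam")
    return model
```

In TensorFlow 2.x, `tf.keras.layers.LSTM` selects the cuDNN kernel on its own when a GPU is present and the layer keeps its default activations with no recurrent dropout, so a manual switch like this mostly matters for older code.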

  • 2021-01-30 13:47

    Why don't you try it out for yourself and see? In my case, training a model with LSTM took 10 minutes 30 seconds. Simply switching the call from LSTM() to CuDNNLSTM() brought that down to less than a minute.

    I also noticed that switching to CuDNNLSTM() speeds up model.evaluate() and model.predict() substantially as well.
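One way to try this yourself is to time the same call on two models that differ only in the recurrent layer. The snippet below is a minimal sketch, not the answerer's code: `timed` is a hypothetical helper, and `lstm_model`, `cudnn_model`, `x_train` and `y_train` in the commented usage are placeholders for your own compiled models and data.

```python
import time


def timed(fn, *args, **kwargs):
    """Run fn(*args, **kwargs) and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start


# Hypothetical usage with two already-compiled Keras models that differ
# only in the recurrent layer (LSTM vs. CuDNNLSTM):
#
#   _, lstm_secs = timed(lstm_model.fit, x_train, y_train, epochs=1)
#   _, cudnn_secs = timed(cudnn_model.fit, x_train, y_train, epochs=1)
#   print("speed-up: %.1fx" % (lstm_secs / cudnn_secs))
```

The same wrapper works for `model.evaluate` and `model.predict`, since it only measures the wall-clock time of whatever callable it is given.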

  • 2021-01-30 13:56

    TL;DR: The difference is a 15x speedup in model training time!

    Setup Steps

    Dependencies

    Performance benchmark: comparison across several test machines.
    One epoch of training on 612235 samples.

    keras.layers.LSTM

    Intel i5-4690, CPU only:
    612235/612235 [==============================] - 3755s 6ms/step - loss: 2.7339 - acc: 0.5067 - val_loss: 2.1149 - val_acc: 0.6175

    GTX 950 & Intel i5-4690:
    612235/612235 [==============================] - 1417s 2ms/step - loss: 2.7007 - acc: 0.5137 - val_loss: 2.0983 - val_acc: 0.6199

    2.5x gain with a GPU.

    GTX 970 & Intel i5-4690:
    612235/612235 [==============================] - 1322s 2ms/step - loss: 1.9214 - acc: 0.6442 - val_loss: 1.8808 - val_acc: 0.6461

    Negligible gain from a more powerful GPU.

    RTX 2070 & Intel i7-9700K:
    612235/612235 [==============================] - 1012s 2ms/step - loss: 2.7268 - acc: 0.5111 - val_loss: 2.1162 - val_acc: 0.6234

    Very minimal gain even with awesome HW upgrades!!!

    keras.layers.CuDNNLSTM

    RTX 2070 & Intel i7-9700K:
    612235/612235 [==============================] - 69s 112us/step - loss: 1.9139 - acc: 0.6437 - val_loss: 1.8668 - val_acc: 0.6469

    54x gain over CPU!
    15x gain over the traditional (non-CUDA) LSTM implementation!
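The quoted gains can be checked from the per-epoch times in the progress lines above. This is only a small arithmetic sketch: the dictionary keys are shortened labels of my own, and `speedup` is a hypothetical helper.

```python
# Wall-clock seconds for one epoch, copied from the progress lines above.
epoch_seconds = {
    "LSTM, CPU only": 3755,
    "LSTM, GTX 950": 1417,
    "LSTM, GTX 970": 1322,
    "LSTM, RTX 2070": 1012,
    "CuDNNLSTM, RTX 2070": 69,
}


def speedup(baseline, candidate):
    """Return how many times faster `candidate` is than `baseline`."""
    return epoch_seconds[baseline] / epoch_seconds[candidate]


print(round(speedup("LSTM, CPU only", "LSTM, GTX 950"), 1))         # 2.6
print(round(speedup("LSTM, CPU only", "CuDNNLSTM, RTX 2070"), 1))   # 54.4
print(round(speedup("LSTM, RTX 2070", "CuDNNLSTM, RTX 2070"), 1))   # 14.7
```

Note that the 15x figure compares LSTM and CuDNNLSTM on the same RTX 2070 machine, so it isolates the layer implementation from the hardware.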
