Recently, I tried to use TensorFlow to implement a CNN+CTC network based on the article "Towards End-to-End Speech Recognition with Deep Convolutional Neural Networks".
The fully connected layer should be applied per time step, with the same weights at every step. It's like applying the same dense layer at each time step of a recurrent neural network. For the output of a convolution layer, the time step is the width axis.
So, for example, the output shape would be:
This is the shape that ctc_loss in TensorFlow expects.
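To make the shapes concrete, here is a minimal NumPy sketch of the idea, not the paper's or the asker's actual model. The sizes (batch, height, width, channels, num_classes) and the names W and b are made up for illustration, and the convolution output is just random data. It assumes the convolution output is laid out as [batch, height, width, channels] and that the time-major layout [max_time, batch_size, num_classes] is wanted, which as far as I know is the default for tf.nn.ctc_loss.

```python
import numpy as np

# Hypothetical sizes, chosen only for illustration.
batch, height, width, channels = 2, 4, 10, 8
num_classes = 5  # label classes plus one CTC blank

rng = np.random.default_rng(0)

# Stand-in for the last convolution layer's output: [batch, height, width, channels].
conv_out = rng.standard_normal((batch, height, width, channels))

# Treat width as the time axis: move it next to batch, then flatten
# height and channels into one feature vector per time step.
features = conv_out.transpose(0, 2, 1, 3).reshape(batch, width, height * channels)

# One shared weight matrix and bias: the same dense layer at every time step.
W = rng.standard_normal((height * channels, num_classes))
b = np.zeros(num_classes)
logits = features @ W + b  # [batch, width, num_classes]

# Time-major layout: [max_time, batch, num_classes].
logits_time_major = logits.transpose(1, 0, 2)

print(logits.shape)             # (2, 10, 5)
print(logits_time_major.shape)  # (10, 2, 5)
```

In TensorFlow itself the same effect should come from transposing and reshaping the convolution output the same way and then applying a single dense layer to the 3-D tensor, since a dense layer acts on the last axis and so shares its weights across time steps.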