I'm building an NER model with a Bi-LSTM, and I want to use Attention layers with it. I want to know what the right way is to fit that Attent