How to implement a deep bidirectional LSTM with Keras?


Solution 1

Well, I got the answer from a post on the Keras issue tracker. I hope this is useful to anyone looking for this kind of approach: How to implement deep bidirectional -LSTM
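
The usual reading of that approach is to stack bidirectional blocks: the merged forward/backward output of one pair feeds the next pair. Below is a minimal functional-API sketch along those lines with the current Keras API; the unit counts are illustrative assumptions, and the shapes are chosen to match the question's data.

import tensorflow as tf
from tensorflow.keras import layers, Model

def bi_block(x, units):
    # One "bidirectional" block: a forward LSTM plus a backward LSTM.
    # The backward output is reversed again so both sequences line up
    # timestep for timestep before being summed.
    fwd = layers.LSTM(units, return_sequences=True)(x)
    bwd = layers.LSTM(units, return_sequences=True, go_backwards=True)(x)
    bwd = layers.Lambda(lambda t: tf.reverse(t, axis=[1]))(bwd)
    return layers.add([fwd, bwd])

inputs = layers.Input(shape=(99, 13))   # (timesteps, features)
x = bi_block(inputs, 64)                # first bidirectional layer
x = bi_block(x, 64)                     # second layer: a "deep" BiLSTM
outputs = layers.TimeDistributed(layers.Dense(11, activation='softmax'))(x)
model = Model(inputs, outputs)
model.compile(loss='categorical_crossentropy', optimizer='adam')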

Solution 2

model.add(Bidirectional(LSTM(64)))

Keras example
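
For the deep (stacked) variant, the wrapper can simply be repeated. A minimal sketch under that pattern; unit counts and shapes are illustrative assumptions, and the first layer must set return_sequences=True so the second BiLSTM receives a full sequence:

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Bidirectional, LSTM, Dense

model = Sequential()
# First layer returns the full sequence so the next BiLSTM can consume it
model.add(Bidirectional(LSTM(64, return_sequences=True), input_shape=(99, 13)))
# Second layer returns only the last timestep, feeding a plain classifier
model.add(Bidirectional(LSTM(64)))
model.add(Dense(11, activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer='adam')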

Solution 3

You can use keras.layers.wrappers.Bidirectional. The official manual can be found here: https://keras.io/layers/wrappers/#bidirectional
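
A short usage sketch (sizes are illustrative): the wrapper's merge_mode argument controls how the forward and backward outputs are combined, with 'sum' matching the mode='sum' merge from the question and 'concat' being the default.

from keras.models import Sequential
from keras.layers import Bidirectional, LSTM

model = Sequential()
# merge_mode='sum' adds the forward and backward outputs elementwise
model.add(Bidirectional(LSTM(64, return_sequences=True),
                        merge_mode='sum', input_shape=(99, 13)))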

Solution 4

Designing a BiLSTM is easier now: a Bidirectional wrapper class is available, per the official docs here: https://www.tensorflow.org/api_docs/python/tf/keras/layers/Bidirectional

For the training results & full code
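
Separately, here is a minimal sketch with tf.keras; the unit counts are assumptions, and the shapes follow the question's (99, 13) inputs and 11 output classes:

import tensorflow as tf
from tensorflow.keras import layers

model = tf.keras.Sequential([
    # Both layers return sequences so the model predicts per timestep
    layers.Bidirectional(layers.LSTM(64, return_sequences=True),
                         input_shape=(99, 13)),
    layers.Bidirectional(layers.LSTM(64, return_sequences=True)),
    layers.TimeDistributed(layers.Dense(11, activation='softmax')),
])
model.compile(loss='categorical_crossentropy', optimizer='adam')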

Author: udani

Updated on July 27, 2022

Comments

  • udani, over 1 year ago

    I am trying to implement an LSTM-based speech recognizer. So far I have set up a bidirectional LSTM (I think it is working as a bidirectional LSTM) by following the example for the Merge layer. Now I want to add another bidirectional LSTM layer on top, making it a deep bidirectional LSTM, but I cannot figure out how to connect the output of the previously merged two layers into a second set of LSTM layers. I don't know whether this is possible with Keras; I hope someone can help me with this.

    The code for my single-layer bidirectional LSTM is as follows:

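    # NOTE: legacy Keras 0.x/1.x API; Merge and TimeDistributedDense were
    # removed in Keras 2 in favour of Bidirectional and TimeDistributed(Dense).
    # Forward-direction LSTM over (99 timesteps, 13 features):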
    left = Sequential()
    left.add(LSTM(output_dim=hidden_units, init='uniform', inner_init='uniform',
                   forget_bias_init='one', return_sequences=True, activation='tanh',
                   inner_activation='sigmoid', input_shape=(99, 13)))
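    # Backward-direction LSTM (go_backwards=True processes the input in reverse):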
    right = Sequential()
    right.add(LSTM(output_dim=hidden_units, init='uniform', inner_init='uniform',
                   forget_bias_init='one', return_sequences=True, activation='tanh',
                   inner_activation='sigmoid', input_shape=(99, 13), go_backwards=True))
    
    model = Sequential()
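    # Sum the forward and backward outputs elementwise at each timestep: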
    model.add(Merge([left, right], mode='sum'))
    
    model.add(TimeDistributedDense(nb_classes))
    model.add(Activation('softmax'))
    
    sgd = SGD(lr=0.1, decay=1e-5, momentum=0.9, nesterov=True)
    model.compile(loss='categorical_crossentropy', optimizer=sgd)
    print("Train...")
    model.fit([X_train, X_train], Y_train, batch_size=1, nb_epoch=nb_epoches, validation_data=([X_test, X_test], Y_test), verbose=1, show_accuracy=True)
    

    The dimensions of my x and y values are as follows (a sketch with the current Keras API appears after these shapes).

    100 train sequences
    20 test sequences
    X_train shape: (100, 99, 13)
    X_test shape: (20, 99, 13)
    y_train shape: (100, 99, 11)
    y_test shape: (20, 99, 11)
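
    For reference, here is a sketch of how this single-layer model might look with the current Keras API: Bidirectional with merge_mode='sum' stands in for the Merge of two networks, hidden_units and nb_classes are assumed sizes matching the shapes above, and the random arrays are placeholders shaped like the data.

    import numpy as np
    from tensorflow.keras.models import Sequential
    from tensorflow.keras.layers import Bidirectional, LSTM, TimeDistributed, Dense
    from tensorflow.keras.optimizers import SGD

    hidden_units, nb_classes = 64, 11   # assumed sizes matching the shapes above

    model = Sequential()
    # merge_mode='sum' reproduces the mode='sum' Merge of the original code
    model.add(Bidirectional(LSTM(hidden_units, return_sequences=True),
                            merge_mode='sum', input_shape=(99, 13)))
    model.add(TimeDistributed(Dense(nb_classes, activation='softmax')))
    model.compile(loss='categorical_crossentropy',
                  optimizer=SGD(learning_rate=0.1, momentum=0.9, nesterov=True),
                  metrics=['accuracy'])

    # Smoke test with random arrays shaped like the question's data
    X_train = np.random.rand(100, 99, 13).astype('float32')
    Y_train = np.random.rand(100, 99, nb_classes).astype('float32')
    model.fit(X_train, Y_train, batch_size=1, epochs=1, verbose=1)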