dcase_models.model.A_CRNN

class dcase_models.model.A_CRNN(model=None, model_path=None, metrics=['sed'], n_classes=10, n_frames_cnn=64, n_freq_cnn=128, cnn_nb_filt=128, cnn_pool_size=[5, 2, 2], rnn_nb=[32, 32], fc_nb=[32], dropout_rate=0.5, n_channels=0, final_activation='softmax', sed=False, bidirectional=False)

Bases: dcase_models.model.container.KerasModelContainer

KerasModelContainer for A_CRNN model.

S. Adavanne, P. Pertilä, T. Virtanen. “Sound event detection using spatial features and convolutional recurrent neural network.” International Conference on Acoustics, Speech, and Signal Processing, 2017. https://arxiv.org/pdf/1706.02291.pdf

Parameters:

n_classes : int, default=10

Number of classes (dimension of the output).

n_frames_cnn : int or None, default=64

Length of the input (number of frames of each sequence).

n_freq_cnn : int, default=128

Number of frequency bins. The model’s input has shape (n_frames_cnn, n_freq_cnn).

cnn_nb_filt : int, default=128

Number of filters used in convolutional layers.

cnn_pool_size : list, default=[5, 2, 2]

Pooling size of each max-pooling layer.

rnn_nb : list, default=[32, 32]

Number of units in each recurrent layer.

fc_nb : list, default=[32]

Number of units in each dense layer.

dropout_rate : float, default=0.5

Dropout rate.

n_channels : int, default=0

Number of input channels.

0 : mono signals. Input shape = (n_frames_cnn, n_freq_cnn)
1 : mono signals. Input shape = (n_frames_cnn, n_freq_cnn, 1)
2 : stereo signals. Input shape = (n_frames_cnn, n_freq_cnn, 2)
n > 2 : multi-representations. Input shape = (n_frames_cnn, n_freq_cnn, n_channels)

final_activation : str, default='softmax'

Activation of the last layer.

sed : bool, default=False

If True, the output is frame-level. If False, the output is averaged over time.

bidirectional : bool, default=False

If True, the recurrent layers are bidirectional.
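The effect of n_channels and sed on the input and output can be sketched in plain Python. This is an illustration only, not the library's code: the helper names input_shape, softmax and model_output are hypothetical. Averaging before the activation follows the layer order in the example summary on this page (mean, then strong_out); applying the activation to each frame when sed is True is an assumption.

```python
import math


def input_shape(n_frames_cnn=64, n_freq_cnn=128, n_channels=0):
    """Input shape (without the batch axis) for a given n_channels.

    Hypothetical helper that mirrors the n_channels description.
    """
    if n_channels < 0:
        raise ValueError("n_channels must be non-negative")
    if n_channels == 0:
        # No explicit channel axis.
        return (n_frames_cnn, n_freq_cnn)
    return (n_frames_cnn, n_freq_cnn, n_channels)


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    shift = max(scores)
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def model_output(frame_scores, sed=False):
    """Turn per-frame class scores into the final output.

    frame_scores is a list of n_frames lists, each with n_classes scores.
    """
    if sed:
        # Frame-level output: one activation vector per frame (assumption).
        return [softmax(frame) for frame in frame_scores]
    # Clip-level output: average the scores over time, then activate.
    n_frames = len(frame_scores)
    n_classes = len(frame_scores[0])
    mean_scores = [
        sum(frame[c] for frame in frame_scores) / n_frames
        for c in range(n_classes)
    ]
    return softmax(mean_scores)


scores = [[0.0, 2.0], [2.0, 0.0]]  # 2 frames, 2 classes
print(input_shape(n_channels=2))            # (64, 128, 2)
print(len(model_output(scores, sed=True)))  # 2 (one vector per frame)
print(model_output(scores))                 # [0.5, 0.5]
```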

Notes

Code based on Adavanne’s implementation: https://github.com/sharathadavanne/sed-crnn

Examples

>>> from dcase_models.model.models import A_CRNN
>>> model_container = A_CRNN()
>>> model_container.model.summary()
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
input (InputLayer)           (None, 64, 128)           0
_________________________________________________________________
lambda (Lambda)              (None, 64, 128, 1)        0
_________________________________________________________________
conv2d_7 (Conv2D)            (None, 64, 128, 128)      1280
_________________________________________________________________
batch_normalization_7 (Batch (None, 64, 128, 128)      512
_________________________________________________________________
activation_4 (Activation)    (None, 64, 128, 128)      0
_________________________________________________________________
max_pooling2d_6 (MaxPooling2 (None, 64, 25, 128)       0
_________________________________________________________________
dropout_9 (Dropout)          (None, 64, 25, 128)       0
_________________________________________________________________
conv2d_8 (Conv2D)            (None, 64, 25, 128)       147584
_________________________________________________________________
batch_normalization_8 (Batch (None, 64, 25, 128)       100
_________________________________________________________________
activation_5 (Activation)    (None, 64, 25, 128)       0
_________________________________________________________________
max_pooling2d_7 (MaxPooling2 (None, 64, 12, 128)       0
_________________________________________________________________
dropout_10 (Dropout)         (None, 64, 12, 128)       0
_________________________________________________________________
conv2d_9 (Conv2D)            (None, 64, 12, 128)       147584
_________________________________________________________________
batch_normalization_9 (Batch (None, 64, 12, 128)       48
_________________________________________________________________
activation_6 (Activation)    (None, 64, 12, 128)       0
_________________________________________________________________
max_pooling2d_8 (MaxPooling2 (None, 64, 6, 128)        0
_________________________________________________________________
dropout_11 (Dropout)         (None, 64, 6, 128)        0
_________________________________________________________________
reshape_2 (Reshape)          (None, 64, 768)           0
_________________________________________________________________
gru_3 (GRU)                  (None, 64, 32)            76896
_________________________________________________________________
gru_4 (GRU)                  (None, 64, 32)            6240
_________________________________________________________________
time_distributed_6 (TimeDist (None, 64, 32)            1056
_________________________________________________________________
dropout_12 (Dropout)         (None, 64, 32)            0
_________________________________________________________________
time_distributed_7 (TimeDist (None, 64, 10)            330
_________________________________________________________________
mean (Lambda)                (None, 10)                0
_________________________________________________________________
strong_out (Activation)      (None, 10)                0
=================================================================
Total params: 381,630
Trainable params: 381,300
Non-trainable params: 330
_________________________________________________________________
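Several numbers in this summary can be reproduced by hand, which helps when changing cnn_pool_size or rnn_nb. The sketch below is not part of the library: the helper names are hypothetical, and it assumes max pooling acts only on the frequency axis with floor division, and that each GRU uses a single bias vector per gate. Both assumptions are consistent with the shapes and parameter counts printed above.

```python
def pooled_freq_bins(n_freq, pool_sizes):
    """Frequency bins left after each max-pooling stage (floor division)."""
    sizes = []
    for pool in pool_sizes:
        n_freq = n_freq // pool
        sizes.append(n_freq)
    return sizes


def gru_params(input_dim, units):
    """Parameters of a GRU layer with one bias vector per gate (3 gates)."""
    return 3 * (units * (input_dim + units) + units)


freq_bins = pooled_freq_bins(128, [5, 2, 2])
# The reshape layer merges the remaining frequency bins with the 128 filters.
rnn_input_dim = freq_bins[-1] * 128

print(freq_bins)                      # [25, 12, 6]
print(rnn_input_dim)                  # 768
print(gru_params(rnn_input_dim, 32))  # 76896 (gru_3)
print(gru_params(32, 32))             # 6240 (gru_4)
```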

Attributes:

model : keras.models.Model

Keras model.

__init__(model=None, model_path=None, metrics=['sed'], n_classes=10, n_frames_cnn=64, n_freq_cnn=128, cnn_nb_filt=128, cnn_pool_size=[5, 2, 2], rnn_nb=[32, 32], fc_nb=[32], dropout_rate=0.5, n_channels=0, final_activation='softmax', sed=False, bidirectional=False)

Methods

`__init__`([model, model_path, metrics, …])
`build`()	Builds the CRNN Keras model.
`check_if_model_exists`(folder, **kwargs)	Checks if the model already exists in the path.
`cut_network`(layer_where_to_cut)	Cuts the network at the layer passed as argument.
`evaluate`(data_test, **kwargs)	Evaluates the keras model using X_test and Y_test.
`fine_tuning`(layer_where_to_cut[, …])	Create a new model for fine-tuning.
`get_available_intermediate_outputs`()	Return a list of available intermediate outputs.
`get_intermediate_output`(output_ix_name, inputs)	Return the output of the model in a given layer.
`get_number_of_parameters`()	Missing docstring here
`load_model_from_json`(folder, **kwargs)	Loads a model from a model.json file in the path given by folder.
`load_model_weights`(weights_folder)	Loads self.model weights in weights_folder/best_weights.hdf5.
`load_pretrained_model_weights`([weights_folder])	Loads pretrained weights to self.model weights.
`save_model_json`(folder)	Saves the model to a model.json file in the given folder path.
`save_model_weights`(weights_folder)	Saves self.model weights in weights_folder/best_weights.hdf5.
`train`(data_train, data_val[, weights_path, …])	Trains the keras model using the data and parameters of arguments.

build()

Builds the CRNN Keras model.

check_if_model_exists(folder, **kwargs)

Checks if the model already exists in the path.

Checks whether the folder/model.json file exists and describes the same model as self.model.

Parameters:

folder : str: Path to the folder to check.

cut_network(layer_where_to_cut)

Cuts the network at the layer passed as argument.

Parameters:

layer_where_to_cut : str or int: Layer name (str) or index (int) at which to cut the model.

Returns:

keras.models.Model: Cut model.

evaluate(data_test, **kwargs)

Evaluates the keras model using X_test and Y_test.

Parameters:

X_test : ndarray: 3D array with mel-spectrograms of the test set. Shape = (N_instances, N_hops, N_mel_bands)
Y_test : ndarray: 2D array with the annotations of the test set (one-hot encoding). Shape = (N_instances, N_classes)
scaler : Scaler, optional: Scaler object to be applied if it is not None.

Returns:

float: Accuracy of the evaluation.
list: List of annotations (ground truth).
list: List of model predictions.

fine_tuning(layer_where_to_cut, new_number_of_classes=10, new_activation='softmax', freeze_source_model=True, new_model=None)

Create a new model for fine-tuning.

Cuts the model at the layer_where_to_cut layer and adds a new fully-connected layer.

Parameters:

layer_where_to_cut : str or int: Name (str) or index (int) of the layer at which to cut the model. This layer is included in the new model.
new_number_of_classes : int: Number of units in the new fully-connected layer (number of classes).
new_activation : str: Activation of the new fully-connected layer.
freeze_source_model : bool: If True, the source model is set to not be trainable.
new_model : Keras Model: If not None, this model is added after the cut model. This is useful if you want to add more than one fully-connected layer.

get_available_intermediate_outputs()

Return a list of available intermediate outputs.

Return a list of the model’s layers.

Returns:

list of str: List of layer names.

get_intermediate_output(output_ix_name, inputs)

Return the output of the model in a given layer.

Cuts the model at the given layer and predicts the output for the given inputs.

Returns:

ndarray: Output of the model in the given layer.

get_number_of_parameters()

Missing docstring here

load_model_from_json(folder, **kwargs)

Loads a model from a model.json file in the path given by folder. The model is loaded into the self.model attribute.

Parameters:

folder : str: Path to the folder that contains the model.json file.

load_model_weights(weights_folder)

Loads self.model weights from weights_folder/best_weights.hdf5.

Parameters:

weights_folder : str: Path to the folder that contains the weights file.

load_pretrained_model_weights(weights_folder='./pretrained_weights')

Loads pretrained weights to self.model weights.

Parameters:

weights_folder : str: Path to load the weights file from.

save_model_json(folder)

Saves the model to a model.json file in the given folder path.

Parameters:

folder : str: Path to the folder in which to save the model.json file.

save_model_weights(weights_folder)

Saves self.model weights in weights_folder/best_weights.hdf5.

Parameters:

weights_folder : str: Path to the folder in which to save the weights file.

train(data_train, data_val, weights_path='./', optimizer='Adam', learning_rate=0.001, early_stopping=100, considered_improvement=0.01, losses='categorical_crossentropy', loss_weights=[1], sequence_time_sec=0.5, metric_resolution_sec=1.0, label_list=[], shuffle=True, **kwargs_keras_fit)

Trains the keras model using the data and parameters of arguments.

Parameters:

X_train : ndarray: 3D array with mel-spectrograms of the training set. Shape = (N_instances, N_hops, N_mel_bands)
Y_train : ndarray: 2D array with the annotations of the training set (one-hot encoding). Shape = (N_instances, N_classes)
X_val : ndarray: 3D array with mel-spectrograms of the validation set. Shape = (N_instances, N_hops, N_mel_bands)
Y_val : ndarray: 2D array with the annotations of the validation set (one-hot encoding). Shape = (N_instances, N_classes)
weights_path : str: Path where the best weights of the model and the log of the training process are saved.
loss_weights : list: List of weights for each loss function (‘categorical_crossentropy’, ‘mean_squared_error’, ‘prototype_loss’)
optimizer : str: Optimizer used to train the model
learning_rate : float: Learning rate used to train the model
batch_size : int: Batch size used in the training process
epochs : int: Number of training epochs
fit_verbose : int: Verbose mode for fit method of Keras model
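The annotation arrays described above are one-hot encoded with shape (N_instances, N_classes). As a minimal sketch of that layout, the plain-Python helper below converts integer class indices into one-hot rows. The name one_hot is hypothetical and not part of dcase_models; in practice the nested list would usually be converted to an ndarray before training.

```python
def one_hot(labels, n_classes):
    """Convert integer class indices to one-hot rows.

    Returns a list with len(labels) rows and n_classes columns.
    """
    rows = []
    for label in labels:
        if not 0 <= label < n_classes:
            raise ValueError("label %r is outside [0, %d)" % (label, n_classes))
        row = [0.0] * n_classes
        row[label] = 1.0  # mark the active class
        rows.append(row)
    return rows


Y = one_hot([2, 0, 1], n_classes=3)
print(Y)  # [[0.0, 0.0, 1.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
```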