dcase_models.model.A_CRNN
class dcase_models.model.A_CRNN(model=None, model_path=None, metrics=['sed'], n_classes=10, n_frames_cnn=64, n_freq_cnn=128, cnn_nb_filt=128, cnn_pool_size=[5, 2, 2], rnn_nb=[32, 32], fc_nb=[32], dropout_rate=0.5, n_channels=0, final_activation='softmax', sed=False, bidirectional=False)
Bases: dcase_models.model.container.KerasModelContainer
KerasModelContainer for A_CRNN model.
S. Adavanne, P. Pertilä, T. Virtanen “Sound event detection using spatial features and convolutional recurrent neural network” International Conference on Acoustics, Speech, and Signal Processing. 2017. https://arxiv.org/pdf/1706.02291.pdf
Parameters:
- n_classes : int, default=10
Number of classes (output dimension).
- n_frames_cnn : int or None, default=64
Length of the input (number of frames of each sequence).
- n_freq_cnn : int, default=128
Number of frequency bins. With n_channels=0, the model’s input has shape (n_frames_cnn, n_freq_cnn).
- cnn_nb_filt : int, default=128
Number of filters used in convolutional layers.
- cnn_pool_size : list, default=[5, 2, 2]
Pooling size of each max-pooling layer.
- rnn_nb : list, default=[32, 32]
Number of units in each recurrent layer.
- fc_nb : list, default=[32]
Number of units in each dense layer.
- dropout_rate : float, default=0.5
Dropout rate.
- n_channels : int, default=0
Number of input channels:
- 0 : mono signals.
Input shape = (n_frames_cnn, n_freq_cnn)
- 1 : mono signals.
Input shape = (n_frames_cnn, n_freq_cnn, 1)
- 2 : stereo signals.
Input shape = (n_frames_cnn, n_freq_cnn, 2)
- n > 2 : multi-representations.
Input shape = (n_frames_cnn, n_freq_cnn, n_channels)
- final_activation : str, default=’softmax’
Activation of the last layer.
- sed : bool, default=False
If sed is True, the output is frame-level. If False the output is time averaged.
- bidirectional : bool, default=False
If bidirectional is True, the recurrent layers are bidirectional.
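The effect of n_channels and sed on the model's input and output shapes can be sketched in plain Python. The helper name `expected_shapes` is hypothetical and not part of dcase_models. The input shapes follow the n_channels description above; the frame-level output shape (n_frames_cnn, n_classes) for sed=True is an assumption, inferred from the fact that the time axis is only removed by the final averaging step.

```python
def expected_shapes(n_classes=10, n_frames_cnn=64, n_freq_cnn=128,
                    n_channels=0, sed=False):
    """Return (input_shape, output_shape), both without the batch dimension.

    Hypothetical helper for illustration only; it does not build a model.
    """
    if n_channels < 0:
        raise ValueError("n_channels must be >= 0")

    # n_channels == 0 means a 2D input with no explicit channel axis.
    if n_channels == 0:
        input_shape = (n_frames_cnn, n_freq_cnn)
    else:
        input_shape = (n_frames_cnn, n_freq_cnn, n_channels)

    # Assumption: with sed=True the per-frame predictions are returned as is;
    # with sed=False they are averaged over time, leaving one vector per clip.
    output_shape = (n_frames_cnn, n_classes) if sed else (n_classes,)
    return input_shape, output_shape


print(expected_shapes())                        # default configuration
print(expected_shapes(n_channels=2, sed=True))  # stereo input, frame-level output
```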
Notes
Code based on Adavanne’s implementation https://github.com/sharathadavanne/sed-crnn
Examples
>>> from dcase_models.model.models import A_CRNN
>>> model_container = A_CRNN()
>>> model_container.model.summary()
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
input (InputLayer)           (None, 64, 128)           0
_________________________________________________________________
lambda (Lambda)              (None, 64, 128, 1)        0
_________________________________________________________________
conv2d_7 (Conv2D)            (None, 64, 128, 128)      1280
_________________________________________________________________
batch_normalization_7 (Batch (None, 64, 128, 128)      512
_________________________________________________________________
activation_4 (Activation)    (None, 64, 128, 128)      0
_________________________________________________________________
max_pooling2d_6 (MaxPooling2 (None, 64, 25, 128)       0
_________________________________________________________________
dropout_9 (Dropout)          (None, 64, 25, 128)       0
_________________________________________________________________
conv2d_8 (Conv2D)            (None, 64, 25, 128)       147584
_________________________________________________________________
batch_normalization_8 (Batch (None, 64, 25, 128)       100
_________________________________________________________________
activation_5 (Activation)    (None, 64, 25, 128)       0
_________________________________________________________________
max_pooling2d_7 (MaxPooling2 (None, 64, 12, 128)       0
_________________________________________________________________
dropout_10 (Dropout)         (None, 64, 12, 128)       0
_________________________________________________________________
conv2d_9 (Conv2D)            (None, 64, 12, 128)       147584
_________________________________________________________________
batch_normalization_9 (Batch (None, 64, 12, 128)       48
_________________________________________________________________
activation_6 (Activation)    (None, 64, 12, 128)       0
_________________________________________________________________
max_pooling2d_8 (MaxPooling2 (None, 64, 6, 128)        0
_________________________________________________________________
dropout_11 (Dropout)         (None, 64, 6, 128)        0
_________________________________________________________________
reshape_2 (Reshape)          (None, 64, 768)           0
_________________________________________________________________
gru_3 (GRU)                  (None, 64, 32)            76896
_________________________________________________________________
gru_4 (GRU)                  (None, 64, 32)            6240
_________________________________________________________________
time_distributed_6 (TimeDist (None, 64, 32)            1056
_________________________________________________________________
dropout_12 (Dropout)         (None, 64, 32)            0
_________________________________________________________________
time_distributed_7 (TimeDist (None, 64, 10)            330
_________________________________________________________________
mean (Lambda)                (None, 10)                0
_________________________________________________________________
strong_out (Activation)      (None, 10)                0
=================================================================
Total params: 381,630
Trainable params: 381,300
Non-trainable params: 330
_________________________________________________________________
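The figures in this summary can be checked by hand. The sketch below reproduces the pooled frequency sizes and the per-layer parameter counts with plain arithmetic. It assumes 3x3 convolution kernels and a GRU with one bias vector per gate; neither is stated in this page, both are inferred because they reproduce the reported counts. The batch-normalization axis sizes (128, 25, 12) are likewise read off the reported counts. All helper names are hypothetical.

```python
def conv2d_params(in_channels, filters, kernel=(3, 3)):
    # One kernel per (input channel, filter) pair plus one bias per filter.
    return (kernel[0] * kernel[1] * in_channels + 1) * filters


def gru_params(input_dim, units):
    # Three gates, each with an input kernel, a recurrent kernel and a bias.
    return 3 * (units * (input_dim + units) + units)


def dense_params(input_dim, units):
    return (input_dim + 1) * units


def batch_norm_params(axis_size):
    # gamma, beta, moving mean and moving variance for each normalized entry.
    return 4 * axis_size


# Frequency axis after each max-pooling layer, with cnn_pool_size=[5, 2, 2].
freq = 128
freq_after_pool = []
for pool in (5, 2, 2):
    freq //= pool
    freq_after_pool.append(freq)

counts = [
    conv2d_params(1, 128), batch_norm_params(128),
    conv2d_params(128, 128), batch_norm_params(25),
    conv2d_params(128, 128), batch_norm_params(12),
    gru_params(freq_after_pool[-1] * 128, 32),  # reshape gives 6 * 128 = 768
    gru_params(32, 32),
    dense_params(32, 32),
    dense_params(32, 10),
]
total = sum(counts)

# Moving mean and variance are not trained: half of the batch-norm parameters.
non_trainable = (512 + 100 + 48) // 2

print(freq_after_pool, counts, total, non_trainable)
```

Under these assumptions the sum matches the reported total of 381,630 parameters, of which 330 are non-trainable.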
Attributes:
- model : keras.models.Model
Keras model.
__init__(model=None, model_path=None, metrics=['sed'], n_classes=10, n_frames_cnn=64, n_freq_cnn=128, cnn_nb_filt=128, cnn_pool_size=[5, 2, 2], rnn_nb=[32, 32], fc_nb=[32], dropout_rate=0.5, n_channels=0, final_activation='softmax', sed=False, bidirectional=False)
Methods
__init__([model, model_path, metrics, …])
build(): Builds the CRNN Keras model.
check_if_model_exists(folder, **kwargs): Checks if the model already exists in the path.
cut_network(layer_where_to_cut): Cuts the network at the layer passed as argument.
evaluate(data_test, **kwargs): Evaluates the Keras model using X_test and Y_test.
fine_tuning(layer_where_to_cut[, …]): Create a new model for fine-tuning.
get_available_intermediate_outputs(): Return a list of available intermediate outputs.
get_intermediate_output(output_ix_name, inputs): Return the output of the model in a given layer.
get_number_of_parameters(): Missing docstring here
load_model_from_json(folder, **kwargs): Loads a model from a model.json file in the path given by folder.
load_model_weights(weights_folder): Loads self.model weights from weights_folder/best_weights.hdf5.
load_pretrained_model_weights([weights_folder]): Loads pretrained weights to self.model weights.
save_model_json(folder): Saves the model to a model.json file in the given folder path.
save_model_weights(weights_folder): Saves self.model weights in weights_folder/best_weights.hdf5.
train(data_train, data_val[, weights_path, …]): Trains the Keras model using the data and parameters given as arguments.
check_if_model_exists(folder, **kwargs)
Checks if the model already exists in the path.
Check if the folder/model.json file exists and includes the same model as self.model.
Parameters:
- folder : str
Path to the folder to check.
cut_network(layer_where_to_cut)
Cuts the network at the layer passed as argument.
Parameters:
- layer_where_to_cut : str or int
Layer name (str) or index (int) at which to cut the model.
Returns:
- keras.models.Model
Cut model.
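How a name or an index selects the cut point can be illustrated without Keras. The function below is a hypothetical stand-in that works on a plain list of layer names and is not the library's implementation. It assumes the layer at the cut point is kept, which is what the fine_tuning documentation states for its own layer_where_to_cut argument.

```python
def layers_kept_after_cut(layer_names, layer_where_to_cut):
    """Return the layer names that remain after cutting.

    Hypothetical illustration: accepts a layer name (str) or an index (int),
    mirroring the two accepted types of layer_where_to_cut.
    """
    if isinstance(layer_where_to_cut, str):
        # list.index raises ValueError if the name is unknown.
        index = layer_names.index(layer_where_to_cut)
    else:
        index = layer_where_to_cut
        if not 0 <= index < len(layer_names):
            raise IndexError("layer index out of range")
    # Assumption: the layer at the cut point is included in the result.
    return layer_names[: index + 1]


names = ["input", "lambda", "conv2d_7", "batch_normalization_7"]
print(layers_kept_after_cut(names, "conv2d_7"))
print(layers_kept_after_cut(names, 2))
```

Both calls select the same three layers, since "conv2d_7" sits at index 2.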
evaluate(data_test, **kwargs)
Evaluates the Keras model using X_test and Y_test.
Parameters:
- X_test : ndarray
3D array with the mel-spectrograms of the test set. Shape = (N_instances, N_hops, N_mel_bands)
- Y_test : ndarray
2D array with the annotations of the test set (one-hot encoding). Shape = (N_instances, N_classes)
- scaler : Scaler, optional
Scaler object to be applied if it is not None.
Returns:
- float
Accuracy of the evaluation.
- list
List of annotations (ground truth).
- list
List of model predictions.
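As an illustration of the returned accuracy value, the sketch below computes plain classification accuracy from one-hot annotations and per-class scores. This is an assumption about what "accuracy" means here: the metric actually computed by the container depends on its metrics argument (default ['sed']) and may differ. The function names are hypothetical.

```python
def argmax(values):
    # Index of the largest value; the first one wins on ties.
    return max(range(len(values)), key=lambda i: values[i])


def accuracy(annotations, predictions):
    """Fraction of instances whose predicted class matches the annotation.

    annotations: list of one-hot rows, shape (N_instances, N_classes)
    predictions: list of score rows, same shape
    """
    if len(annotations) != len(predictions):
        raise ValueError("annotations and predictions differ in length")
    if not annotations:
        raise ValueError("no instances to evaluate")
    hits = sum(argmax(y) == argmax(p) for y, p in zip(annotations, predictions))
    return hits / len(annotations)


Y_test = [[1, 0, 0], [0, 1, 0], [0, 0, 1], [0, 1, 0]]
Y_pred = [[0.7, 0.2, 0.1], [0.1, 0.8, 0.1], [0.3, 0.4, 0.3], [0.2, 0.5, 0.3]]
print(accuracy(Y_test, Y_pred))  # three of the four instances match
```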
fine_tuning(layer_where_to_cut, new_number_of_classes=10, new_activation='softmax', freeze_source_model=True, new_model=None)
Create a new model for fine-tuning.
Cut the model at the layer_where_to_cut layer and add a new fully-connected layer.
Parameters:
- layer_where_to_cut : str or int
Name (str) or index (int) of the layer at which to cut the model. This layer is included in the new model.
- new_number_of_classes : int
Number of units in the new fully-connected layer (number of classes).
- new_activation : str
Activation of the new fully-connected layer.
- freeze_source_model : bool
If True, the source model is set to not be trainable.
- new_model : Keras Model
If not None, this model is added after the cut model. This is useful if you want to add more than one fully-connected layer.
get_available_intermediate_outputs()
Return a list of available intermediate outputs.
Return a list of the model’s layers.
Returns:
- list of str
List of layer names.
get_intermediate_output(output_ix_name, inputs)
Return the output of the model in a given layer.
Cut the model at the given layer and predict the output for the given inputs.
Returns:
- ndarray
Output of the model in the given layer.
get_number_of_parameters()
Missing docstring here
load_model_from_json(folder, **kwargs)
Loads a model from a model.json file in the path given by folder. The model is loaded into the self.model attribute.
Parameters:
- folder : str
Path to the folder that contains the model.json file.
load_model_weights(weights_folder)
Loads self.model weights from weights_folder/best_weights.hdf5.
Parameters:
- weights_folder : str
Path to the folder that contains the weights file.
load_pretrained_model_weights(weights_folder='./pretrained_weights')
Loads pretrained weights to self.model weights.
Parameters:
- weights_folder : str
Path to the folder from which to load the weights file.
save_model_json(folder)
Saves the model to a model.json file in the given folder path.
Parameters:
- folder : str
Path to the folder in which to save the model.json file.
save_model_weights(weights_folder)
Saves self.model weights in weights_folder/best_weights.hdf5.
Parameters:
- weights_folder : str
Path to the folder in which to save the weights file.
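The save and load methods above refer to two fixed file names, model.json and best_weights.hdf5. A small sketch of how those paths are formed from the folder arguments, using only the standard library; the helper names are hypothetical and the folder "experiments/a_crnn" is a made-up example.

```python
import os

MODEL_FILE = "model.json"           # file name used by save_model_json / load_model_from_json
WEIGHTS_FILE = "best_weights.hdf5"  # file name used by save_model_weights / load_model_weights


def model_json_path(folder):
    """Path of the architecture file inside `folder` (hypothetical helper)."""
    return os.path.join(folder, MODEL_FILE)


def weights_path(weights_folder):
    """Path of the weights file inside `weights_folder` (hypothetical helper)."""
    return os.path.join(weights_folder, WEIGHTS_FILE)


folder = os.path.join("experiments", "a_crnn")  # illustrative folder name
print(model_json_path(folder))
print(weights_path(folder))
```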
train(data_train, data_val, weights_path='./', optimizer='Adam', learning_rate=0.001, early_stopping=100, considered_improvement=0.01, losses='categorical_crossentropy', loss_weights=[1], sequence_time_sec=0.5, metric_resolution_sec=1.0, label_list=[], shuffle=True, **kwargs_keras_fit)
Trains the Keras model using the data and parameters given as arguments.
Parameters:
- X_train : ndarray
3D array with mel-spectrograms of train set. Shape = (N_instances, N_hops, N_mel_bands)
- Y_train : ndarray
2D array with the annotations of train set (one hot encoding). Shape (N_instances, N_classes)
- X_val : ndarray
3D array with mel-spectrograms of validation set. Shape = (N_instances, N_hops, N_mel_bands)
- Y_val : ndarray
2D array with the annotations of validation set (one hot encoding). Shape (N_instances, N_classes)
- weights_path : str
Path where the best weights of the model are saved during training.
- weights_path : str
Path where the log of the training process is saved.
- loss_weights : list
List of weights for each loss function (‘categorical_crossentropy’, ‘mean_squared_error’, ‘prototype_loss’)
- optimizer : str
Optimizer used to train the model
- learning_rate : float
Learning rate used to train the model
- batch_size : int
Batch size used in the training process
- epochs : int
Number of training epochs
- fit_verbose : int
Verbose mode for fit method of Keras model
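The annotation arrays described above use one-hot encoding with shape (N_instances, N_classes). A minimal sketch of that encoding in plain Python follows. The function name is hypothetical, and using label_list as the ordered list of class names is an assumption, since its role is not described in this page.

```python
def one_hot(labels, label_list):
    """Encode class labels as one-hot rows.

    labels: one class name per instance
    label_list: ordered class names; the position gives the column index
    Returns a list of rows with shape (N_instances, N_classes).
    """
    index = {label: i for i, label in enumerate(label_list)}
    rows = []
    for label in labels:
        row = [0] * len(label_list)
        row[index[label]] = 1  # KeyError if the label is not in label_list
        rows.append(row)
    return rows


label_list = ["dog_bark", "siren", "car_horn"]  # illustrative class names
Y_train = one_hot(["dog_bark", "siren", "dog_bark"], label_list)
print(Y_train)
```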