dcase_models.data.FramesAudio

class dcase_models.data.FramesAudio(sequence_time=1.0, sequence_hop_time=0.5, audio_win=1024, audio_hop=680, sr=22050, n_fft=1024, pad_mode='reflect')[source]

Bases: dcase_models.data.feature_extractor.FeatureExtractor

FramesAudio feature extractor.

Loads the audio signal, splits it into short-time frames, and groups the frames into sequences (overlapping windows).
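The two-stage windowing described above can be sketched in plain Python. This is only an illustration under stated assumptions, not the library's implementation: frame_signal and to_sequences are hypothetical helpers, the window and hop sizes are given directly in samples and frames, and the toy values are far smaller than the class defaults so that the result is easy to inspect.

```python
def frame_signal(signal, win, hop):
    """Split a 1-D sequence into overlapping frames of `win` samples (hypothetical helper)."""
    if len(signal) < win:
        return []
    n_frames = 1 + (len(signal) - win) // hop
    return [signal[i * hop:i * hop + win] for i in range(n_frames)]


def to_sequences(frames, sequence_frames, sequence_hop):
    """Group consecutive frames into overlapping sequences (hypothetical helper)."""
    if len(frames) < sequence_frames:
        return []
    n_sequences = 1 + (len(frames) - sequence_frames) // sequence_hop
    return [
        frames[i * sequence_hop:i * sequence_hop + sequence_frames]
        for i in range(n_sequences)
    ]


# Toy signal: 20 samples whose values equal their indices.
signal = list(range(20))

# Stage 1: frames of 4 samples, advancing 2 samples each time.
frames = frame_signal(signal, win=4, hop=2)

# Stage 2: sequences of 3 frames, advancing 2 frames each time.
sequences = to_sequences(frames, sequence_frames=3, sequence_hop=2)

# Resulting layout: (number of sequences, frames per sequence, samples per frame).
shape = (len(sequences), len(sequences[0]), len(sequences[0][0]))
print(shape)
```

Both stages use the same sliding-window pattern; only the unit changes (samples in the first stage, frames in the second).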

Parameters:
pad_mode : str or None, default='reflect'

Mode of padding applied to the audio signal. This argument is passed to librosa.util.fix_length for padding the signal. If pad_mode is None, no padding is applied.
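To make the effect of pad_mode concrete, here is a minimal pure-Python sketch of end-padding with the 'reflect' mode. It assumes that librosa.util.fix_length pads at the end of the signal and follows NumPy's 'reflect' convention (the edge sample is not repeated). The pad_end helper is hypothetical and handles only None and 'reflect'.

```python
def pad_end(signal, target_len, pad_mode="reflect"):
    """Return `signal` extended (or cut) to `target_len` samples.

    Hypothetical helper: only pad_mode=None and pad_mode='reflect' are handled.
    """
    signal = list(signal)
    if pad_mode is None:
        return signal  # no padding requested
    if pad_mode != "reflect":
        raise ValueError("this sketch only implements pad_mode='reflect'")
    n_pad = target_len - len(signal)
    if n_pad <= 0:
        return signal[:target_len]
    if n_pad > len(signal) - 1:
        raise ValueError("this sketch only mirrors up to len(signal) - 1 samples")
    # Walk backwards from the second-to-last sample, so the edge is not repeated.
    mirrored = signal[-2:-2 - n_pad:-1]
    return signal + mirrored


padded = pad_end([1, 2, 3, 4], 6)
unpadded = pad_end([1, 2, 3, 4], 6, pad_mode=None)
print(padded)
print(unpadded)
```

With pad_mode=None the signal is returned unchanged, which mirrors the "no padding is applied" behaviour stated above.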

__init__(sequence_time=1.0, sequence_hop_time=0.5, audio_win=1024, audio_hop=680, sr=22050, n_fft=1024, pad_mode='reflect')[source]

Initialize the FeatureExtractor

Methods

__init__([sequence_time, sequence_hop_time, …]) Initialize the FeatureExtractor
calculate(file_name) Loads an audio file and calculates features
check_if_extracted(dataset) Checks if the features of each file in dataset were calculated.
check_if_extracted_path(path) Checks if the features saved in path were calculated.
convert_to_sequences(audio_representation)
extract(dataset) Extracts features for each file in dataset.
get_features_path(dataset) Returns the path to the features folder.
get_shape([length_sec]) Calls calculate() with a dummy signal of length length_sec and returns the shape of the feature representation.
load_audio(file_name[, mono, …]) Loads an audio signal and converts it to mono if needed
pad_audio(audio)
set_as_extracted(path) Saves a JSON file with self.__dict__.

calculate(file_name)[source]

Loads an audio file and calculates features

Parameters:
file_name : str

Path to the audio file

Returns:
ndarray

Feature representation of the audio signal

check_if_extracted(dataset)

Checks if the features of each file in dataset were calculated.

Calls check_if_extracted_path for each path in the dataset.

Parameters:
dataset : Dataset

Instance of the dataset.

Returns:
bool

True if the features were already extracted.

check_if_extracted_path(path)

Checks if the features saved in path were calculated.

Compares the parameters saved in path with self.__dict__ to check whether the features were calculated with the same parameters.

Parameters:
path : str

Path to the features folder

Returns:
bool

True if the features were already extracted.

convert_to_sequences(audio_representation)

extract(dataset)

Extracts features for each file in dataset.

Calls calculate() for each file in dataset and saves the result in the features path.

Parameters:
dataset : Dataset

Instance of the dataset.

get_features_path(dataset)

Returns the path to the features folder.

Parameters:
dataset : Dataset

Instance of the dataset.

Returns:
features_path : str

Path to the features folder.

get_shape(length_sec=10.0)

Calls calculate() with a dummy signal of length length_sec and returns the shape of the feature representation.

Parameters:
length_sec : float

Duration in seconds of the test signal

Returns:
tuple

Shape of the feature representation
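As a rough guide to what shape to expect, the arithmetic below estimates the number of sequences and frames from the constructor parameters. It is a sketch under assumptions that the documentation above does not confirm: times are converted to frame counts with int(time * sr / audio_hop), the feature layout is (sequences, frames per sequence, samples per frame), and padding is ignored entirely. The real get_shape() result may therefore differ, in particular in the number of sequences. The estimate_shape function is hypothetical.

```python
def estimate_shape(length_sec=10.0, sequence_time=1.0, sequence_hop_time=0.5,
                   audio_win=1024, audio_hop=680, sr=22050):
    """Estimate (n_sequences, frames_per_sequence, samples_per_frame) without padding."""
    n_samples = int(length_sec * sr)

    # Assumed conversion from seconds to a whole number of frames.
    sequence_frames = int(sequence_time * sr / audio_hop)
    sequence_hop = int(sequence_hop_time * sr / audio_hop)

    # Sliding-window counts: frames over samples, then sequences over frames.
    n_frames = 1 + (n_samples - audio_win) // audio_hop
    n_sequences = 1 + (n_frames - sequence_frames) // sequence_hop

    return (n_sequences, sequence_frames, audio_win)


print(estimate_shape())
print(estimate_shape(length_sec=4.0))
```

Calling get_shape() on an actual instance remains the reliable way to obtain the shape, for example when sizing the input layer of a model.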

load_audio(file_name, mono=True, change_sampling_rate=True)

Loads an audio signal and converts it to mono if needed

Parameters:
file_name : str

Path to the audio file

mono : bool

If True, only the left channel is returned.

change_sampling_rate : bool

If True, the audio signal is resampled to self.sr.

Returns:
array

Audio signal

pad_audio(audio)

set_as_extracted(path)

Saves a JSON file with self.__dict__.

Useful for checking whether the feature files were calculated with the same parameters.

Parameters:
path : str

Path to the JSON file
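The interplay between set_as_extracted() and check_if_extracted_path() can be pictured as a JSON round trip: the parameters are written once, then compared on later runs. The sketch below uses only the standard library. save_parameters, parameters_match and the file name parameters.json are illustrative assumptions, and a plain dict stands in for self.__dict__.

```python
import json
import os
import tempfile


def save_parameters(params, json_path):
    """Write the extractor parameters to a JSON file (hypothetical helper)."""
    with open(json_path, "w") as f:
        json.dump(params, f)


def parameters_match(params, json_path):
    """Return True only if a saved JSON file exists and equals `params`."""
    if not os.path.exists(json_path):
        return False
    with open(json_path) as f:
        saved = json.load(f)
    return saved == params


# Stand-in for self.__dict__, using the default constructor arguments.
params = {"sequence_time": 1.0, "sequence_hop_time": 0.5, "audio_win": 1024,
          "audio_hop": 680, "sr": 22050, "n_fft": 1024, "pad_mode": "reflect"}

with tempfile.TemporaryDirectory() as folder:
    json_path = os.path.join(folder, "parameters.json")  # assumed file name
    before = parameters_match(params, json_path)   # nothing saved yet
    save_parameters(params, json_path)
    same = parameters_match(params, json_path)     # identical parameters
    changed = parameters_match(dict(params, audio_hop=512), json_path)

print(before, same, changed)
```

Comparing the whole parameter dictionary means that changing any single argument, such as audio_hop, marks previously saved features as not extracted for the new configuration.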