dcase_models.data.Dataset

class dcase_models.data.Dataset(dataset_path)[source]

Bases: object

Abstract base class to load and manage DCASE datasets.

Descendants of this class are defined to manage specific DCASE databases (see UrbanSound8k, ESC50)

Parameters:
dataset_path : str

Path to the dataset fold. This is the path to the folder where the complete dataset will be downloaded, decompressed and handled. It is expected to use a folder name that represents the dataset unambiguously (e.g. ../datasets/UrbanSound8k).

Examples

To create a new dataset, it is necessary to define a class that inherits from Dataset. Then is required to define the build, generate_file_lists, get_annotations and download (if online available) methods.

>>> from dcase_models.util import list_wav_files
>>> class TestDataset(Dataset):
>>>     def __init__(self, dataset_path):
>>>         super().__init__(dataset_path)
>>>     def build(self):
>>>         self.audio_path = os.path.join(self.dataset_path, 'audio')
>>>         self.fold_list = ["train", "validate", "test"]
>>>         self.label_list = ["cat", "dog"]
>>>     def generate_file_lists(self):
>>>         for fold in self.fold_list:
>>>             audio_folder = os.path.join(self.audio_path, fold)
>>>             self.file_lists[fold] = list_wav_files(audio_folder)
>>>     def get_annotations(self, file_path, features):
>>>         y = np.zeros((len(features), len(self.label_list)))
>>>         class_ix = int(os.path.basename(file_path).split('-')[1])
>>>         y[:, class_ix] = 1
>>>         return y
>>>     def download(self, force_download=False):
>>>         zenodo_url = "https://zenodo.org/record/123456/files"
>>>         zenodo_files = ["TestData.tar.gz"]
>>>         downloaded = super().download(
>>>             zenodo_url, zenodo_files, force_download
>>>         )
>>>         if downloaded:
>>>             move_all_files_to_parent(self.dataset_path, "TestData")
>>>             self.set_as_downloaded()
Attributes:
file_lists : dict

This dictionary stores the list of files for each fold. Dict of form: {fold_name : list_of_files}. e.g. {‘fold1’ : [file1, file2 …], ‘fold2’ : [file3, file4 …],}.

audio_path : str

Path to the audio folder, i.e {self.dataset_path}/audio. This attribute is defined in build()

fold_list : list

List of the pre-defined fold names. e.g. [‘fold1’, ‘fold2’, ‘fold3’, …]

label_list : list

List of class labels. e.g. [‘dog’, ‘siren’, …]

__init__(dataset_path)[source]

Init Dataset

Methods

__init__(dataset_path) Init Dataset
build() Builds the dataset.
change_sampling_rate(new_sr) Changes the sampling rate of each wav file in audio_path.
check_if_downloaded() Checks if the dataset was downloaded.
check_sampling_rate(sr) Checks if dataset was resampled before.
convert_to_wav([remove_original]) Converts each file in the dataset to wav format.
download(zenodo_url, zenodo_files[, …]) Downloads and decompresses the dataset from zenodo.
generate_file_lists() Creates file_lists, a dict that includes a list of files per fold.
get_annotations(file_path, features, …) Returns the annotations of the file in file_path.
get_audio_paths([sr]) Returns paths to the audio folder.
set_as_downloaded() Saves a download.txt file in dataset_path as a downloaded flag.
build()[source]

Builds the dataset.

Define specific attributes of the dataset. It’s mandatory to define audio_path, fold_list and label_list. Other attributes may be defined here (url, authors, etc.).

change_sampling_rate(new_sr)[source]

Changes the sampling rate of each wav file in audio_path.

Creates a new folder named audio_path{new_sr} (i.e audio22050) and converts each wav file in audio_path and save the result in the new folder.

Parameters:
sr : int

Sampling rate.

check_if_downloaded()[source]

Checks if the dataset was downloaded.

Just checks if exists download.txt file.

Further checks in the future.

check_sampling_rate(sr)[source]

Checks if dataset was resampled before.

For now, only checks if the folder {audio_path}{sr} exists and each wav file present in audio_path is also present in {audio_path}{sr}.

Parameters:
sr : int

Sampling rate.

Returns:
bool

True if the dataset was resampled before.

convert_to_wav(remove_original=False)[source]

Converts each file in the dataset to wav format.

If remove_original is False, the original files will be deleted

Parameters:
remove_original : bool

Remove original files.

download(zenodo_url, zenodo_files, force_download=False)[source]

Downloads and decompresses the dataset from zenodo.

Parameters:
zenodo_url : str

URL with the zenodo files. e.g. ‘https://zenodo.org/record/12345/files

zenodo_files : list of str

List of files. e.g. [‘file1.tar.gz’, ‘file2.tar.gz’, ‘file3.tar.gz’]

force_download : bool

If True, download the dataset even if was downloaded before.

Returns:
bool

True if the downloading process was successful.

generate_file_lists()[source]

Creates file_lists, a dict that includes a list of files per fold.

Each dataset has a different way of organizing the files. This function defines the dataset structure.

get_annotations(file_path, features, time_resolution)[source]

Returns the annotations of the file in file_path.

Parameters:
file_path : str

Path to the file

features : ndarray

nD array with the features of file_path

time_resolution : float

Time resolution of the features

Returns:
ndarray

Annotations of the file file_path Expected output shape: (features.shape[0], len(self.label_list))

get_audio_paths(sr=None)[source]

Returns paths to the audio folder.

If sr is None, return audio_path. Else, return {audio_path}{sr}.

Parameters:
sr : int or None, optional

Sampling rate.

Returns:
audio_path : str

Path to the root audio folder. e.g. DATASET_PATH/audio

subfolders : list of str

List of subfolders include in audio folder. Important when use AugmentedDataset. e.g. [‘{DATASET_PATH}/audio/original’]

set_as_downloaded()[source]

Saves a download.txt file in dataset_path as a downloaded flag.