pe.data.image package

Submodules

pe.data.image.camelyon17 module

class pe.data.image.camelyon17.Camelyon17(split='train', root_dir='data', res=64)[source]

Bases: Data

The Camelyon17 dataset.

__init__(split='train', root_dir='data', res=64)[source]

Constructor.

Parameters:
  • split (str, optional) – The split of the dataset. It should be one of “train”, “val”, or “test”, defaults to “train”

  • root_dir (str, optional) – The root directory where the dataset is saved, defaults to “data”

  • res (int, optional) – The resolution of the images, defaults to 64

Raises:

ValueError – If the split is invalid
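The split contract described above can be pictured with a minimal sketch. The helper name validate_split and the error message are hypothetical and are not part of the pe API; only the three allowed split names and the ValueError come from the documentation above.

```python
# Hypothetical helper, not part of pe: mirrors the documented split contract.
VALID_SPLITS = ("train", "val", "test")


def validate_split(split="train"):
    """Return the split unchanged, or raise ValueError if it is not allowed."""
    if split not in VALID_SPLITS:
        raise ValueError(f"Invalid split {split!r}; expected one of {VALID_SPLITS}")
    return split
```

A caller would then get "val" back from validate_split("val"), while validate_split("dev") would raise ValueError.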

pe.data.image.cat module

class pe.data.image.cat.Cat(root_dir='data', res=512)[source]

Bases: Data

The Cat dataset.

URL = 'https://www.kaggle.com/api/v1/datasets/download/fjxmlzn/cat-cookie-doudou'

The URL of the dataset

__init__(root_dir='data', res=512)[source]

Constructor.

Parameters:
  • root_dir (str, optional) – The root directory where the dataset is saved, defaults to “data”

  • res (int, optional) – The resolution of the images, defaults to 512

_download()[source]

Download the dataset if it does not exist.

_read_data()[source]

Read the data from the zip file.

pe.data.image.cifar10 module

class pe.data.image.cifar10.Cifar10(split='train')[source]

Bases: Data

The CIFAR10 dataset.

__init__(split='train')[source]

Constructor.

Parameters:

split (str, optional) – The split of the dataset. It should be either “train” or “test”, defaults to “train”

Raises:

ValueError – If the split is invalid

pe.data.image.image module

class pe.data.image.image.ImageDataset(folder, transform)[source]

Bases: Dataset

pe.data.image.image._list_image_files_recursively(data_dir)[source]

List all image files in a directory recursively. Adapted from https://github.com/openai/improved-diffusion/blob/main/improved_diffusion/image_datasets.py
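A recursive listing of this kind can be sketched with the standard library alone. This is an illustrative version, not the pe implementation: the public name list_image_files_recursively, the use of os instead of any other file-system layer, and the sorted traversal order are assumptions, and the extension set is taken from the formats documented for load_image_folder.

```python
import os

# Extensions taken from the load_image_folder documentation.
IMAGE_EXTENSIONS = {"jpg", "jpeg", "png", "gif"}


def list_image_files_recursively(data_dir):
    """Return the paths of all image files under data_dir, visiting entries in sorted order."""
    results = []
    for entry in sorted(os.listdir(data_dir)):
        full_path = os.path.join(data_dir, entry)
        if os.path.isdir(full_path):
            # Descend into sub-folders, so nesting depth does not matter.
            results.extend(list_image_files_recursively(full_path))
        elif "." in entry and entry.rsplit(".", 1)[-1].lower() in IMAGE_EXTENSIONS:
            # Keep only files whose extension (case-insensitive) is an image type.
            results.append(full_path)
    return results
```

Files with other extensions, such as text files, are skipped in this sketch.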

pe.data.image.image.load_image_folder(path, image_size, class_cond=True, num_images=-1, num_workers=10, batch_size=1000)[source]

Load an image dataset from a folder that contains image files. The folder can be nested arbitrarily. Each image file name must follow the format “{class_name}_{suffix}.ext”, where the class name contains no “_”, the suffix can be any string, and “ext” is “jpg”, “jpeg”, “png”, or “gif”. The class name is the part of the file name before the first “_”. If class_cond is False, the class names are ignored and all images are treated as a single class with class name “None”.

Parameters:
  • path (str) – The path to the root folder that contains the image files

  • image_size (int) – The size of the images. Images will be resized to this size

  • class_cond (bool, optional) – Whether to treat the loaded dataset as class conditional, defaults to True

  • num_images (int, optional) – The number of images to load. If -1, load all images. Defaults to -1

  • num_workers (int, optional) – The number of workers to use for loading the images, defaults to 10

  • batch_size (int, optional) – The batch size to use for loading the images, defaults to 1000

Returns:

The loaded data

Return type:

pe.data.data.Data
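The file-naming convention can be illustrated with a short sketch. The helper class_name_from_path is hypothetical and not part of pe; it only demonstrates the documented rule that the class name is the part of the file name before the first “_”, and that class_cond=False collapses everything into one class named “None”. How pe stores the resulting labels internally is not shown here.

```python
import os


def class_name_from_path(path, class_cond=True):
    """Return the class name encoded in an image file name (hypothetical helper)."""
    if not class_cond:
        # All images share one class when class conditioning is off.
        return "None"
    file_name = os.path.basename(path)
    # The class name is everything before the first underscore.
    return file_name.split("_")[0]


# Example file names; the paths are made up for illustration.
paths = ["data/cat_001.jpg", "data/pets/dog_a_b.png"]
labels = [class_name_from_path(p) for p in paths]
unconditional = [class_name_from_path(p, class_cond=False) for p in paths]
```

With pe installed, the documented call for such a folder would take the form load_image_folder(path="data", image_size=64), using the signature listed above.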