pe.data.image package
Submodules
pe.data.image.camelyon17 module
- class pe.data.image.camelyon17.Camelyon17(split='train', root_dir='data', res=64)[source]
Bases: Data
The Camelyon17 dataset.
- __init__(split='train', root_dir='data', res=64)[source]
Constructor.
- Parameters:
split (str, optional) – The split of the dataset. It should be either “train”, “val”, or “test”, defaults to “train”
root_dir (str, optional) – The root directory to save the dataset, defaults to “data”
res (int, optional) – The resolution of the images, defaults to 64
- Raises:
ValueError – If the split is invalid
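The split check described above can be sketched as follows. This is a minimal illustration, not the library's code: the helper name `check_split` and the error message are hypothetical, and only the three allowed values and the `ValueError` come from the documentation.

```python
# Allowed values, as listed for the `split` parameter above.
VALID_SPLITS = ("train", "val", "test")


def check_split(split="train"):
    """Hypothetical helper: return `split` if valid, else raise ValueError."""
    if split not in VALID_SPLITS:
        raise ValueError(
            f"Invalid split {split!r}; expected one of {VALID_SPLITS}"
        )
    return split


if __name__ == "__main__":
    print(check_split("val"))
    try:
        check_split("validation")
    except ValueError as err:
        print(f"Rejected: {err}")
```

Validating the argument before any download or file access means a misspelled split fails immediately rather than after the dataset has been fetched.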
pe.data.image.cat module
- class pe.data.image.cat.Cat(root_dir='data', res=512)[source]
Bases: Data
The Cat dataset.
- URL = 'https://www.kaggle.com/api/v1/datasets/download/fjxmlzn/cat-cookie-doudou'
The URL of the dataset
pe.data.image.cifar10 module
pe.data.image.image module
- pe.data.image.image._list_image_files_recursively(data_dir)[source]
List all image files in a directory recursively. Adapted from https://github.com/openai/improved-diffusion/blob/main/improved_diffusion/image_datasets.py
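A recursive listing of this kind might look like the sketch below. It uses only the standard library and assumes the same extension set that `load_image_folder` documents ("jpg", "jpeg", "png", "gif"); the actual implementation in `pe` may differ in its details, and the function name here is illustrative.

```python
import os

# Extensions treated as images (assumed from the load_image_folder docs).
IMAGE_EXTENSIONS = {"jpg", "jpeg", "png", "gif"}


def list_image_files_recursively(data_dir):
    """Return paths of all image files under data_dir, in sorted order."""
    results = []
    for entry in sorted(os.listdir(data_dir)):
        full_path = os.path.join(data_dir, entry)
        # Compare the text after the last dot, case-insensitively.
        ext = entry.rsplit(".", 1)[-1].lower()
        if "." in entry and ext in IMAGE_EXTENSIONS and os.path.isfile(full_path):
            results.append(full_path)
        elif os.path.isdir(full_path):
            # Descend into subfolders; nesting depth is arbitrary.
            results.extend(list_image_files_recursively(full_path))
    return results
```

Sorting the directory entries keeps the returned order deterministic across runs and file systems.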
- pe.data.image.image.load_image_folder(path, image_size, class_cond=True, num_images=-1, num_workers=10, batch_size=1000)[source]
Load an image dataset from a folder that contains image files. The folder can be nested arbitrarily. File names must follow the format “{class_name}_{suffix}.ext”, where the class name contains no “_”, the suffix can be any string, and “ext” is “jpg”, “jpeg”, “png”, or “gif”. The class name is the part of the file name before the first “_”. If class_cond is False, the class names are ignored and all images are treated as a single class named “None”.
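The naming convention can be illustrated with a short sketch. The helper `class_name_from_path` is hypothetical and is not part of `pe`; it only mirrors the rule stated above (class name is the text before the first “_”, or “None” when class_cond is False).

```python
import os


def class_name_from_path(path, class_cond=True):
    """Hypothetical helper: derive the class name from an image file path."""
    if not class_cond:
        # All images share a single class when class conditioning is off.
        return "None"
    # Take the file name and keep everything before the first underscore.
    return os.path.basename(path).split("_")[0]


if __name__ == "__main__":
    print(class_name_from_path("data/train/cat_0001.png"))
    print(class_name_from_path("data/train/cat_0001.png", class_cond=False))
```

Because the split happens at the first “_”, a file named “dog_a_b.jpg” belongs to class “dog”, which is why class names themselves must not contain “_”.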
- Parameters:
path (str) – The path to the root folder that contains the image files
image_size (int) – The size of the images. Images will be resized to this size
class_cond (bool, optional) – Whether to treat the loaded dataset as class conditional, defaults to True
num_images (int, optional) – The number of images to load. If -1, load all images. Defaults to -1
num_workers (int, optional) – The number of workers to use for loading the images, defaults to 10
batch_size (int, optional) – The batch size to use for loading the images, defaults to 1000
- Returns:
The loaded data
- Return type: