timm documentation

Data

timm.data.create_dataset


( name: str root: typing.Optional[str] = None split: str = 'validation' search_split: bool = True class_map: dict = None load_bytes: bool = False is_training: bool = False download: bool = False batch_size: int = 1 num_samples: typing.Optional[int] = None seed: int = 42 repeats: int = 0 input_img_mode: str = 'RGB' **kwargs )

Parameters

  • name — dataset name, empty is okay for folder based datasets
  • root — root folder of dataset (all)
  • split — dataset split (all)
  • search_split — search for a split-specific child folder under root, so one can specify imagenet/ instead of imagenet/val on the command line / in config. (folder, torch/folder)
  • class_map — specify class -> index mapping via text file or dict (folder)
  • load_bytes — load data, return images as undecoded bytes (folder)
  • download — download dataset if not present and supported (HFDS, TFDS, torch)
  • is_training — create dataset in train mode; this is distinct from the split. For iterable datasets (TFDS, WDS) it enables shuffling; ignored for other datasets. (TFDS, WDS)
  • batch_size — batch size hint for iterable datasets (TFDS, WDS)
  • seed — seed for iterable datasets (TFDS, WDS)
  • repeats — dataset repeats per iteration i.e. epoch (TFDS, WDS)
  • input_img_mode — Input image color conversion mode e.g. ‘RGB’, ‘L’ (folder, TFDS, WDS, HFDS)
  • **kwargs — other args to pass to dataset

Dataset factory method

In parentheses after each arg are the types of dataset that support it, one of:

  • folder - default, timm folder (or tar) based ImageDataset
  • torch - torchvision based datasets
  • HFDS - Hugging Face Datasets
  • TFDS - TensorFlow Datasets wrapper in an IterableDataset interface via IterableImageDataset
  • WDS - WebDataset
  • all - any of the above

timm.data.create_loader


( dataset: typing.Union[timm.data.dataset.ImageDataset, timm.data.dataset.IterableImageDataset] input_size: typing.Union[int, typing.Tuple[int, int], typing.Tuple[int, int, int]] batch_size: int is_training: bool = False no_aug: bool = False re_prob: float = 0.0 re_mode: str = 'const' re_count: int = 1 re_split: bool = False train_crop_mode: typing.Optional[str] = None scale: typing.Optional[typing.Tuple[float, float]] = None ratio: typing.Optional[typing.Tuple[float, float]] = None hflip: float = 0.5 vflip: float = 0.0 color_jitter: float = 0.4 color_jitter_prob: typing.Optional[float] = None grayscale_prob: float = 0.0 gaussian_blur_prob: float = 0.0 auto_augment: typing.Optional[str] = None num_aug_repeats: int = 0 num_aug_splits: int = 0 interpolation: str = 'bilinear' mean: typing.Tuple[float, ...] = (0.485, 0.456, 0.406) std: typing.Tuple[float, ...] = (0.229, 0.224, 0.225) num_workers: int = 1 distributed: bool = False crop_pct: typing.Optional[float] = None crop_mode: typing.Optional[str] = None crop_border_pixels: typing.Optional[int] = None collate_fn: typing.Optional[typing.Callable] = None pin_memory: bool = False fp16: bool = False img_dtype: dtype = torch.float32 device: device = device(type='cuda') use_prefetcher: bool = True use_multi_epochs_loader: bool = False persistent_workers: bool = True worker_seeding: str = 'all' tf_preprocessing: bool = False )

Parameters

  • dataset — The image dataset to load.
  • input_size — Target input size (channels, height, width) tuple or size scalar.
  • batch_size — Number of samples in a batch.
  • is_training — Return training (random) transforms.
  • no_aug — Disable augmentation for training (useful for debug).
  • re_prob — Random erasing probability.
  • re_mode — Random erasing fill mode.
  • re_count — Number of random erasing regions.
  • re_split — Control split of random erasing across batch size.
  • scale — Random resize scale range (crop area, < 1.0 => zoom in).
  • ratio — Random aspect ratio range (crop ratio for RRC, ratio adjustment factor for RKR).
  • hflip — Horizontal flip probability.
  • vflip — Vertical flip probability.
  • color_jitter — Random color jitter component factors (brightness, contrast, saturation, hue). Scalar is applied as (scalar,) * 3 (no hue).
  • color_jitter_prob — Apply color jitter with this probability if not None (for SimCLR-like aug).
  • grayscale_prob — Probability of converting image to grayscale (for SimCLR-like aug).
  • gaussian_blur_prob — Probability of applying gaussian blur (for SimCLR-like aug).
  • auto_augment — Auto augment configuration string (see auto_augment.py).
  • num_aug_repeats — Enable special sampler to repeat same augmentation across distributed GPUs.
  • num_aug_splits — Enable mode where augmentations can be split across the batch.
  • interpolation — Image interpolation mode.
  • mean — Image normalization mean.
  • std — Image normalization standard deviation.
  • num_workers — Num worker processes per DataLoader.
  • distributed — Enable dataloading for distributed training.
  • crop_pct — Inference crop percentage (output size / resize size).
  • crop_mode — Inference crop mode. One of [‘squash’, ‘border’, ‘center’]. Defaults to ‘center’ when None.
  • crop_border_pixels — Inference crop border of specified # pixels around edge of original image.
  • collate_fn — Override default collate_fn.
  • pin_memory — Pin memory for device transfer.
  • fp16 — Deprecated argument for half-precision input dtype. Use img_dtype.
  • img_dtype — Data type for input image.
  • device — Device to transfer inputs and targets to.
  • use_prefetcher — Use efficient pre-fetcher to load samples onto device.
  • use_multi_epochs_loader — Use a DataLoader variant that reuses worker processes across epochs.
  • persistent_workers — Enable persistent worker processes.
  • worker_seeding — Control worker random seeding at init.
  • tf_preprocessing — Use TF 1.0 inference preprocessing for testing model ports.

timm.data.create_transform


( input_size: typing.Union[int, typing.Tuple[int, int], typing.Tuple[int, int, int]] = 224 is_training: bool = False no_aug: bool = False train_crop_mode: typing.Optional[str] = None scale: typing.Optional[typing.Tuple[float, float]] = None ratio: typing.Optional[typing.Tuple[float, float]] = None hflip: float = 0.5 vflip: float = 0.0 color_jitter: typing.Union[float, typing.Tuple[float, ...]] = 0.4 color_jitter_prob: typing.Optional[float] = None grayscale_prob: float = 0.0 gaussian_blur_prob: float = 0.0 auto_augment: typing.Optional[str] = None interpolation: str = 'bilinear' mean: typing.Tuple[float, ...] = (0.485, 0.456, 0.406) std: typing.Tuple[float, ...] = (0.229, 0.224, 0.225) re_prob: float = 0.0 re_mode: str = 'const' re_count: int = 1 re_num_splits: int = 0 crop_pct: typing.Optional[float] = None crop_mode: typing.Optional[str] = None crop_border_pixels: typing.Optional[int] = None tf_preprocessing: bool = False use_prefetcher: bool = False normalize: bool = True separate: bool = False )

Parameters

  • input_size — Target input size (channels, height, width) tuple or size scalar.
  • is_training — Return training (random) transforms.
  • no_aug — Disable augmentation for training (useful for debug).
  • train_crop_mode — Training random crop mode (‘rrc’, ‘rkrc’, ‘rkrr’).
  • scale — Random resize scale range (crop area, < 1.0 => zoom in).
  • ratio — Random aspect ratio range (crop ratio for RRC, ratio adjustment factor for RKR).
  • hflip — Horizontal flip probability.
  • vflip — Vertical flip probability.
  • color_jitter — Random color jitter component factors (brightness, contrast, saturation, hue). Scalar is applied as (scalar,) * 3 (no hue).
  • color_jitter_prob — Apply color jitter with this probability if not None (for SimCLR-like aug).
  • grayscale_prob — Probability of converting image to grayscale (for SimCLR-like aug).
  • gaussian_blur_prob — Probability of applying gaussian blur (for SimCLR-like aug).
  • auto_augment — Auto augment configuration string (see auto_augment.py).
  • interpolation — Image interpolation mode.
  • mean — Image normalization mean.
  • std — Image normalization standard deviation.
  • re_prob — Random erasing probability.
  • re_mode — Random erasing fill mode.
  • re_count — Number of random erasing regions.
  • re_num_splits — Control split of random erasing across batch size.
  • crop_pct — Inference crop percentage (output size / resize size).
  • crop_mode — Inference crop mode. One of [‘squash’, ‘border’, ‘center’]. Defaults to ‘center’ when None.
  • crop_border_pixels — Inference crop border of specified # pixels around edge of original image.
  • tf_preprocessing — Use TF 1.0 inference preprocessing for testing model ports.
  • use_prefetcher — Pre-fetcher enabled. Do not convert image to tensor or normalize.
  • normalize — Normalize tensor output with provided mean/std (if prefetcher not used).
  • separate — Output transforms in 3-stage tuple.

timm.data.resolve_data_config


( args = None pretrained_cfg = None model = None use_test_size = False verbose = False )
