Training parameters for augreg2
#1 opened by pyone
May I know the training parameters for this model? For example, something like
./distributed_train.sh 4 /data/imagenet --model vit_base_patch16_224.augreg2_in21k_ft_in1k --sched cosine --epochs 150 --warmup-epochs 5 --lr 0.4 --reprob 0.5 --remode pixel --batch-size 256 --amp -j 4
First set (the vit_base_patch16_224.augreg_in21k run):

aa: rand-m8-inc1-mstd101
amp: true
amp_dtype: float16
amp_impl: native
aot_autograd: false
apex_amp: false
aug_repeats: 0
aug_splits: 0
batch_size: 512
bce_loss: false
bce_target_thresh: null
bn_eps: null
bn_momentum: null
channels_last: false
checkpoint_hist: 10
class_map: ''
clip_grad: 2.0
clip_mode: norm
color_jitter: 0.4
cooldown_epochs: 10
crop_pct: 1.0
cutmix: 1.0
cutmix_minmax: null
data_dir: /data/imagenet/
dataset: ''
dataset_download: false
decay_epochs: 100
decay_milestones:
- 30
- 60
decay_rate: 0.1
dist_bn: reduce
drop: 0.0
drop_block: null
drop_connect: null
drop_path: 0.1
epoch_repeats: 0.0
epochs: 50
eval_metric: top1
experiment: ''
fast_norm: false
fuser: ''
gp: null
grad_checkpointing: true
hflip: 0.5
img_size: null
in_chans: null
initial_checkpoint: ''
input_size: null
interpolation: ''
jsd_loss: false
layer_decay: 0.7
local_rank: 0
log_interval: 50
log_wandb: false
lr: 0.0002
lr_base: 0.1
lr_base_scale: ''
lr_base_size: 256
lr_cycle_decay: 0.5
lr_cycle_limit: 1
lr_cycle_mul: 1.0
lr_k_decay: 1.0
lr_noise:
- 0.1
- 0.9
lr_noise_pct: 0.67
lr_noise_std: 1.0
mean: null
min_lr: 5.0e-07
mixup: 0.8
mixup_mode: batch
mixup_off_epoch: 0
mixup_prob: 1.0
mixup_switch_prob: 0.5
model: vit_base_patch16_224.augreg_in21k
model_ema: true
model_ema_decay: 0.9998
model_ema_force_cpu: false
momentum: 0.9
native_amp: false
no_aug: false
no_ddp_bb: false
no_prefetcher: false
no_resume_opt: false
num_classes: 1000
opt: adamw
opt_betas: null
opt_eps: null
output: ''
patience_epochs: 10
pin_mem: false
pretrained: true
ratio:
- 0.75
- 1.3333333333333333
recount: 1
recovery_interval: 0
remode: pixel
reprob: 0.3
resplit: false
resume: ''
save_images: false
scale:
- 0.08
- 1.0
sched: cosine
sched_on_updates: true
seed: 42
smoothing: 0.1
split_bn: false
start_epoch: null
std: null
sync_bn: false
torchscript: false
train_interpolation: random
train_split: train
tta: 0
use_multi_epochs_loader: false
val_split: validation
validation_batch_size: null
vflip: 0.0
warmup_epochs: 10
warmup_lr: 0.0
warmup_prefix: true
weight_decay: 0.05
worker_seeding: all
workers: 8

Second set (the convnext_small.in12k run):

aa: rand-m8-inc1-mstd101
amp: true
amp_dtype: float16
amp_impl: native
aot_autograd: false
apex_amp: false
aug_repeats: 0
aug_splits: 0
batch_size: 512
bce_loss: false
bce_target_thresh: null
bn_eps: null
bn_momentum: null
channels_last: false
checkpoint_hist: 10
class_map: ''
clip_grad: 3.0
clip_mode: norm
color_jitter: 0.4
cooldown_epochs: 10
crop_pct: 1.0
cutmix: 0.0
cutmix_minmax: null
data: /data/imagenet/
data_dir: /data/imagenet/
dataset: ''
dataset_download: false
decay_epochs: 100
decay_milestones:
- 30
- 60
decay_rate: 0.1
dist_bn: reduce
drop: 0.0
drop_block: null
drop_connect: null
drop_path: 0.1
dynamo: false
dynamo_backend: null
epoch_repeats: 0.0
epochs: 50
eval_metric: top1
experiment: ''
fast_norm: false
fuser: ''
gp: null
grad_checkpointing: true
hflip: 0.5
img_size: null
in_chans: null
initial_checkpoint: ''
input_size: null
interpolation: ''
jsd_loss: false
layer_decay: 0.75
local_rank: 0
log_interval: 50
log_wandb: false
lr: 0.0001
lr_base: 0.1
lr_base_scale: ''
lr_base_size: 256
lr_cycle_decay: 0.5
lr_cycle_limit: 1
lr_cycle_mul: 1.0
lr_k_decay: 1.0
lr_noise:
- 0.1
- 1.0
lr_noise_pct: 0.67
lr_noise_std: 1.0
mean: null
min_lr: 5.0e-07
mixup: 0.3
mixup_mode: batch
mixup_off_epoch: 0
mixup_prob: 1.0
mixup_switch_prob: 0.5
model: convnext_small.in12k
model_ema: true
model_ema_decay: 0.9998
model_ema_force_cpu: false
momentum: 0.9
native_amp: false
no_aug: false
no_ddp_bb: false
no_prefetcher: false
no_resume_opt: false
num_classes: 1000
opt: adamw
opt_betas: null
opt_eps: null
output: ''
patience_epochs: 10
pin_mem: false
pretrained: true
ratio:
- 0.75
- 1.3333333333333333
recount: 1
recovery_interval: 0
remode: pixel
reprob: 0.3
resplit: false
resume: ''
save_images: false
scale:
- 0.08
- 1.0
sched: cosine
sched_on_updates: false
seed: 42
smoothing: 0.1
split_bn: false
start_epoch: null
std: null
sync_bn: false
torchcompile: null
torchscript: false
train_interpolation: random
train_split: train
tta: 0
use_multi_epochs_loader: false
val_split: validation
validation_batch_size: null
vflip: 0.0
warmup_epochs: 10
warmup_lr: 1.0e-06
warmup_prefix: false
weight_decay: 0.05
worker_seeding: all
workers: 8
@pyone
Those are two sets of hparams, a bit different but on the same theme, that ended up with similar results for my 'augreg2' runs. Basically, re-finetuning the in21k models from 'How to train your ViT' with better params; the key ingredient is the layer-wise LR decay (the `layer_decay` arg).
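The layer-wise LR decay mentioned above can be sketched as follows. This is an illustrative reconstruction of the idea behind timm's `layer_decay` argument, not timm's internal API: each successively deeper layer group gets a larger LR multiplier, with the head training at the full base LR.

```python
# Illustrative sketch of layer-wise LR decay (the `layer_decay` arg).
# Function name and grouping are assumptions for illustration only.

def layer_lr_scales(num_layers: int, layer_decay: float) -> list[float]:
    """Return one multiplicative LR scale per layer group.

    Group 0 is the patch embed / stem; group `num_layers` is the head.
    Scale for group i = layer_decay ** (num_layers - i), so earlier
    layers train with a much smaller effective LR.
    """
    return [layer_decay ** (num_layers - i) for i in range(num_layers + 1)]

# ViT-Base has 12 blocks; with layer_decay=0.7 (first config) the stem
# trains at roughly 0.7**12 ~ 0.014x the base LR, the head at 1.0x.
scales = layer_lr_scales(12, 0.7)
base_lr = 2e-4  # `lr` from the first config
per_group_lrs = [base_lr * s for s in scales]
```

In practice these scales would be applied by building per-group parameter groups for the optimizer, each with its own `lr`.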
That was on either 4 or 8 GPUs (multiply the per-GPU batch_size by the GPU count to get the global batch size); I think it was 4...
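The global batch size arithmetic, assuming the 4-GPU guess above (so treat the result as approximate):

```python
# Global batch size implied by the configs, under the assumption of 4 GPUs.
per_gpu_batch = 512   # `batch_size` in both configs is per process
num_gpus = 4          # "I think it was 4"
global_batch = per_gpu_batch * num_gpus
print(global_batch)  # 2048 (4096 if it was actually 8 GPUs)
```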
@rwightman Thank you very much. It is clear.
pyone changed discussion status to closed