
RotoBART

Running the script

Script arguments

Available model config arguments (see the config sketch after this list):

encoder_layers
encoder_ffn_dim
decoder_layers
decoder_ffn_dim
d_model
vocab_size
max_position_embeddings
encoder_layerdrop
decoder_layerdrop
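
These arguments map one-to-one onto fields of the model configuration. A minimal sketch of how they might be wired up, assuming RotoBART's config exposes the same fields as transformers' BartConfig (the actual config class lives in the rotobart repo; all values below are illustrative, not the script defaults):

from transformers import BartConfig

# Illustrative values only; RotoBART's real config class may differ.
config = BartConfig(
    encoder_layers=6,
    encoder_ffn_dim=4096,
    decoder_layers=6,
    decoder_ffn_dim=4096,
    d_model=1024,
    vocab_size=50265,
    max_position_embeddings=2048,
    encoder_layerdrop=0.0,
    decoder_layerdrop=0.0,
)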

Training Arguments:

testing: uses only a single batch, for smoke-testing the script

adafactor: enables the Adafactor optimizer; omitting the flag reverts to Adam (see the sketch after this list)

grad_accum: number of gradient accumulation steps (default: 4)

use_bf16: cast the model parameters to bfloat16

colab_tpu: set this when running on a Colab TPU

use_wandb: log metrics to Weights & Biases (via its TensorBoard integration)

save_strategy: whether to save model checkpoints every N steps or once per epoch
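
A rough sketch of what --adafactor and --use_bf16 control, using optax and JAX. The variable names here are illustrative; the actual flag handling lives in run_dnlm_flax.py:

import jax
import jax.numpy as jnp
import optax

use_adafactor = True   # --adafactor
use_bf16 = True        # --use_bf16
learning_rate = 1e-4

# Adafactor trades a little accuracy for much lower optimizer-state memory.
optimizer = (optax.adafactor(learning_rate=learning_rate)
             if use_adafactor
             else optax.adam(learning_rate=learning_rate))

# Dummy parameter tree standing in for the real model parameters.
params = {"dense": {"kernel": jnp.ones((4, 4)), "bias": jnp.zeros((4,))}}
if use_bf16:
    # Cast every parameter leaf to bfloat16 (halves parameter memory on TPU).
    params = jax.tree_util.tree_map(lambda p: p.astype(jnp.bfloat16), params)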

python rotobart/run_dnlm_flax.py \
  --output_dir rotobart_output \
  --overwrite_output_dir \
  --dataset_path rotobart/pile.py \
  --model_name_or_path rotobart \
  --tokenizer_name ./rotobart/vocab-2/the_pile.model \
  --shuffle_buffer_size 1000 \
  --do_train --do_eval \
  --max_seq_length 1024 \
  --encoder_layers 2 \
  --decoder_layers 2 \
  --per_device_train_batch_size 2 \
  --per_device_eval_batch_size 2 \
  --logging_steps 8 \
  --num_train_steps 1000 \
  --eval_steps 1000 \
  --save_steps 1000 \
  --save_strategy steps \
  --num_eval_samples 100 \
  --warmup_steps 30 \
  --learning_rate 1e-4 \
  --use_wandb \
  --testing \
  --use_bf16 \
  --adafactor
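
Since --tokenizer_name points at a SentencePiece model file rather than a Hugging Face tokenizer repo, a quick way to sanity-check the vocab before launching; the path is copied from the command above and may differ in your checkout:

import sentencepiece as spm

sp = spm.SentencePieceProcessor(model_file="rotobart/vocab-2/the_pile.model")
ids = sp.encode("RotoBART is a BART variant.", out_type=int)
print(sp.decode(ids))  # should round-trip the input text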

Alternatively, a full-scale training run:

python3 run_dnlm_flax.py \
  --output_dir rotobart_output \
  --overwrite_output_dir \
  --dataset_path pile.py \
  --model_name_or_path rotobart \
  --tokenizer_name vocab-2/the_pile.model \
  --shuffle_buffer_size 1000 \
  --do_train --do_eval \
  --max_position_embeddings 2048 \
  --max_seq_length 2048 \
  --encoder_layers 6 \
  --decoder_layers 6 \
  --per_device_train_batch_size 1 \
  --per_device_eval_batch_size 1 \
  --logging_steps 100 \
  --num_train_steps 50000 \
  --eval_steps 2500 \
  --save_steps 2500 \
  --save_strategy steps \
  --num_eval_samples 5000 \
  --warmup_steps 5000 \
  --learning_rate 1e-4 \
  --use_wandb \
  --use_bf16 \
  --adafactor
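
Back-of-envelope throughput for the run above, assuming 8 TPU cores and the script's default grad_accum of 4 (adjust for your hardware):

per_device_batch = 1
num_devices = 8          # assumption: a v3-8 style TPU
grad_accum = 4           # script default
seq_len = 2048

tokens_per_step = per_device_batch * num_devices * grad_accum * seq_len
print(tokens_per_step)            # 65,536 tokens per optimizer step
print(tokens_per_step * 50_000)   # ~3.3B tokens over num_train_steps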