Jinchen committed
Commit 91a58cf
1 parent: 6ff677e

Update README.md

Files changed (1): README.md (+31 −2)
README.md CHANGED
@@ -26,7 +26,7 @@ should probably proofread and complete it, then remove this comment. -->
 
 # vqa
 
-This model is a fine-tuned version of [unc-nlp/lxmert-base-uncased](https://huggingface.co/unc-nlp/lxmert-base-uncased) on the Graphcore/vqa-lxmert dataset.
+This model is a fine-tuned version of [unc-nlp/lxmert-base-uncased](https://huggingface.co/unc-nlp/lxmert-base-uncased) on the [Graphcore/vqa-lxmert](https://huggingface.co/datasets/Graphcore/vqa-lxmert) dataset.
 It achieves the following results on the evaluation set:
 - Loss: 0.0009
 - Accuracy: 0.7242
@@ -41,10 +41,39 @@ More information needed
 
 ## Training and evaluation data
 
-More information needed
+[Graphcore/vqa-lxmert](https://huggingface.co/datasets/Graphcore/vqa-lxmert) dataset
 
 ## Training procedure
 
+Trained on 16 Graphcore Mk2 IPUs using [optimum-graphcore](https://github.com/huggingface/optimum-graphcore).
+
+Command line:
+
+```
+python examples/language-modeling/run_clm.py \
+--model_name_or_path gpt2 \
+--ipu_config_name Graphcore/gpt2-small-ipu \
+--dataset_name wikitext \
+--dataset_config_name wikitext-103-raw-v1 \
+--do_train \
+--do_eval \
+--num_train_epochs 10 \
+--dataloader_num_workers 64 \
+--per_device_train_batch_size 1 \
+--per_device_eval_batch_size 1 \
+--gradient_accumulation_steps 128 \
+--output_dir /tmp/clm_output \
+--logging_steps 5 \
+--learning_rate 1e-5 \
+--lr_scheduler_type linear \
+--loss_scaling 16384 \
+--weight_decay 0.01 \
+--warmup_ratio 0.1 \
+--ipu_config_overrides="embedding_serialization_factor=4,optimizer_state_offchip=true,inference_device_iterations=5" \
+--dataloader_drop_last \
+--pod_type pod16
+```
+
 ### Training hyperparameters
 
 The following hyperparameters were used during training:
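For the command recorded in the diff, the number of samples consumed per weight update is the product of the micro-batch size, the gradient accumulation steps, and the IPU replication factor; only the first two appear on the command line, while the replication factor comes from the IPU config. A quick sketch of that arithmetic (the replication factor below is an assumed value for illustration, not read from the actual `Graphcore/gpt2-small-ipu` config):

```python
# Samples per optimizer step = micro-batch x gradient accumulation x replication.
# The first two values come from the recorded command line; the replication
# factor is an assumption here, since it is set by the IPU config, not the CLI.
per_device_train_batch_size = 1    # --per_device_train_batch_size 1
gradient_accumulation_steps = 128  # --gradient_accumulation_steps 128
replication_factor = 4             # assumed for illustration

global_batch_size = (per_device_train_batch_size
                     * gradient_accumulation_steps
                     * replication_factor)
print(global_batch_size)  # 512 under these assumptions
```

With a micro-batch of 1, gradient accumulation does almost all the work of building a usable effective batch size, which is a common pattern on memory-constrained accelerators.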
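The recorded command also sets `--loss_scaling 16384`. Loss scaling multiplies the loss before the backward pass so that very small half-precision gradients do not underflow to zero, and the gradients are divided by the same factor before the optimizer step. A minimal sketch of why the scale matters (the threshold check below stands in for real fp16 hardware behaviour; it is not optimum-graphcore code):

```python
# Static loss scaling, as selected by --loss_scaling 16384: scale the loss up
# before backward so tiny gradients stay representable in float16, then divide
# the gradients by the same factor before the weight update.
LOSS_SCALE = 16384.0
FP16_MIN_SUBNORMAL = 2.0 ** -24  # ~5.96e-08, smallest positive float16 value

def fits_in_fp16(x):
    # A positive value below the smallest subnormal flushes to zero in float16.
    return x >= FP16_MIN_SUBNORMAL

raw_grad = 1e-8                                  # would underflow to 0.0 in fp16
print(fits_in_fp16(raw_grad))                    # False
print(fits_in_fp16(raw_grad * LOSS_SCALE))       # True: 1.6384e-04 is representable
```

Because 16384 is a power of two, scaling and unscaling are exact in floating point, so the recovered gradients keep their true magnitude.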