---
language:
- en
tags:
- generated_from_trainer
datasets:
- glue
metrics:
- accuracy
model-index:
- name: yujiepan/bert-base-uncased-sst2-int8-unstructured80
  results:
  - task:
      name: Text Classification
      type: text-classification
    dataset:
      name: GLUE SST2
      type: glue
      config: sst2
      split: validation
      args: sst2
    metrics:
    - name: Accuracy
      type: accuracy
      value: 0.91284
pipeline_tag: text-classification
---

# Joint magnitude pruning, quantization and distillation on BERT-base/SST-2

This model was produced by applying unstructured magnitude pruning, quantization and knowledge distillation jointly while fine-tuning BERT-base on the GLUE SST-2 dataset. It achieves the following results on the evaluation set:

- Torch accuracy: 0.9128
- OpenVINO IR accuracy: 0.9128
- Sparsity in transformer block linear layers: 0.80

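The distillation loss is blended with the task loss through the `--distillation_weight` and `--distillation_temperature` flags used in the Run section below. As a rough sketch of this standard formulation (a hypothetical helper, not the training script's actual code):

```python
import torch.nn.functional as F

def joint_loss(student_logits, teacher_logits, labels,
               weight=0.95, temperature=2.0):
    """Blend soft-target distillation with the ordinary task loss.

    `weight` and `temperature` mirror --distillation_weight and
    --distillation_temperature below; the exact formulation in
    run_glue.py may differ.
    """
    # KL divergence between temperature-softened distributions, scaled by
    # T^2 to keep gradients comparable across temperatures (Hinton et al.).
    kd = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=-1),
        F.softmax(teacher_logits / temperature, dim=-1),
        reduction="batchmean",
    ) * temperature ** 2
    ce = F.cross_entropy(student_logits, labels)  # standard cross-entropy
    return weight * kd + (1.0 - weight) * ce
```
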
## Setup

```
conda install pytorch torchvision torchaudio pytorch-cuda=11.6 -c pytorch -c nvidia
pip install optimum[openvino,nncf]==1.7.0
# TODO
pip install wandb  # optional
```

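To sanity-check that the environment resolved as expected (the version numbers come from the Framework versions section at the bottom of this card):

```python
# Print installed versions to compare against the ones used for training.
from importlib.metadata import version

import torch

print("torch:", torch.__version__)     # expect 1.13.1+cu116
print("optimum:", version("optimum"))  # expect 1.7.0
print("nncf:", version("nncf"))
```
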
## NNCF config

See `nncf_config.json` in this repo.

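As a rough guide to its shape, a joint sparsity + quantization config combines two NNCF algorithms, along the lines of the hypothetical sketch below (written as a Python dict for illustration; the actual JSON file in this repo is authoritative):

```python
# Illustrative only -- the real settings live in nncf_config.json.
nncf_config = {
    "input_info": [
        {"sample_size": [32, 128], "type": "long", "keyword": "input_ids"},
    ],
    "compression": [
        {
            # Gradually zero out the smallest-magnitude weights until the
            # target sparsity is reached in the transformer linear layers.
            "algorithm": "magnitude_sparsity",
            "params": {"schedule": "polynomial", "sparsity_target": 0.8},
        },
        # 8-bit quantization of weights and activations for the OpenVINO IR.
        {"algorithm": "quantization"},
    ],
}
```
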
## Run

We use a single GPU for training.

```
NNCFCFG=/path/to/nncf/config
python run_glue.py \
    --lr_scheduler_type cosine_with_restarts \
    --cosine_cycle_ratios 11,6 \
    --cosine_cycle_decays 1,1 \
    --save_best_model_after_epoch -1 \
    --save_best_model_after_sparsity 0.7999 \
    --model_name_or_path textattack/bert-base-uncased-SST-2 \
    --teacher_model_or_path yoshitomo-matsubara/bert-large-uncased-sst2 \
    --distillation_temperature 2 \
    --task_name sst2 \
    --nncf_compression_config $NNCFCFG \
    --distillation_weight 0.95 \
    --output_dir /tmp/bert-base-uncased-sst2-int8-unstructured80-17epoch \
    --run_name bert-base-uncased-sst2-int8-unstructured80-17epoch \
    --overwrite_output_dir \
    --do_train \
    --do_eval \
    --max_seq_length 128 \
    --per_device_train_batch_size 32 \
    --per_device_eval_batch_size 32 \
    --learning_rate 5e-05 \
    --optim adamw_torch \
    --num_train_epochs 17 \
    --logging_steps 1 \
    --evaluation_strategy steps \
    --eval_steps 250 \
    --save_strategy steps \
    --save_steps 250 \
    --save_total_limit 1 \
    --fp16 \
    --seed 1
```

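Note that `--cosine_cycle_ratios 11,6` splits the 17 training epochs into two cosine cycles of 11 and 6 epochs, and `--cosine_cycle_decays 1,1` keeps the peak learning rate unchanged across the two cycles. One plausible reading of this schedule, as a hypothetical sketch (not the training script's implementation):

```python
import math

def cosine_with_restarts_lr(step, total_steps, base_lr=5e-5,
                            cycle_ratios=(11, 6), cycle_decays=(1.0, 1.0)):
    """Piecewise cosine annealing: cycle lengths follow `cycle_ratios`,
    and each cycle's peak LR is scaled by the matching `cycle_decays` entry."""
    start = 0.0
    for ratio, decay in zip(cycle_ratios, cycle_decays):
        length = total_steps * ratio / sum(cycle_ratios)
        if step < start + length:
            progress = (step - start) / length  # 0 -> 1 within this cycle
            return base_lr * decay * 0.5 * (1.0 + math.cos(math.pi * progress))
        start += length
    return 0.0  # schedule exhausted
```
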
The best model checkpoint is stored in the `best_model` folder. Only that checkpoint folder, together with some config files, is uploaded to this repo.

## Inference

An inference example is available in this gist: https://gist.github.com/yujiepan-work/c38dc4e56c7a9d803c42988f7b7d260a

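For reference, loading the uploaded OpenVINO IR with `optimum.intel` typically looks like the sketch below; see the gist above for the exact code.

```python
from optimum.intel.openvino import OVModelForSequenceClassification
from transformers import AutoTokenizer, pipeline

model_id = "yujiepan/bert-base-uncased-sst2-int8-unstructured80"
# Load the quantized OpenVINO IR checkpoint from this repo.
model = OVModelForSequenceClassification.from_pretrained(model_id)
tokenizer = AutoTokenizer.from_pretrained(model_id)

classifier = pipeline("text-classification", model=model, tokenizer=tokenizer)
print(classifier("a charming and often affecting journey"))
```
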
## Framework versions

- Transformers 4.26.0
- PyTorch 1.13.1+cu116
- Datasets 2.8.0
- Tokenizers 0.13.2

For a full description of the environment, please refer to `pip-requirements.txt` and `conda-requirements.txt`.