Student0809 committed on
Commit 0947ff8 · verified · 1 Parent(s): 8613355

Add files using upload-large-folder tool

This view is limited to 50 files because it contains too many changes. See raw diff.
Files changed (50)
  1. .gitattributes +3 -0
  2. asset/banner.png +3 -0
  3. docs/resources/dpo_data.png +3 -0
  4. docs/resources/web-ui.jpg +3 -0
  5. docs/source/Instruction/评测.md +269 -0
  6. docs/source_en/BestPractices/Rapidly-Training-VL-model.md +228 -0
  7. docs/source_en/Customization/Custom-model.md +35 -0
  8. docs/transformers/.github/ISSUE_TEMPLATE/bug-report.yml +134 -0
  9. docs/transformers/.github/ISSUE_TEMPLATE/config.yml +12 -0
  10. docs/transformers/.github/ISSUE_TEMPLATE/migration.yml +72 -0
  11. docs/transformers/.github/conda/build.sh +1 -0
  12. docs/transformers/.github/conda/meta.yaml +56 -0
  13. docs/transformers/.github/workflows/assign-reviewers.yml +26 -0
  14. docs/transformers/.github/workflows/build-docker-images.yml +393 -0
  15. docs/transformers/.github/workflows/build-nightly-ci-docker-images.yml +67 -0
  16. docs/transformers/.github/workflows/change_pr_to_draft.yml +25 -0
  17. docs/transformers/.github/workflows/check_failed_model_tests.yml +128 -0
  18. docs/transformers/.github/workflows/check_tiny_models.yml +82 -0
  19. docs/transformers/.github/workflows/doctest_job.yml +83 -0
  20. docs/transformers/.github/workflows/model_jobs.yml +142 -0
  21. docs/transformers/.github/workflows/model_jobs_amd.yml +128 -0
  22. docs/transformers/.github/workflows/new_model_pr_merged_notification.yml +68 -0
  23. docs/transformers/.github/workflows/push-important-models.yml +135 -0
  24. docs/transformers/.github/workflows/release-conda.yml +47 -0
  25. docs/transformers/.github/workflows/self-comment-ci.yml +416 -0
  26. docs/transformers/.github/workflows/self-nightly-past-ci-caller.yml +99 -0
  27. docs/transformers/.github/workflows/self-past-caller.yml +40 -0
  28. docs/transformers/.github/workflows/self-push-amd-mi210-caller.yml +25 -0
  29. docs/transformers/.github/workflows/self-push-amd-mi300-caller.yml +25 -0
  30. docs/transformers/.github/workflows/self-push-amd.yml +334 -0
  31. docs/transformers/.github/workflows/self-push.yml +652 -0
  32. docs/transformers/.github/workflows/self-scheduled-amd-caller.yml +14 -0
  33. docs/transformers/.github/workflows/self-scheduled-amd-mi210-caller.yml +55 -0
  34. docs/transformers/.github/workflows/self-scheduled.yml +598 -0
  35. docs/transformers/.github/workflows/ssh-runner.yml +113 -0
  36. docs/transformers/.github/workflows/stale.yml +29 -0
  37. docs/transformers/.github/workflows/trufflehog.yml +20 -0
  38. docs/transformers/.github/workflows/update_metdata.yml +27 -0
  39. docs/transformers/.github/workflows/upload_pr_documentation.yml +16 -0
  40. docs/transformers/README.md +322 -0
  41. docs/transformers/benchmark/README.md +49 -0
  42. docs/transformers/benchmark/__init__.py +0 -0
  43. docs/transformers/benchmark/benchmark.py +326 -0
  44. docs/transformers/benchmark/benchmarks_entrypoint.py +143 -0
  45. docs/transformers/benchmark/config/generation.yaml +57 -0
  46. docs/transformers/benchmark/default.yml +10 -0
  47. docs/transformers/benchmark/grafana_dashboard.json +2375 -0
  48. docs/transformers/benchmark/grafana_datasource.yaml +17 -0
  49. docs/transformers/benchmark/init_db.sql +33 -0
  50. docs/transformers/benchmark/llama.py +342 -0
.gitattributes CHANGED
@@ -33,3 +33,6 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
33
  *.zip filter=lfs diff=lfs merge=lfs -text
34
  *.zst filter=lfs diff=lfs merge=lfs -text
35
  *tfevents* filter=lfs diff=lfs merge=lfs -text
36
+ asset/banner.png filter=lfs diff=lfs merge=lfs -text
37
+ docs/resources/web-ui.jpg filter=lfs diff=lfs merge=lfs -text
38
+ docs/resources/dpo_data.png filter=lfs diff=lfs merge=lfs -text
asset/banner.png ADDED

Git LFS Details

  • SHA256: aed7b0ac0bbb353df62f86b80e26eeab10fc69a0a49de161c544b51ab4ea9bea
  • Pointer size: 131 Bytes
  • Size of remote file: 381 kB
docs/resources/dpo_data.png ADDED

Git LFS Details

  • SHA256: 8d87ca58f3ac43a79836ba40f5d9cb788b8fabc26d717fb4e8eb1f1400f6598d
  • Pointer size: 131 Bytes
  • Size of remote file: 355 kB
docs/resources/web-ui.jpg ADDED

Git LFS Details

  • SHA256: 5e83bb4b4ecda9386286b99c6e83551f1dd1fdcdaf7be2efa3117208e3806000
  • Pointer size: 131 Bytes
  • Size of remote file: 182 kB
docs/source/Instruction/评测.md ADDED
@@ -0,0 +1,269 @@
+ # Evaluation
+
+ SWIFT provides an eval (evaluation) capability that produces standardized evaluation metrics for both original models and trained models.
+
+ ## Capability Overview
+
+ SWIFT's eval capability is built on the ModelScope community's [EvalScope evaluation framework](https://github.com/modelscope/eval-scope), with a high-level wrapper on top to support the evaluation needs of various kinds of models.
+
+ > Note: EvalScope supports many other advanced capabilities, such as [model performance (stress) testing](https://evalscope.readthedocs.io/zh-cn/latest/user_guides/stress_test/quick_start.html); for those, please use the EvalScope framework directly.
+
+ We currently support the evaluation workflow for **standard benchmarks** as well as for **user-defined** datasets. The **standard benchmarks** are served by three evaluation backends:
+
+ The supported dataset names are listed below; for details on each dataset, please refer to [all supported datasets](https://evalscope.readthedocs.io/zh-cn/latest/get_started/supported_dataset.html).
+
+ 1. Native (default):
+
+ Mainly supports plain-text evaluation, and **supports** visualization of evaluation results
+ ```text
+ 'arc', 'bbh', 'ceval', 'cmmlu', 'competition_math',
+ 'general_qa', 'gpqa', 'gsm8k', 'hellaswag', 'humaneval',
+ 'ifeval', 'iquiz', 'mmlu', 'mmlu_pro',
+ 'race', 'trivia_qa', 'truthful_qa'
+ ```
+
+ 2. OpenCompass:
+
+ Mainly supports plain-text evaluation; visualization of evaluation results is **not yet supported**
+ ```text
+ 'obqa', 'cmb', 'AX_b', 'siqa', 'nq', 'mbpp', 'winogrande', 'mmlu', 'BoolQ', 'cluewsc', 'ocnli', 'lambada',
+ 'CMRC', 'ceval', 'csl', 'cmnli', 'bbh', 'ReCoRD', 'math', 'humaneval', 'eprstmt', 'WSC', 'storycloze',
+ 'MultiRC', 'RTE', 'chid', 'gsm8k', 'AX_g', 'bustm', 'afqmc', 'piqa', 'lcsts', 'strategyqa', 'Xsum', 'agieval',
+ 'ocnli_fc', 'C3', 'tnews', 'race', 'triviaqa', 'CB', 'WiC', 'hellaswag', 'summedits', 'GaokaoBench',
+ 'ARC_e', 'COPA', 'ARC_c', 'DRCD'
+ ```
+
+ 3. VLMEvalKit:
+
+ Mainly supports multimodal evaluation; visualization of evaluation results is **not yet supported**
+ ```text
+ 'COCO_VAL', 'MME', 'HallusionBench', 'POPE', 'MMBench_DEV_EN', 'MMBench_TEST_EN', 'MMBench_DEV_CN', 'MMBench_TEST_CN',
+ 'MMBench', 'MMBench_CN', 'MMBench_DEV_EN_V11', 'MMBench_TEST_EN_V11', 'MMBench_DEV_CN_V11',
+ 'MMBench_TEST_CN_V11', 'MMBench_V11', 'MMBench_CN_V11', 'SEEDBench_IMG', 'SEEDBench2',
+ 'SEEDBench2_Plus', 'ScienceQA_VAL', 'ScienceQA_TEST', 'MMT-Bench_ALL_MI', 'MMT-Bench_ALL',
+ 'MMT-Bench_VAL_MI', 'MMT-Bench_VAL', 'AesBench_VAL', 'AesBench_TEST', 'CCBench', 'AI2D_TEST', 'MMStar',
+ 'RealWorldQA', 'MLLMGuard_DS', 'BLINK', 'OCRVQA_TEST', 'OCRVQA_TESTCORE', 'TextVQA_VAL', 'DocVQA_VAL',
+ 'DocVQA_TEST', 'InfoVQA_VAL', 'InfoVQA_TEST', 'ChartQA_TEST', 'MathVision', 'MathVision_MINI',
+ 'MMMU_DEV_VAL', 'MMMU_TEST', 'OCRBench', 'MathVista_MINI', 'LLaVABench', 'MMVet', 'MTVQA_TEST',
+ 'MMLongBench_DOC', 'VCR_EN_EASY_500', 'VCR_EN_EASY_100', 'VCR_EN_EASY_ALL', 'VCR_EN_HARD_500',
+ 'VCR_EN_HARD_100', 'VCR_EN_HARD_ALL', 'VCR_ZH_EASY_500', 'VCR_ZH_EASY_100', 'VCR_ZH_EASY_ALL',
+ 'VCR_ZH_HARD_500', 'VCR_ZH_HARD_100', 'VCR_ZH_HARD_ALL', 'MMDU', 'MMBench-Video', 'Video-MME'
+ ```
+
+ ## Environment Setup
+
+ ```shell
+ pip install ms-swift[eval] -U
+ ```
+
+ Or install from source:
+
+ ```shell
+ git clone https://github.com/modelscope/ms-swift.git
+ cd ms-swift
+ pip install -e '.[eval]'
+ ```
+
+ ## Evaluation
+
+ Four evaluation modes are supported: plain-text evaluation, multimodal evaluation, URL-based evaluation, and custom-dataset evaluation.
+
+ **Basic Example**
+
+ ```shell
+ CUDA_VISIBLE_DEVICES=0 \
+ swift eval \
+ --model Qwen/Qwen2.5-0.5B-Instruct \
+ --eval_backend Native \
+ --infer_backend pt \
+ --eval_limit 10 \
+ --eval_dataset gsm8k
+ ```
+ Where:
+ - model: a local model path or a model ID on ModelScope
+ - eval_backend: one of Native, OpenCompass, VLMEvalKit; defaults to Native
+ - infer_backend: one of pt, vllm, lmdeploy; defaults to pt
+ - eval_limit: number of samples drawn from each benchmark; defaults to None, which uses all of the data; useful for quick sanity checks
+ - eval_dataset: evaluation datasets; multiple datasets can be specified, separated by spaces
+
+ The full list of evaluation parameters can be found [here](命令行参数.md#评测参数).
+
+ ## Evaluation During Training
+
+ SWIFT supports evaluating the current model with EvalScope during training, so that you can track the training progress in time.
+
+ **Basic Example**
+
+ ```shell
+ CUDA_VISIBLE_DEVICES=0 \
+ swift sft \
+ --model "Qwen/Qwen2.5-0.5B-Instruct" \
+ --train_type "lora" \
+ --dataset "AI-ModelScope/alpaca-gpt4-data-zh#100" \
+ --torch_dtype "bfloat16" \
+ --num_train_epochs "1" \
+ --per_device_train_batch_size "1" \
+ --learning_rate "1e-4" \
+ --lora_rank "8" \
+ --lora_alpha "32" \
+ --target_modules "all-linear" \
+ --gradient_accumulation_steps "16" \
+ --save_steps "50" \
+ --save_total_limit "5" \
+ --logging_steps "5" \
+ --max_length "2048" \
+ --eval_strategy "steps" \
+ --eval_steps "5" \
+ --per_device_eval_batch_size "5" \
+ --eval_use_evalscope \
+ --eval_datasets "gsm8k" \
+ --eval_datasets_args '{"gsm8k": {"few_shot_num": 0}}' \
+ --eval_limit "10"
+ ```
+
+ Note that the launch command is `sft`; the eval-related parameters are:
+ - eval_strategy: evaluation strategy. Defaults to None, following the `save_strategy` setting
+ - eval_steps: defaults to None; if an evaluation dataset is present, it follows the `save_steps` setting
+ - eval_use_evalscope: whether to use EvalScope for evaluation; this flag must be set to enable it
+ - eval_datasets: evaluation datasets; multiple datasets can be specified, separated by spaces
+ - eval_datasets_args: evaluation dataset arguments in JSON format; arguments for multiple datasets can be provided
+ - eval_limit: number of samples drawn from the evaluation datasets
+ - eval_generation_config: model inference configuration used during evaluation, in JSON format; defaults to `{'max_tokens': 512}`
+
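+ The generation settings used during these in-training evaluations can also be overridden. As an illustration (the values below are hypothetical), `--eval_generation_config` can be appended to the `swift sft` command above:
+
+ ```shell
+ --eval_generation_config '{"max_tokens": 1024, "temperature": 0}'
+ ```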
+
+ More evaluation examples can be found in [examples](https://github.com/modelscope/ms-swift/tree/main/examples/eval).
+
+ ## Custom Evaluation Datasets
+
+ The framework supports two predefined dataset formats: multiple-choice questions and question answering. The workflow is as follows:
+
+ *Note: when using custom evaluation datasets, the eval_backend parameter must be Native.*
+
+ ### Multiple-Choice Format (MCQ)
+ Suitable for multiple-choice scenarios; the evaluation metric is accuracy.
+
+ **Data Preparation**
+
+ Prepare CSV files in the multiple-choice format, organized in a directory as follows:
+
+ ```text
+ mcq/
+ ├── example_dev.csv # (optional) named `{subset_name}_dev.csv`, used for few-shot evaluation
+ └── example_val.csv # named `{subset_name}_val.csv`, the data used for the actual evaluation
+ ```
+
+ The CSV files must follow the format below:
+
+ ```text
+ id,question,A,B,C,D,answer
+ 1,Generally speaking the number of amino acid types that make up animal proteins is ____,4 kinds,22 kinds,20 kinds,19 kinds,C
+ 2,Which of the following substances found in the blood is not a metabolic end product____,urea,uric acid,pyruvate,carbon dioxide,C
+ ```
+ Where:
+ - `id` is the index (optional)
+ - `question` is the question
+ - `A`, `B`, `C`, `D`, etc. are the options; up to 10 options are supported
+ - `answer` is the correct option
+
+ **Running the Evaluation**
+
+ Run the following command:
+
+ ```bash
+ CUDA_VISIBLE_DEVICES=0 \
+ swift eval \
+ --model Qwen/Qwen2.5-0.5B-Instruct \
+ --eval_backend Native \
+ --infer_backend pt \
+ --eval_dataset general_mcq \
+ --dataset_args '{"general_mcq": {"local_path": "/path/to/mcq", "subset_list": ["example"]}}'
+ ```
+ Where:
+ - `eval_dataset` must be set to `general_mcq`
+ - `dataset_args` must be set, including:
+ - `local_path`: the path to the custom dataset folder
+ - `subset_list`: the names of the evaluation subsets, i.e. the `*` in `*_dev.csv` above
+
+ **Results**
+
+ ```text
+ +---------------------+-------------+-----------------+----------+-------+---------+---------+
+ | Model | Dataset | Metric | Subset | Num | Score | Cat.0 |
+ +=====================+=============+=================+==========+=======+=========+=========+
+ | Qwen2-0.5B-Instruct | general_mcq | AverageAccuracy | example | 12 | 0.5833 | default |
+ +---------------------+-------------+-----------------+----------+-------+---------+---------+
+ ```
+
+ ### Question-Answering Format (QA)
+ Suitable for question-answering scenarios; the evaluation metrics are `ROUGE` and `BLEU`.
+
+ **Data Preparation**
+
+ Prepare a JSON Lines file in the QA format; the directory contains a single file:
+
+ ```text
+ qa/
+ └── example.jsonl
+ ```
+
+ The JSON Lines file must follow the format below:
+
+ ```json
+ {"query": "What is the capital of China?", "response": "The capital of China is Beijing"}
+ {"query": "What is the highest mountain in the world?", "response": "Mount Everest"}
+ {"query": "Why are there no penguins at the North Pole?", "response": "Because penguins mostly live in Antarctica"}
+ ```
+
+ **Running the Evaluation**
+
+ Run the following command:
+
+ ```bash
+ CUDA_VISIBLE_DEVICES=0 \
+ swift eval \
+ --model Qwen/Qwen2.5-0.5B-Instruct \
+ --eval_backend Native \
+ --infer_backend pt \
+ --eval_dataset general_qa \
+ --dataset_args '{"general_qa": {"local_path": "/path/to/qa", "subset_list": ["example"]}}'
+ ```
+
+ Where:
+ - `eval_dataset` must be set to `general_qa`
+ - `dataset_args` is a JSON string that must set:
+ - `local_path`: the path to the custom dataset folder
+ - `subset_list`: the names of the evaluation subsets, i.e. the `*` in `*.jsonl` above
+
+ **Results**
+
+ ```text
+ +---------------------+-------------+-----------------+----------+-------+---------+---------+
+ | Model | Dataset | Metric | Subset | Num | Score | Cat.0 |
+ +=====================+=============+=================+==========+=======+=========+=========+
+ | Qwen2-0.5B-Instruct | general_qa | bleu-1 | default | 12 | 0.2324 | default |
+ +---------------------+-------------+-----------------+----------+-------+---------+---------+
+ | Qwen2-0.5B-Instruct | general_qa | bleu-2 | default | 12 | 0.1451 | default |
+ +---------------------+-------------+-----------------+----------+-------+---------+---------+
+ | Qwen2-0.5B-Instruct | general_qa | bleu-3 | default | 12 | 0.0625 | default |
+ +---------------------+-------------+-----------------+----------+-------+---------+---------+
+ | Qwen2-0.5B-Instruct | general_qa | bleu-4 | default | 12 | 0.0556 | default |
+ +---------------------+-------------+-----------------+----------+-------+---------+---------+
+ | Qwen2-0.5B-Instruct | general_qa | rouge-1-f | default | 12 | 0.3441 | default |
+ +---------------------+-------------+-----------------+----------+-------+---------+---------+
+ | Qwen2-0.5B-Instruct | general_qa | rouge-1-p | default | 12 | 0.2393 | default |
+ +---------------------+-------------+-----------------+----------+-------+---------+---------+
+ | Qwen2-0.5B-Instruct | general_qa | rouge-1-r | default | 12 | 0.8889 | default |
+ +---------------------+-------------+-----------------+----------+-------+---------+---------+
+ | Qwen2-0.5B-Instruct | general_qa | rouge-2-f | default | 12 | 0.2062 | default |
+ +---------------------+-------------+-----------------+----------+-------+---------+---------+
+ | Qwen2-0.5B-Instruct | general_qa | rouge-2-p | default | 12 | 0.1453 | default |
+ +---------------------+-------------+-----------------+----------+-------+---------+---------+
+ | Qwen2-0.5B-Instruct | general_qa | rouge-2-r | default | 12 | 0.6167 | default |
+ +---------------------+-------------+-----------------+----------+-------+---------+---------+
+ | Qwen2-0.5B-Instruct | general_qa | rouge-l-f | default | 12 | 0.333 | default |
+ +---------------------+-------------+-----------------+----------+-------+---------+---------+
+ | Qwen2-0.5B-Instruct | general_qa | rouge-l-p | default | 12 | 0.2324 | default |
+ +---------------------+-------------+-----------------+----------+-------+---------+---------+
+ | Qwen2-0.5B-Instruct | general_qa | rouge-l-r | default | 12 | 0.8889 | default |
+ +---------------------+-------------+-----------------+----------+-------+---------+---------+
+ ```
docs/source_en/BestPractices/Rapidly-Training-VL-model.md ADDED
@@ -0,0 +1,228 @@
1
+ # Best Practices for Rapidly Training Vision-Language (VL) Models
2
+
3
+ This document provides best practices for quickly training vision-language (VL) models from scratch.
4
+
5
+ Model Links
6
+ - [Qwen2.5-VL-7B-Instruct](https://www.modelscope.cn/models/Qwen/Qwen2.5-VL-7B-Instruct)
7
+ - [Qwen3-8B](https://www.modelscope.cn/models/Qwen/Qwen3-8B)
8
+
9
+ Trained Model Link
10
+ - [Simple-VL-8B](https://www.modelscope.cn/models/swift/Simple-VL-8B/summary)
11
+
12
+
13
+ The training workflow builds upon the Qwen2.5-VL-7B-Instruct model architecture by replacing its internal large language model (LLM) component with the weights from Qwen3-8B, thereby enhancing the model's visual understanding capabilities. The process involves the following steps:
14
+
15
+ 1. Modify the original model’s configuration file config.json to align with Qwen3-8B.
16
+ 2. Initialize and load new model weights, saving them as a new model.
17
+ 3. Fine-tune the new model in two stages:
18
+ 1. Stage 1: Train only the vision-to-language alignment module (aligner), freezing the ViT and LLM components.
19
+ 2. Stage 2: Unfreeze all modules and perform joint fine-tuning to improve overall performance.
20
+
21
+
22
+ ## Model Modification
23
+
24
+ ### Config File (config.json) Update
25
+ Due to structural differences between Qwen2.5-7B-Instruct and Qwen3-8B (e.g., number of layers, hidden dimensions), create a new config.json based on the Qwen2.5-VL-7B-Instruct config and update the following parameters to match Qwen3-8B:
26
+
27
+
28
+ ```
29
+ Modified Parameters
30
+ 1. hidden_size 3584->4096
31
+ 2. intermediate_size: 18944->12288
32
+ 3. num_attention_heads: 28->32
33
+ 4. num_key_value_heads: 4->8
34
+ 5. num_hidden_layers: 28->32
35
+ 6. vocab_size:152064->151936
36
+ 7. max_window_layers:28->36
37
+
38
+ Newly Added Parameter
39
+ 1. head_dim: 128
40
+ ```
41
+
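+ The same changes can be applied programmatically instead of editing config.json by hand. The sketch below is illustrative only: it assumes the text-model fields live at the top level of the Qwen2.5-VL config (on newer transformers versions they may sit under `text_config`), and `/path/to/new_config_dir` is the placeholder directory referenced by the weight-replacement script below.
+
+ ```python
+ # Illustrative sketch: apply the parameter changes listed above to a copy of the
+ # Qwen2.5-VL-7B-Instruct config and save it as the new config directory.
+ from transformers import AutoConfig
+
+ config = AutoConfig.from_pretrained("Qwen/Qwen2.5-VL-7B-Instruct")
+ overrides = {
+     "hidden_size": 4096,
+     "intermediate_size": 12288,
+     "num_attention_heads": 32,
+     "num_key_value_heads": 8,
+     "num_hidden_layers": 32,
+     "vocab_size": 151936,
+     "max_window_layers": 36,
+     "head_dim": 128,  # newly added parameter
+ }
+ for key, value in overrides.items():
+     setattr(config, key, value)  # adjust to config.text_config if the fields are nested
+ config.save_pretrained("/path/to/new_config_dir")
+ ```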
42
+ ### Model Weight Initialization and Replacement
43
+ Use the following Python script to initialize, replace, and save the model weights:
44
+ ```python
45
+ import torch
46
+ from modelscope import Qwen2_5_VLForConditionalGeneration, AutoModelForCausalLM, AutoConfig
47
+ from transformers.models.qwen2_5_vl.modeling_qwen2_5_vl import Qwen2_5_VLPatchMerger, Qwen2_5_VLModel
48
+ from accelerate import Accelerator
49
+
50
+ # Load original VL model and Qwen3-8B model
51
+ qwen2_5_vl_7b_model = Qwen2_5_VLForConditionalGeneration.from_pretrained(
52
+ "Qwen/Qwen2.5-VL-7B-Instruct",
53
+ device_map="cuda",
54
+ torch_dtype=torch.bfloat16
55
+ )
56
+ device = qwen2_5_vl_7b_model.device
57
+
58
+ qwen3_8b_model = AutoModelForCausalLM.from_pretrained(
59
+ "Qwen/Qwen3-8B",
60
+ device_map=device,
61
+ torch_dtype=torch.bfloat16
62
+ )
63
+
64
+ # Load configurations
65
+ old_config = AutoConfig.from_pretrained("Qwen/Qwen2.5-VL-7B-Instruct")
66
+ new_config = AutoConfig.from_pretrained("/path/to/new_config_dir") # Path to new config directory
67
+
68
+ # Replace merger (aligner) layer
69
+ new_merger = Qwen2_5_VLPatchMerger(
70
+ dim=new_visual_config.out_hidden_size,
71
+ context_dim=new_visual_config.hidden_size,
72
+ spatial_merge_size=new_visual_config.spatial_merge_size,
73
+ ).to(device).to(torch.bfloat16)
74
+ qwen2_5_vl_7b_model.visual.merger = new_merger
75
+
76
+ # Replace LLM part of the VL model
77
+ new_llm_model = Qwen2_5_VLModel(new_config).to(device).to(torch.bfloat16)
78
+
79
+ for name, param in qwen3_8b_model.model.named_parameters():
80
+ if name in new_llm_model.state_dict():
81
+ new_llm_model.state_dict()[name].copy_(param)
82
+
83
+ qwen2_5_vl_7b_model.model = new_llm_model
84
+ qwen2_5_vl_7b_model.lm_head = qwen3_8b_model.lm_head
85
+
86
+ # Save modified model
87
+ accelerator = Accelerator()
88
+ accelerator.save_model(
89
+ model=qwen2_5_vl_7b_model,
90
+ save_directory="/path/to/save/Qwen3-VL-Model",
91
+ max_shard_size="4GB",
92
+ safe_serialization=True
93
+ )
94
+ ```
95
+
96
+
97
+ ## Training
98
+ To simplify the process, we skip pre-training and proceed directly to supervised fine-tuning (SFT). The training is divided into two stages:
99
+
100
+ ### Stage 1: Train Aligner Layer
101
+ Train only the vision-to-language alignment module while freezing the ViT and LLM parts:
102
+ ```bash
103
+ NNODES=$WORLD_SIZE \
104
+ NODE_RANK=$RANK \
105
+ NPROC_PER_NODE=8 \
106
+ MAX_PIXELS=1003520 \
107
+ CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 \
108
+ swift sft \
109
+ --model /path/to/new_vl_model \
110
+ --model_type qwen2_5_vl \
111
+ --train_type full \
112
+ --dataset xxx \
113
+ --torch_dtype bfloat16 \
114
+ --attn_impl flash_attn \
115
+ --freeze_vit true \
116
+ --freeze_llm true \
117
+ --freeze_aligner false \
118
+ --num_train_epochs 3 \
119
+ --per_device_train_batch_size 2 \
120
+ --learning_rate 5e-6 \
121
+ --gradient_accumulation_steps 8 \
122
+ --eval_steps -1 \
123
+ --save_steps 1000 \
124
+ --save_total_limit 10 \
125
+ --logging_steps 5 \
126
+ --max_length 8192 \
127
+ --output_dir output \
128
+ --warmup_ratio 0.05 \
129
+ --dataloader_num_workers 4 \
130
+ --dataset_num_proc 8 \
131
+ --deepspeed zero2
132
+ ```
133
+
134
+ ### Stage 2: Full Model Training
135
+
136
+ Unfreeze all modules and jointly train to enhance the model's visual understanding:
137
+
138
+ ```bash
139
+ NNODES=$WORLD_SIZE \
140
+ NODE_RANK=$RANK \
141
+ NPROC_PER_NODE=8 \
142
+ MAX_PIXELS=1003520 \
143
+ CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 \
144
+ swift sft \
145
+ --model /path/to/stage1_checkpoint \
146
+ --model_type qwen2_5_vl \
147
+ --train_type full \
148
+ --dataset xxx \
149
+ --torch_dtype bfloat16 \
150
+ --attn_impl flash_attn \
151
+ --freeze_vit false \
152
+ --freeze_llm false \
153
+ --freeze_aligner false \
154
+ --num_train_epochs 3 \
155
+ --per_device_train_batch_size 2 \
156
+ --learning_rate 5e-6 \
157
+ --gradient_accumulation_steps 8 \
158
+ --eval_steps -1 \
159
+ --save_steps 1000 \
160
+ --save_total_limit 10 \
161
+ --logging_steps 5 \
162
+ --max_length 8192 \
163
+ --output_dir output \
164
+ --warmup_ratio 0.05 \
165
+ --dataloader_num_workers 4 \
166
+ --dataset_num_proc 8 \
167
+ --deepspeed zero2
168
+ ```
169
+
170
+ ## Inference / Deployment / Evaluation
171
+
172
+ ### Inference
173
+ Perform inference using `swift infer`:
174
+ ```bash
175
+ swift infer \
176
+ --model /path/to/stage2_checkpoint
177
+ ```
178
+
179
+ ### Deployment
180
+ Accelerate model serving with vLLM:
181
+ ```bash
182
+ CUDA_VISIBLE_DEVICES=0 \
183
+ MAX_PIXELS=1003520 \
184
+ VIDEO_MAX_PIXELS=50176 \
185
+ FPS_MAX_FRAMES=12 \
186
+ swift deploy \
187
+ --model /path/to/stage2_checkpoint \
188
+ --infer_backend vllm \
189
+ --gpu_memory_utilization 0.9 \
190
+ --max_model_len 8192 \
191
+ --max_new_tokens 2048 \
192
+ --limit_mm_per_prompt '{"image": 5, "video": 2}' \
193
+ --served_model_name Qwen3-VL
194
+ ```
195
+
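+ Once the server is up, it exposes an OpenAI-compatible API (the same endpoint used by the evaluation config below). A minimal request sketch, assuming the server from the command above is running locally on the default port 8000 and that the image URL is replaced with your own:
+
+ ```python
+ # Minimal sketch: query the OpenAI-compatible chat endpoint exposed by `swift deploy`.
+ import requests
+
+ payload = {
+     "model": "Qwen3-VL",  # matches --served_model_name above
+     "messages": [{
+         "role": "user",
+         "content": [
+             {"type": "image_url", "image_url": {"url": "https://example.com/sample.jpg"}},
+             {"type": "text", "text": "Describe this image."},
+         ],
+     }],
+     "max_tokens": 256,
+ }
+ resp = requests.post("http://localhost:8000/v1/chat/completions", json=payload, timeout=300)
+ print(resp.json()["choices"][0]["message"]["content"])
+ ```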
196
+ ### Evaluation
197
+ Evaluate the trained VL model using [EvalScope](https://github.com/modelscope/evalscope/).
198
+
199
+ Example Evaluation Using MMMU Benchmark
200
+ ```python
201
+ from evalscope import TaskConfig, run_task
202
+
203
+ task_cfg_dict = TaskConfig(
204
+ work_dir='outputs',
205
+ eval_backend='VLMEvalKit',
206
+ eval_config={
207
+ 'data': ['MMMU_DEV_VAL'],
208
+ 'mode': 'all',
209
+ 'model': [
210
+ {
211
+ 'api_base': 'http://localhost:8000/v1/chat/completions',
212
+ 'key': 'EMPTY',
213
+ 'name': 'CustomAPIModel',
214
+ 'temperature': 0.6,
215
+ 'type': 'Qwen3-VL',
216
+ 'img_size': -1,
217
+ 'video_llm': False,
218
+ 'max_tokens': 512,
219
+ }
220
+ ],
221
+ 'reuse': False,
222
+ 'nproc': 64,
223
+ 'judge': 'exact_matching'
224
+ },
225
+ )
226
+
227
+ run_task(task_cfg=task_cfg_dict)
228
+ ```
docs/source_en/Customization/Custom-model.md ADDED
@@ -0,0 +1,35 @@
1
+ # Custom Model
2
+
3
+ The models built into ms-swift can be used directly by specifying either `model_id` or `model_path`: `--model <model_id_or_path>`. ms-swift determines the `model_type` based on the suffix of `model_id/model_path` and the `config.json` file. Each `model_type` has a unique model structure, template, and loading method. Of course, you can also manually override these by passing `--model_type` and `--template`. You can check the supported `model_type` and templates in the [Supported Models and Datasets](../Instruction/Supported-models-and-datasets.md).
4
+
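+ For example, the automatically detected values can be overridden for a local checkpoint as follows (the `model_type`/`template` values below are placeholders; look up the real ones in the supported-models document):
+
+ ```shell
+ swift infer \
+     --model /path/to/local/checkpoint \
+     --model_type qwen2_5 \
+     --template qwen2_5
+ ```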
5
+ ## Model Registration
6
+
7
+ Custom models are typically implemented via model registration. You can refer to the [built-in models](https://github.com/modelscope/ms-swift/blob/main/swift/llm/model/model/qwen.py), the [built-in dialogue templates](https://github.com/modelscope/ms-swift/blob/main/swift/llm/template/template/qwen.py), or the example code in [examples](https://github.com/modelscope/swift/blob/main/examples/custom). You can pass `--custom_register_path xxx.py` to have the externally registered content parsed, which is convenient for users who install via pip rather than git clone.
8
+
9
+ The `register_model` function registers a model in the `MODEL_MAPPING`. You can complete the model registration by calling the function `register_model(model_meta)`, where `model_meta` will store the model's metadata. The parameter list for ModelMeta is as follows:
10
+
11
+ - model_type: Required. The model type, which is also the unique ID.
12
+ - model_groups: Required. Lists the ModelScope/HuggingFace model IDs and local paths. Running the [run_model_info.py](https://github.com/modelscope/ms-swift/blob/main/scripts/utils/run_model_info.py) file will automatically generate the [supported models documentation](https://swift.readthedocs.io/en/latest/Instruction/Supported-models-and-datasets.html) and automatically match the model_type based on the `--model` suffix.
13
+ - template: Required. The default template type when `--template` is not specified.
14
+ - get_function: Required. The loading function for the model and the tokenizer/processor (processor for multi-modal models). For LLMs this is typically set to `get_model_tokenizer_with_flash_attn`.
15
+ - model_arch: The model architecture. Defaults to None. Multi-modal model training requires setting this parameter to determine the prefix for llm/vit/aligner.
16
+ - architectures: The architectures item in config.json, used to automatically match the model with its model_type. Defaults to `[]`.
17
+ - additional_saved_files: Files that need to be additionally saved during full parameter training and merge-lora. Defaults to `[]`.
18
+ - torch_dtype: The default dtype when `torch_dtype` is not passed during model loading. Defaults to None, read from config.json.
19
+ - is_multimodal: Indicates whether the model is multi-modal. Defaults to False.
20
+ - ignore_patterns: File patterns to be ignored when downloading from the hub. Defaults to `[]`.
21
+
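+ Putting the required fields together, a minimal registration sketch might look like the following (illustrative only; the exact import paths and helper classes such as `ModelGroup`/`Model` should be checked against the built-in qwen.py linked above):
+
+ ```python
+ from swift.llm import (Model, ModelGroup, ModelMeta, register_model,
+                        get_model_tokenizer_with_flash_attn)
+
+ register_model(
+     ModelMeta(
+         model_type='my_custom_llm',            # unique ID
+         model_groups=[ModelGroup([Model('my-org/my-model')])],  # placeholder ModelScope model ID
+         template='my_custom_template',         # default template type
+         get_function=get_model_tokenizer_with_flash_attn,
+         architectures=['MyModelForCausalLM'],  # matched against config.json
+     )
+ )
+ ```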
22
+ The `register_template` function registers a dialogue template in `TEMPLATE_MAPPING`. To complete the registration of the dialogue template, simply call the function `register_template(template_meta)`, where `template_meta` will store the metadata of the template. The parameter list for TemplateMeta is as follows:
23
+
24
+ - template_type: Required. The type of dialogue template, which also serves as a unique ID.
25
+ - prefix: Required. The prefix of the dialogue template, usually encompassing parts like system, bos_token, and is generated independently of multi-turn dialogue loops. For example, the prefix for qwen is `[]`.
26
+ - prompt: Required. Represents the dialogue portion before `{{RESPONSE}}`. We use `{{QUERY}}` as a placeholder for the user's inquiry part. For example, the prompt for qwen is `['<|im_start|>user\n{{QUERY}}<|im_end|>\n<|im_start|>assistant\n']`.
27
+ - chat_sep: Required. The separator for each turn in multi-turn dialogues. If set to None, the template does not support multi-turn dialogue. For example, the chat_sep for qwen is `['<|im_end|>\n']`.
28
+ - suffix: Defaults to `[['eos_token_id']]`. The suffix part of the dialogue template, generated independently of multi-turn dialogue loops, usually the eos_token. For example, the suffix for qwen is `['<|im_end|>']`.
29
+ - template_cls: Defaults to `Template`. Customization is generally required when defining templates for multimodal models, particularly in customizing the `_encode`, `_post_encode`, and `_data_collator` functions.
30
+ - system_prefix: Defaults to None. The prefix for dialogue templates with a system. We use `{{SYSTEM}}` as a placeholder for the system. For example, the system_prefix for qwen is `['<|im_start|>system\n{{SYSTEM}}<|im_end|>\n']`.
31
+ - Note: If the system is empty and `prefix` can be replaced by `system_prefix`, you can write `prefix` as a prefix including the system without setting `system_prefix`.
32
+ - If the prefix does not include `{{SYSTEM}}` and system_prefix is not set, the template does not support the system.
33
+ - default_system: Defaults to None. The default system used when `--system` is not provided. For example, the default_system for qwen is `'You are a helpful assistant.'`.
34
+ - stop_words: Defaults to `[]`. Additional stop words besides eos_token and `suffix[-1]`. For example, the stop_words for qwen is `['<|endoftext|>']`.
35
+ - Note: During inference, the output response will be filtered by eos_token and `suffix[-1]`, but additional stop_words will be retained.
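+
+ A matching template registration, built from the qwen values quoted above (illustrative; check the built-in template/qwen.py for the exact imports):
+
+ ```python
+ from swift.llm import TemplateMeta, register_template
+
+ register_template(
+     TemplateMeta(
+         template_type='my_custom_template',
+         prefix=[],
+         prompt=['<|im_start|>user\n{{QUERY}}<|im_end|>\n<|im_start|>assistant\n'],
+         chat_sep=['<|im_end|>\n'],
+         suffix=['<|im_end|>'],
+         system_prefix=['<|im_start|>system\n{{SYSTEM}}<|im_end|>\n'],
+         default_system='You are a helpful assistant.',
+     )
+ )
+ ```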
docs/transformers/.github/ISSUE_TEMPLATE/bug-report.yml ADDED
@@ -0,0 +1,134 @@
1
+ name: "\U0001F41B Bug Report"
2
+ description: Submit a bug report to help us improve transformers
3
+ labels: [ "bug" ]
4
+ body:
5
+ - type: markdown
6
+ attributes:
7
+ value: |
8
+ Thanks for taking the time to fill out this bug report! 🤗
9
+
10
+ Before you submit your bug report:
11
+
12
+ - If it is your first time submitting, be sure to check our [bug report guidelines](https://github.com/huggingface/transformers/blob/main/CONTRIBUTING.md#did-you-find-a-bug)
13
+ - Try our [docs bot](https://huggingface.co/spaces/huggingchat/hf-docs-chat) -- it might be able to help you with your issue
14
+
15
+ - type: textarea
16
+ id: system-info
17
+ attributes:
18
+ label: System Info
19
+ description: Please share your system info with us. You can run the command `transformers-cli env` and copy-paste its output below.
20
+ placeholder: transformers version, platform, python version, ...
21
+ validations:
22
+ required: true
23
+
24
+ - type: textarea
25
+ id: who-can-help
26
+ attributes:
27
+ label: Who can help?
28
+ description: |
29
+ Your issue will be replied to more quickly if you can figure out the right person to tag with @
30
+ If you know how to use git blame, that is the easiest way, otherwise, here is a rough guide of **who to tag**.
31
+
32
+ All issues are read by one of the core maintainers, so if you don't know who to tag, just leave this blank and
33
+ a core maintainer will ping the right person.
34
+
35
+ Please tag fewer than 3 people.
36
+
37
+ Models:
38
+
39
+ - text models: @ArthurZucker
40
+ - vision models: @amyeroberts, @qubvel
41
+ - speech models: @eustlb
42
+ - graph models: @clefourrier
43
+
44
+ Library:
45
+
46
+ - flax: @gante and @Rocketknight1
47
+ - generate: @zucchini-nlp (visual-language models) or @gante (all others)
48
+ - pipelines: @Rocketknight1
49
+ - tensorflow: @gante and @Rocketknight1
50
+ - tokenizers: @ArthurZucker and @itazap
51
+ - trainer: @zach-huggingface @SunMarc
52
+
53
+ Integrations:
54
+
55
+ - deepspeed: HF Trainer/Accelerate: @SunMarc @zach-huggingface
56
+ - ray/raytune: @richardliaw, @amogkam
57
+ - Big Model Inference: @SunMarc
58
+ - quantization (bitsandbytes, autogpt): @SunMarc @MekkCyber
59
+
60
+ Devices/Backends:
61
+
62
+ - AMD ROCm: @ivarflakstad
63
+ - Intel XPU: @IlyasMoutawwakil
64
+ - Ascend NPU: @ivarflakstad
65
+
66
+ Documentation: @stevhliu
67
+
68
+ Model hub:
69
+
70
+ - for issues with a model, report at https://discuss.huggingface.co/ and tag the model's creator.
71
+
72
+ HF projects:
73
+
74
+ - accelerate: [different repo](https://github.com/huggingface/accelerate)
75
+ - datasets: [different repo](https://github.com/huggingface/datasets)
76
+ - diffusers: [different repo](https://github.com/huggingface/diffusers)
77
+ - rust tokenizers: [different repo](https://github.com/huggingface/tokenizers)
78
+
79
+ Maintained examples (not research project or legacy):
80
+
81
+ - Flax: @Rocketknight1
82
+ - PyTorch: See Models above and tag the person corresponding to the modality of the example.
83
+ - TensorFlow: @Rocketknight1
84
+
85
+ Research projects are not maintained and should be taken as is.
86
+
87
+ placeholder: "@Username ..."
88
+
89
+ - type: checkboxes
90
+ id: information-scripts-examples
91
+ attributes:
92
+ label: Information
93
+ description: 'The problem arises when using:'
94
+ options:
95
+ - label: "The official example scripts"
96
+ - label: "My own modified scripts"
97
+
98
+ - type: checkboxes
99
+ id: information-tasks
100
+ attributes:
101
+ label: Tasks
102
+ description: "The tasks I am working on are:"
103
+ options:
104
+ - label: "An officially supported task in the `examples` folder (such as GLUE/SQuAD, ...)"
105
+ - label: "My own task or dataset (give details below)"
106
+
107
+ - type: textarea
108
+ id: reproduction
109
+ validations:
110
+ required: true
111
+ attributes:
112
+ label: Reproduction
113
+ description: |
114
+ Please provide a code sample that reproduces the problem you ran into. It can be a Colab link or just a code snippet.
115
+ Please include relevant config information with your code, for example your Trainers, TRL, Peft, and DeepSpeed configs.
116
+ If you have code snippets, error messages, stack traces please provide them here as well.
117
+ Important! Use code tags to correctly format your code. See https://help.github.com/en/github/writing-on-github/creating-and-highlighting-code-blocks#syntax-highlighting
118
+ Do not use screenshots, as they are hard to read and (more importantly) don't allow others to copy-and-paste your code.
119
+
120
+ placeholder: |
121
+ Steps to reproduce the behavior:
122
+
123
+ 1.
124
+ 2.
125
+ 3.
126
+
127
+
128
+ - type: textarea
129
+ id: expected-behavior
130
+ validations:
131
+ required: true
132
+ attributes:
133
+ label: Expected behavior
134
+ description: "A clear and concise description of what you would expect to happen."
docs/transformers/.github/ISSUE_TEMPLATE/config.yml ADDED
@@ -0,0 +1,12 @@
1
+ blank_issues_enabled: true
2
+ version: 2.1
3
+ contact_links:
4
+ - name: Model checkpoints on the Hugging Face Hub
5
+ url: https://huggingface.co/models
6
+ about: Open a Pull request / Discussion related to a specific model checkpoint directly on the Hugging Face Hub
7
+ - name: Website Related
8
+ url: https://github.com/huggingface/hub-docs/issues
9
+ about: Feature requests and bug reports related to the website
10
+ - name: Forum
11
+ url: https://discuss.huggingface.co/
12
+ about: General usage questions and community discussions
docs/transformers/.github/ISSUE_TEMPLATE/migration.yml ADDED
@@ -0,0 +1,72 @@
1
+ name: "\U0001F4DA Migration from pytorch-pretrained-bert or pytorch-transformers"
2
+ description: Report a problem when migrating from pytorch-pretrained-bert or pytorch-transformers to transformers
3
+ labels: [ "migration" ]
4
+ body:
5
+ - type: textarea
6
+ id: system-info
7
+ attributes:
8
+ label: System Info
9
+ description: Please share your system info with us. You can run the command `transformers-cli env` and copy-paste its output below.
10
+ render: shell
11
+ placeholder: transformers version, platform, python version, ...
12
+ validations:
13
+ required: true
14
+
15
+ - type: checkboxes
16
+ id: information-scripts-examples
17
+ attributes:
18
+ label: Information
19
+ description: 'The problem arises when using:'
20
+ options:
21
+ - label: "The official example scripts"
22
+ - label: "My own modified scripts"
23
+
24
+ - type: checkboxes
25
+ id: information-tasks
26
+ attributes:
27
+ label: Tasks
28
+ description: "The tasks I am working on are:"
29
+ options:
30
+ - label: "An officially supported task in the `examples` folder (such as GLUE/SQuAD, ...)"
31
+ - label: "My own task or dataset (give details below)"
32
+
33
+ - type: textarea
34
+ id: reproduction
35
+ validations:
36
+ required: true
37
+ attributes:
38
+ label: Reproduction
39
+ description: |
40
+ Please provide a code sample that reproduces the problem you ran into. It can be a Colab link or just a code snippet.
41
+ If you have code snippets, error messages, stack traces please provide them here as well.
42
+ Important! Use code tags to correctly format your code. See https://help.github.com/en/github/writing-on-github/creating-and-highlighting-code-blocks#syntax-highlighting
43
+ Do not use screenshots, as they are hard to read and (more importantly) don't allow others to copy-and-paste your code.
44
+
45
+ placeholder: |
46
+ Steps to reproduce the behavior:
47
+
48
+ 1.
49
+ 2.
50
+ 3.
51
+
52
+
53
+ - type: textarea
54
+ id: expected-behavior
55
+ validations:
56
+ required: true
57
+ attributes:
58
+ label: Expected behavior
59
+ description: "A clear and concise description of what you would expect to happen."
60
+ render: shell
61
+
62
+ - type: checkboxes
63
+ id: checklist
64
+ attributes:
65
+ label: Checklist
66
+ options:
67
+ - label: "I have read the migration guide in the readme.
68
+ ([pytorch-transformers](https://github.com/huggingface/transformers#migrating-from-pytorch-transformers-to-transformers);
69
+ [pytorch-pretrained-bert](https://github.com/huggingface/transformers#migrating-from-pytorch-pretrained-bert-to-transformers))"
70
+ required: true
71
+ - label: "I checked if a related official extension example runs on my machine."
72
+ required: true
docs/transformers/.github/conda/build.sh ADDED
@@ -0,0 +1 @@
1
+ $PYTHON setup.py install # Python command to install the script.
docs/transformers/.github/conda/meta.yaml ADDED
@@ -0,0 +1,56 @@
1
+ {% set name = "transformers" %}
2
+
3
+ package:
4
+ name: "{{ name|lower }}"
5
+ version: "{{ TRANSFORMERS_VERSION }}"
6
+
7
+ source:
8
+ path: ../../
9
+
10
+ build:
11
+ noarch: python
12
+
13
+ requirements:
14
+ host:
15
+ - python
16
+ - pip
17
+ - numpy >=1.17
18
+ - dataclasses
19
+ - huggingface_hub
20
+ - packaging
21
+ - filelock
22
+ - requests
23
+ - tqdm >=4.27
24
+ - sacremoses
25
+ - regex !=2019.12.17
26
+ - protobuf
27
+ - tokenizers >=0.11.1,!=0.11.3,<0.13
28
+ - pyyaml >=5.1
29
+ - safetensors
30
+ - fsspec
31
+ run:
32
+ - python
33
+ - numpy >=1.17
34
+ - dataclasses
35
+ - huggingface_hub
36
+ - packaging
37
+ - filelock
38
+ - requests
39
+ - tqdm >=4.27
40
+ - sacremoses
41
+ - regex !=2019.12.17
42
+ - protobuf
43
+ - tokenizers >=0.11.1,!=0.11.3,<0.13
44
+ - pyyaml >=5.1
45
+ - safetensors
46
+ - fsspec
47
+
48
+ test:
49
+ imports:
50
+ - transformers
51
+
52
+ about:
53
+ home: https://huggingface.co
54
+ license: Apache License 2.0
55
+ license_file: LICENSE
56
+ summary: "🤗Transformers: State-of-the-art Natural Language Processing for Pytorch and TensorFlow 2.0."
docs/transformers/.github/workflows/assign-reviewers.yml ADDED
@@ -0,0 +1,26 @@
1
+ name: Assign PR Reviewers
2
+ on:
3
+ pull_request_target:
4
+ branches:
5
+ - main
6
+ types: [ready_for_review]
7
+
8
+ jobs:
9
+ assign_reviewers:
10
+ permissions:
11
+ pull-requests: write
12
+ runs-on: ubuntu-22.04
13
+ steps:
14
+ - uses: actions/checkout@v4
15
+ - name: Set up Python
16
+ uses: actions/setup-python@v5
17
+ with:
18
+ python-version: '3.13'
19
+ - name: Install dependencies
20
+ run: |
21
+ python -m pip install --upgrade pip
22
+ pip install PyGithub
23
+ - name: Run assignment script
24
+ env:
25
+ GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
26
+ run: python .github/scripts/assign_reviewers.py
docs/transformers/.github/workflows/build-docker-images.yml ADDED
@@ -0,0 +1,393 @@
1
+ name: Build docker images (scheduled)
2
+
3
+ on:
4
+ push:
5
+ branches:
6
+ - build_ci_docker_image*
7
+ repository_dispatch:
8
+ workflow_call:
9
+ inputs:
10
+ image_postfix:
11
+ required: true
12
+ type: string
13
+ schedule:
14
+ - cron: "17 0 * * *"
15
+
16
+ concurrency:
17
+ group: docker-images-builds
18
+ cancel-in-progress: false
19
+
20
+ jobs:
21
+ latest-docker:
22
+ name: "Latest PyTorch + TensorFlow [dev]"
23
+ runs-on:
24
+ group: aws-general-8-plus
25
+ steps:
26
+ -
27
+ name: Set up Docker Buildx
28
+ uses: docker/setup-buildx-action@v3
29
+ -
30
+ name: Check out code
31
+ uses: actions/checkout@v4
32
+ -
33
+ name: Login to DockerHub
34
+ uses: docker/login-action@v3
35
+ with:
36
+ username: ${{ secrets.DOCKERHUB_USERNAME }}
37
+ password: ${{ secrets.DOCKERHUB_PASSWORD }}
38
+ -
39
+ name: Build and push
40
+ uses: docker/build-push-action@v5
41
+ with:
42
+ context: ./docker/transformers-all-latest-gpu
43
+ build-args: |
44
+ REF=main
45
+ push: true
46
+ tags: huggingface/transformers-all-latest-gpu${{ inputs.image_postfix }}
47
+ # Push CI images still need to be re-built daily
48
+ -
49
+ name: Build and push (for Push CI) in a daily basis
50
+ # This condition allows `schedule` events, or `push` events that trigger this workflow NOT via `workflow_call`.
51
+ # The later case is useful for manual image building for debugging purpose. Use another tag in this case!
52
+ if: inputs.image_postfix != '-push-ci'
53
+ uses: docker/build-push-action@v5
54
+ with:
55
+ context: ./docker/transformers-all-latest-gpu
56
+ build-args: |
57
+ REF=main
58
+ push: true
59
+ tags: huggingface/transformers-all-latest-gpu-push-ci
60
+
61
+ - name: Post to Slack
62
+ if: always()
63
+ uses: huggingface/hf-workflows/.github/actions/post-slack@main
64
+ with:
65
+ slack_channel: ${{ secrets.CI_SLACK_CHANNEL_DOCKER }}
66
+ title: 🤗 Results of the transformers-all-latest-gpu-push-ci docker build
67
+ status: ${{ job.status }}
68
+ slack_token: ${{ secrets.SLACK_CIFEEDBACK_BOT_TOKEN }}
69
+
70
+ latest-torch-deepspeed-docker:
71
+ name: "Latest PyTorch + DeepSpeed"
72
+ runs-on:
73
+ group: aws-g4dn-2xlarge-cache
74
+ steps:
75
+ -
76
+ name: Set up Docker Buildx
77
+ uses: docker/setup-buildx-action@v3
78
+ -
79
+ name: Check out code
80
+ uses: actions/checkout@v4
81
+ -
82
+ name: Login to DockerHub
83
+ uses: docker/login-action@v3
84
+ with:
85
+ username: ${{ secrets.DOCKERHUB_USERNAME }}
86
+ password: ${{ secrets.DOCKERHUB_PASSWORD }}
87
+ -
88
+ name: Build and push
89
+ uses: docker/build-push-action@v5
90
+ with:
91
+ context: ./docker/transformers-pytorch-deepspeed-latest-gpu
92
+ build-args: |
93
+ REF=main
94
+ push: true
95
+ tags: huggingface/transformers-pytorch-deepspeed-latest-gpu${{ inputs.image_postfix }}
96
+
97
+ - name: Post to Slack
98
+ if: always()
99
+ uses: huggingface/hf-workflows/.github/actions/post-slack@main
100
+ with:
101
+ slack_channel: ${{ secrets.CI_SLACK_CHANNEL_DOCKER}}
102
+ title: 🤗 Results of the transformers-pytorch-deepspeed-latest-gpu docker build
103
+ status: ${{ job.status }}
104
+ slack_token: ${{ secrets.SLACK_CIFEEDBACK_BOT_TOKEN }}
105
+
106
+ # Can't build 2 images in a single job `latest-torch-deepspeed-docker` (for `nvcr.io/nvidia`)
107
+ latest-torch-deepspeed-docker-for-push-ci-daily-build:
108
+ name: "Latest PyTorch + DeepSpeed (Push CI - Daily Build)"
109
+ runs-on:
110
+ group: aws-general-8-plus
111
+ steps:
112
+ -
113
+ name: Set up Docker Buildx
114
+ uses: docker/setup-buildx-action@v3
115
+ -
116
+ name: Check out code
117
+ uses: actions/checkout@v4
118
+ -
119
+ name: Login to DockerHub
120
+ uses: docker/login-action@v3
121
+ with:
122
+ username: ${{ secrets.DOCKERHUB_USERNAME }}
123
+ password: ${{ secrets.DOCKERHUB_PASSWORD }}
124
+ # Push CI images still need to be re-built daily
125
+ -
126
+ name: Build and push (for Push CI) in a daily basis
127
+ # This condition allows `schedule` events, or `push` events that trigger this workflow NOT via `workflow_call`.
128
+ # The later case is useful for manual image building for debugging purpose. Use another tag in this case!
129
+ if: inputs.image_postfix != '-push-ci'
130
+ uses: docker/build-push-action@v5
131
+ with:
132
+ context: ./docker/transformers-pytorch-deepspeed-latest-gpu
133
+ build-args: |
134
+ REF=main
135
+ push: true
136
+ tags: huggingface/transformers-pytorch-deepspeed-latest-gpu-push-ci
137
+
138
+ - name: Post to Slack
139
+ if: always()
140
+ uses: huggingface/hf-workflows/.github/actions/post-slack@main
141
+ with:
142
+ slack_channel: ${{ secrets.CI_SLACK_CHANNEL_DOCKER }}
143
+ title: 🤗 Results of the transformers-pytorch-deepspeed-latest-gpu-push-ci docker build
144
+ status: ${{ job.status }}
145
+ slack_token: ${{ secrets.SLACK_CIFEEDBACK_BOT_TOKEN }}
146
+
147
+ doc-builder:
148
+ name: "Doc builder"
149
+ # Push CI doesn't need this image
150
+ if: inputs.image_postfix != '-push-ci'
151
+ runs-on:
152
+ group: aws-general-8-plus
153
+ steps:
154
+ -
155
+ name: Set up Docker Buildx
156
+ uses: docker/setup-buildx-action@v3
157
+ -
158
+ name: Check out code
159
+ uses: actions/checkout@v4
160
+ -
161
+ name: Login to DockerHub
162
+ uses: docker/login-action@v3
163
+ with:
164
+ username: ${{ secrets.DOCKERHUB_USERNAME }}
165
+ password: ${{ secrets.DOCKERHUB_PASSWORD }}
166
+ -
167
+ name: Build and push
168
+ uses: docker/build-push-action@v5
169
+ with:
170
+ context: ./docker/transformers-doc-builder
171
+ push: true
172
+ tags: huggingface/transformers-doc-builder
173
+
174
+ - name: Post to Slack
175
+ if: always()
176
+ uses: huggingface/hf-workflows/.github/actions/post-slack@main
177
+ with:
178
+ slack_channel: ${{ secrets.CI_SLACK_CHANNEL_DOCKER }}
179
+ title: 🤗 Results of the huggingface/transformers-doc-builder docker build
180
+ status: ${{ job.status }}
181
+ slack_token: ${{ secrets.SLACK_CIFEEDBACK_BOT_TOKEN }}
182
+
183
+ latest-pytorch:
184
+ name: "Latest PyTorch [dev]"
185
+ # Push CI doesn't need this image
186
+ if: inputs.image_postfix != '-push-ci'
187
+ runs-on:
188
+ group: aws-general-8-plus
189
+ steps:
190
+ -
191
+ name: Set up Docker Buildx
192
+ uses: docker/setup-buildx-action@v3
193
+ -
194
+ name: Check out code
195
+ uses: actions/checkout@v4
196
+ -
197
+ name: Login to DockerHub
198
+ uses: docker/login-action@v3
199
+ with:
200
+ username: ${{ secrets.DOCKERHUB_USERNAME }}
201
+ password: ${{ secrets.DOCKERHUB_PASSWORD }}
202
+ -
203
+ name: Build and push
204
+ uses: docker/build-push-action@v5
205
+ with:
206
+ context: ./docker/transformers-pytorch-gpu
207
+ build-args: |
208
+ REF=main
209
+ push: true
210
+ tags: huggingface/transformers-pytorch-gpu
211
+
212
+ - name: Post to Slack
213
+ if: always()
214
+ uses: huggingface/hf-workflows/.github/actions/post-slack@main
215
+ with:
216
+ slack_channel: ${{ secrets.CI_SLACK_CHANNEL_DOCKER }}
217
+ title: 🤗 Results of the huggingface/transformers-pytorch-gpu docker build
218
+ status: ${{ job.status }}
219
+ slack_token: ${{ secrets.SLACK_CIFEEDBACK_BOT_TOKEN }}
220
+
221
+ latest-pytorch-amd:
222
+ name: "Latest PyTorch (AMD) [dev]"
223
+ runs-on:
224
+ group: aws-general-8-plus
225
+ steps:
226
+ -
227
+ name: Set up Docker Buildx
228
+ uses: docker/setup-buildx-action@v3
229
+ -
230
+ name: Check out code
231
+ uses: actions/checkout@v4
232
+ -
233
+ name: Login to DockerHub
234
+ uses: docker/login-action@v3
235
+ with:
236
+ username: ${{ secrets.DOCKERHUB_USERNAME }}
237
+ password: ${{ secrets.DOCKERHUB_PASSWORD }}
238
+ -
239
+ name: Build and push
240
+ uses: docker/build-push-action@v5
241
+ with:
242
+ context: ./docker/transformers-pytorch-amd-gpu
243
+ build-args: |
244
+ REF=main
245
+ push: true
246
+ tags: huggingface/transformers-pytorch-amd-gpu${{ inputs.image_postfix }}
247
+ # Push CI images still need to be re-built daily
248
+ -
249
+ name: Build and push (for Push CI) in a daily basis
250
+ # This condition allows `schedule` events, or `push` events that trigger this workflow NOT via `workflow_call`.
251
+ # The later case is useful for manual image building for debugging purpose. Use another tag in this case!
252
+ if: inputs.image_postfix != '-push-ci'
253
+ uses: docker/build-push-action@v5
254
+ with:
255
+ context: ./docker/transformers-pytorch-amd-gpu
256
+ build-args: |
257
+ REF=main
258
+ push: true
259
+ tags: huggingface/transformers-pytorch-amd-gpu-push-ci
260
+
261
+ - name: Post to Slack
262
+ if: always()
263
+ uses: huggingface/hf-workflows/.github/actions/post-slack@main
264
+ with:
265
+ slack_channel: ${{ secrets.CI_SLACK_CHANNEL_DOCKER }}
266
+ title: 🤗 Results of the huggingface/transformers-pytorch-amd-gpu-push-ci build
267
+ status: ${{ job.status }}
268
+ slack_token: ${{ secrets.SLACK_CIFEEDBACK_BOT_TOKEN }}
269
+
270
+ latest-tensorflow:
271
+ name: "Latest TensorFlow [dev]"
272
+ # Push CI doesn't need this image
273
+ if: inputs.image_postfix != '-push-ci'
274
+ runs-on:
275
+ group: aws-general-8-plus
276
+ steps:
277
+ -
278
+ name: Set up Docker Buildx
279
+ uses: docker/setup-buildx-action@v3
280
+ -
281
+ name: Check out code
282
+ uses: actions/checkout@v4
283
+ -
284
+ name: Login to DockerHub
285
+ uses: docker/login-action@v3
286
+ with:
287
+ username: ${{ secrets.DOCKERHUB_USERNAME }}
288
+ password: ${{ secrets.DOCKERHUB_PASSWORD }}
289
+ -
290
+ name: Build and push
291
+ uses: docker/build-push-action@v5
292
+ with:
293
+ context: ./docker/transformers-tensorflow-gpu
294
+ build-args: |
295
+ REF=main
296
+ push: true
297
+ tags: huggingface/transformers-tensorflow-gpu
298
+
299
+ - name: Post to Slack
300
+ if: always()
301
+ uses: huggingface/hf-workflows/.github/actions/post-slack@main
302
+ with:
303
+ slack_channel: ${{ secrets.CI_SLACK_CHANNEL_DOCKER }}
304
+ title: 🤗 Results of the huggingface/transformers-tensorflow-gpu build
305
+ status: ${{ job.status }}
306
+ slack_token: ${{ secrets.SLACK_CIFEEDBACK_BOT_TOKEN }}
307
+
308
+ latest-pytorch-deepspeed-amd:
309
+ name: "PyTorch + DeepSpeed (AMD) [dev]"
310
+ runs-on:
311
+ group: aws-general-8-plus
312
+ steps:
313
+ -
314
+ name: Set up Docker Buildx
315
+ uses: docker/setup-buildx-action@v3
316
+ -
317
+ name: Check out code
318
+ uses: actions/checkout@v4
319
+ -
320
+ name: Login to DockerHub
321
+ uses: docker/login-action@v3
322
+ with:
323
+ username: ${{ secrets.DOCKERHUB_USERNAME }}
324
+ password: ${{ secrets.DOCKERHUB_PASSWORD }}
325
+ -
326
+ name: Build and push
327
+ uses: docker/build-push-action@v5
328
+ with:
329
+ context: ./docker/transformers-pytorch-deepspeed-amd-gpu
330
+ build-args: |
331
+ REF=main
332
+ push: true
333
+ tags: huggingface/transformers-pytorch-deepspeed-amd-gpu${{ inputs.image_postfix }}
334
+ # Push CI images still need to be re-built daily
335
+ -
336
+ name: Build and push (for Push CI) in a daily basis
337
+ # This condition allows `schedule` events, or `push` events that trigger this workflow NOT via `workflow_call`.
338
+ # The later case is useful for manual image building for debugging purpose. Use another tag in this case!
339
+ if: inputs.image_postfix != '-push-ci'
340
+ uses: docker/build-push-action@v5
341
+ with:
342
+ context: ./docker/transformers-pytorch-deepspeed-amd-gpu
343
+ build-args: |
344
+ REF=main
345
+ push: true
346
+ tags: huggingface/transformers-pytorch-deepspeed-amd-gpu-push-ci
347
+
348
+ - name: Post to Slack
349
+ if: always()
350
+ uses: huggingface/hf-workflows/.github/actions/post-slack@main
351
+ with:
352
+ slack_channel: ${{ secrets.CI_SLACK_CHANNEL_DOCKER }}
353
+ title: 🤗 Results of the transformers-pytorch-deepspeed-amd-gpu build
354
+ status: ${{ job.status }}
355
+ slack_token: ${{ secrets.SLACK_CIFEEDBACK_BOT_TOKEN }}
356
+
357
+ latest-quantization-torch-docker:
358
+ name: "Latest Pytorch + Quantization [dev]"
359
+ # Push CI doesn't need this image
360
+ if: inputs.image_postfix != '-push-ci'
361
+ runs-on:
362
+ group: aws-general-8-plus
363
+ steps:
364
+ -
365
+ name: Set up Docker Buildx
366
+ uses: docker/setup-buildx-action@v3
367
+ -
368
+ name: Check out code
369
+ uses: actions/checkout@v4
370
+ -
371
+ name: Login to DockerHub
372
+ uses: docker/login-action@v3
373
+ with:
374
+ username: ${{ secrets.DOCKERHUB_USERNAME }}
375
+ password: ${{ secrets.DOCKERHUB_PASSWORD }}
376
+ -
377
+ name: Build and push
378
+ uses: docker/build-push-action@v5
379
+ with:
380
+ context: ./docker/transformers-quantization-latest-gpu
381
+ build-args: |
382
+ REF=main
383
+ push: true
384
+ tags: huggingface/transformers-quantization-latest-gpu${{ inputs.image_postfix }}
385
+
386
+ - name: Post to Slack
387
+ if: always()
388
+ uses: huggingface/hf-workflows/.github/actions/post-slack@main
389
+ with:
390
+ slack_channel: ${{ secrets.CI_SLACK_CHANNEL_DOCKER }}
391
+ title: 🤗 Results of the transformers-quantization-latest-gpu build
392
+ status: ${{ job.status }}
393
+ slack_token: ${{ secrets.SLACK_CIFEEDBACK_BOT_TOKEN }}
docs/transformers/.github/workflows/build-nightly-ci-docker-images.yml ADDED
@@ -0,0 +1,67 @@
1
+ name: Build docker images (Nightly CI)
2
+
3
+ on:
4
+ workflow_call:
5
+ push:
6
+ branches:
7
+ - build_nightly_ci_docker_image*
8
+
9
+ concurrency:
10
+ group: docker-images-builds
11
+ cancel-in-progress: false
12
+
13
+ jobs:
14
+ latest-with-torch-nightly-docker:
15
+ name: "Nightly PyTorch + Stable TensorFlow"
16
+ runs-on:
17
+ group: aws-general-8-plus
18
+ steps:
19
+ -
20
+ name: Set up Docker Buildx
21
+ uses: docker/setup-buildx-action@v2
22
+ -
23
+ name: Check out code
24
+ uses: actions/checkout@v4
25
+ -
26
+ name: Login to DockerHub
27
+ uses: docker/login-action@v2
28
+ with:
29
+ username: ${{ secrets.DOCKERHUB_USERNAME }}
30
+ password: ${{ secrets.DOCKERHUB_PASSWORD }}
31
+ -
32
+ name: Build and push
33
+ uses: docker/build-push-action@v3
34
+ with:
35
+ context: ./docker/transformers-all-latest-gpu
36
+ build-args: |
37
+ REF=main
38
+ PYTORCH=pre
39
+ push: true
40
+ tags: huggingface/transformers-all-latest-torch-nightly-gpu
41
+
42
+ nightly-torch-deepspeed-docker:
43
+ name: "Nightly PyTorch + DeepSpeed"
44
+ runs-on:
45
+ group: aws-g4dn-2xlarge-cache
46
+ steps:
47
+ -
48
+ name: Set up Docker Buildx
49
+ uses: docker/setup-buildx-action@v2
50
+ -
51
+ name: Check out code
52
+ uses: actions/checkout@v4
53
+ -
54
+ name: Login to DockerHub
55
+ uses: docker/login-action@v2
56
+ with:
57
+ username: ${{ secrets.DOCKERHUB_USERNAME }}
58
+ password: ${{ secrets.DOCKERHUB_PASSWORD }}
59
+ -
60
+ name: Build and push
61
+ uses: docker/build-push-action@v3
62
+ with:
63
+ context: ./docker/transformers-pytorch-deepspeed-nightly-gpu
64
+ build-args: |
65
+ REF=main
66
+ push: true
67
+ tags: huggingface/transformers-pytorch-deepspeed-nightly-gpu
docs/transformers/.github/workflows/change_pr_to_draft.yml ADDED
@@ -0,0 +1,25 @@
1
+ name: Change PR to draft
+
+ on:
+ pull_request_target:
+ types: [opened, reopened]
+
+ jobs:
+ convert_pr_to_draft:
+ runs-on: ubuntu-22.04
+ name: Convert PR to draft
+ permissions:
+ pull-requests: write
+ contents: write
+ if: github.event.pull_request.draft == false
+ steps:
+ - name: Convert PR to draft
+ shell: bash
+ env:
+ PR_NUMBER: ${{ github.event.number }}
+ GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
+ REPO: ${{ github.repository }}
+ run: |
+ echo $PR_NUMBER
+ gh pr ready $PR_NUMBER --repo $REPO --undo
+ gh pr comment $PR_NUMBER --repo $REPO --body "Hi 👋, thank you for opening this pull request! The pull request is converted to draft by default. The CI will be paused while the PR is in draft mode. When it is ready for review, please click the \`Ready for review\` button (at the bottom of the PR page). This will assign reviewers and trigger CI."
docs/transformers/.github/workflows/check_failed_model_tests.yml ADDED
@@ -0,0 +1,128 @@
1
+ name: Process failed tests
2
+
3
+ on:
4
+ workflow_call:
5
+ inputs:
6
+ docker:
7
+ required: true
8
+ type: string
9
+ start_sha:
10
+ required: true
11
+ type: string
12
+
13
+
14
+ env:
15
+ HF_HOME: /mnt/cache
16
+ TRANSFORMERS_IS_CI: yes
17
+ OMP_NUM_THREADS: 8
18
+ MKL_NUM_THREADS: 8
19
+ RUN_SLOW: yes
20
+ # For gated repositories, we still need to agree to share information on the Hub repo. page in order to get access.
21
+ # This token is created under the bot `hf-transformers-bot`.
22
+ HF_HUB_READ_TOKEN: ${{ secrets.HF_HUB_READ_TOKEN }}
23
+ SIGOPT_API_TOKEN: ${{ secrets.SIGOPT_API_TOKEN }}
24
+ TF_FORCE_GPU_ALLOW_GROWTH: true
25
+ CUDA_VISIBLE_DEVICES: 0,1
26
+
27
+
28
+ jobs:
29
+ run_models_gpu:
30
+ name: " "
31
+ runs-on:
32
+ group: aws-g4dn-2xlarge-cache
33
+ container:
34
+ image: ${{ inputs.docker }}
35
+ options: --gpus all --shm-size "16gb" --ipc host -v /mnt/cache/.cache/huggingface:/mnt/cache/
36
+ steps:
37
+ - uses: actions/download-artifact@v4
38
+ with:
39
+ name: ci_results_run_models_gpu
40
+ path: /transformers/ci_results_run_models_gpu
41
+
42
+ - name: Update clone
43
+ working-directory: /transformers
44
+ run: git fetch && git checkout ${{ github.sha }}
45
+
46
+ - name: Get target commit
47
+ working-directory: /transformers/utils
48
+ run: |
49
+ echo "END_SHA=$(TOKEN=${{ secrets.ACCESS_REPO_INFO_TOKEN }} python3 -c 'import os; from get_previous_daily_ci import get_last_daily_ci_run_commit; commit=get_last_daily_ci_run_commit(token=os.environ["TOKEN"]); print(commit)')" >> $GITHUB_ENV
50
+
51
+ - name: Checkout to `start_sha`
52
+ working-directory: /transformers
53
+ run: git fetch && git checkout ${{ inputs.start_sha }}
54
+
55
+ - name: Reinstall transformers in edit mode (remove the one installed during docker image build)
56
+ working-directory: /transformers
57
+ run: python3 -m pip uninstall -y transformers && python3 -m pip install -e .
58
+
59
+ - name: NVIDIA-SMI
60
+ run: |
61
+ nvidia-smi
62
+
63
+ - name: Environment
64
+ working-directory: /transformers
65
+ run: |
66
+ python3 utils/print_env.py
67
+
68
+ - name: Show installed libraries and their versions
69
+ working-directory: /transformers
70
+ run: pip freeze
71
+
72
+ - name: Check failed tests
73
+ working-directory: /transformers
74
+ run: python3 utils/check_bad_commit.py --start_commit ${{ inputs.start_sha }} --end_commit ${{ env.END_SHA }} --file ci_results_run_models_gpu/new_model_failures.json --output_file new_model_failures_with_bad_commit.json
75
+
76
+ - name: Show results
77
+ working-directory: /transformers
78
+ run: |
79
+ ls -l new_model_failures_with_bad_commit.json
80
+ cat new_model_failures_with_bad_commit.json
81
+
82
+ - name: Checkout back
83
+ working-directory: /transformers
84
+ run: |
85
+ git checkout ${{ inputs.start_sha }}
86
+
87
+ - name: Process report
88
+ shell: bash
89
+ working-directory: /transformers
90
+ env:
91
+ TRANSFORMERS_CI_RESULTS_UPLOAD_TOKEN: ${{ secrets.TRANSFORMERS_CI_RESULTS_UPLOAD_TOKEN }}
92
+ run: |
93
+ python3 utils/process_bad_commit_report.py
94
+
95
+ - name: Process report
96
+ shell: bash
97
+ working-directory: /transformers
98
+ env:
99
+ TRANSFORMERS_CI_RESULTS_UPLOAD_TOKEN: ${{ secrets.TRANSFORMERS_CI_RESULTS_UPLOAD_TOKEN }}
100
+ run: |
101
+ {
102
+ echo 'REPORT_TEXT<<EOF'
103
+ python3 utils/process_bad_commit_report.py
104
+ echo EOF
105
+ } >> "$GITHUB_ENV"
106
+
107
+ - name: Send processed report
108
+ if: ${{ !endsWith(env.REPORT_TEXT, '{}') }}
109
+ uses: slackapi/slack-github-action@6c661ce58804a1a20f6dc5fbee7f0381b469e001
110
+ with:
111
+ # Slack channel id, channel name, or user id to post message.
112
+ # See also: https://api.slack.com/methods/chat.postMessage#channels
113
+ channel-id: '#transformers-ci-feedback-tests'
114
+ # For posting a rich message using Block Kit
115
+ payload: |
116
+ {
117
+ "blocks": [
118
+ {
119
+ "type": "section",
120
+ "text": {
121
+ "type": "mrkdwn",
122
+ "text": "${{ env.REPORT_TEXT }}"
123
+ }
124
+ }
125
+ ]
126
+ }
127
+ env:
128
+ SLACK_BOT_TOKEN: ${{ secrets.SLACK_CIFEEDBACK_BOT_TOKEN }}
docs/transformers/.github/workflows/check_tiny_models.yml ADDED
@@ -0,0 +1,82 @@
+ name: Check Tiny Models
+
+ on:
+ push:
+ branches:
+ - check_tiny_models*
+ repository_dispatch:
+ schedule:
+ - cron: "0 2 * * *"
+
+ env:
+ TOKEN: ${{ secrets.TRANSFORMERS_HUB_BOT_HF_TOKEN }}
+
+ jobs:
+ check_tiny_models:
+ name: Check tiny models
+ runs-on: ubuntu-22.04
+ steps:
+ - name: Checkout transformers
+ uses: actions/checkout@v4
+ with:
+ fetch-depth: 2
+
+ - uses: actions/checkout@v4
+ - name: Set up Python 3.8
+ uses: actions/setup-python@v5
+ with:
+ # Semantic version range syntax or exact version of a Python version
+ python-version: '3.8'
+ # Optional - x64 or x86 architecture, defaults to x64
+ architecture: 'x64'
+
+ - name: Install
+ run: |
+ sudo apt-get -y update && sudo apt-get install -y libsndfile1-dev espeak-ng cmake
+ pip install --upgrade pip
+ python -m pip install -U .[sklearn,torch,testing,sentencepiece,torch-speech,vision,timm,video,tf-cpu]
+ pip install tensorflow_probability
+ python -m pip install -U 'natten<0.15.0'
+
+ - name: Create all tiny models (locally)
+ run: |
+ python utils/create_dummy_models.py tiny_local_models --all --num_workers 2
+
+ - name: Local tiny model reports artifacts
+ if: ${{ always() }}
+ uses: actions/upload-artifact@v4
+ with:
+ name: tiny_local_model_creation_reports
+ path: tiny_local_models/reports
+
+ # GitHub-hosted runners have 2-core CPUs
+ - name: Run pipeline tests against all new (local) tiny models
+ run: |
+ OMP_NUM_THREADS=1 TRANSFORMERS_TINY_MODEL_PATH=tiny_local_models python -m pytest --max-worker-restart=0 -n 2 --dist=loadfile -s -rA --make-reports=tests_pipelines tests/models -m is_pipeline_test -k "test_pipeline_" | tee tests_output.txt
+
+ - name: Test suite reports artifacts
+ if: ${{ always() }}
+ uses: actions/upload-artifact@v4
+ with:
+ name: tiny_local_model_creation_reports
+ path: reports/tests_pipelines
+
+ - name: Create + Upload tiny models for new model architecture(s)
+ run: |
+ python utils/update_tiny_models.py --num_workers 2
+
+ - name: Full report
+ run: cat tiny_models/reports/tiny_model_creation_report.json
+
+ - name: Failure report
+ run: cat tiny_models/reports/simple_failed_report.txt
+
+ - name: Summary report
+ run: cat tiny_models/reports/tiny_model_summary.json
+
+ - name: New tiny model creation reports artifacts
+ if: ${{ always() }}
+ uses: actions/upload-artifact@v4
+ with:
+ name: tiny_model_creation_reports
+ path: tiny_models/reports
docs/transformers/.github/workflows/doctest_job.yml ADDED
@@ -0,0 +1,83 @@
+ name: Doctest job
+
+ on:
+ workflow_call:
+ inputs:
+ job_splits:
+ required: true
+ type: string
+ split_keys:
+ required: true
+ type: string
+
+ env:
+ HF_HOME: /mnt/cache
+ TRANSFORMERS_IS_CI: yes
+ RUN_SLOW: yes
+ OMP_NUM_THREADS: 16
+ MKL_NUM_THREADS: 16
+ SIGOPT_API_TOKEN: ${{ secrets.SIGOPT_API_TOKEN }}
+ TF_FORCE_GPU_ALLOW_GROWTH: true
+
+ jobs:
+ run_doctests:
+ name: " "
+ strategy:
+ max-parallel: 8 # 8 jobs at a time
+ fail-fast: false
+ matrix:
+ split_keys: ${{ fromJson(inputs.split_keys) }}
+ runs-on:
+ group: aws-g4dn-2xlarge-cache
+ container:
+ image: huggingface/transformers-all-latest-gpu
+ options: --gpus 0 --shm-size "16gb" --ipc host -v /mnt/cache/.cache/huggingface:/mnt/cache/
+ steps:
+ - name: Update clone
+ working-directory: /transformers
+ run: git fetch && git checkout ${{ github.sha }}
+
+ - name: Reinstall transformers in edit mode (remove the one installed during docker image build)
+ working-directory: /transformers
+ run: python3 -m pip uninstall -y transformers && python3 -m pip install -e .[flax]
+
+ - name: GPU visibility
+ working-directory: /transformers
+ run: |
+ python3 utils/print_env.py
+
+ - name: Show installed libraries and their versions
+ run: pip freeze
+
+ - name: Get doctest files
+ working-directory: /transformers
+ run: |
+ echo "${{ toJson(fromJson(inputs.job_splits)[matrix.split_keys]) }}" > doc_tests.txt
+ cat doc_tests.txt
+
+ - name: Set `split_keys`
+ shell: bash
+ run: |
+ echo "${{ matrix.split_keys }}"
+ split_keys=${{ matrix.split_keys }}
+ split_keys=${split_keys//'/'/'_'}
+ echo "split_keys"
+ echo "split_keys=$split_keys" >> $GITHUB_ENV
+
+ - name: Run doctests
+ working-directory: /transformers
+ run: |
+ cat doc_tests.txt
+ python3 -m pytest -v --make-reports doc_tests_gpu_${{ env.split_keys }} --doctest-modules $(cat doc_tests.txt) -sv --doctest-continue-on-failure --doctest-glob="*.md"
+
+ - name: Failure short reports
+ if: ${{ failure() }}
+ continue-on-error: true
+ run: cat /transformers/reports/doc_tests_gpu_${{ env.split_keys }}/failures_short.txt
+
+ - name: "Test suite reports artifacts: doc_tests_gpu_test_reports_${{ env.split_keys }}"
+ if: ${{ always() }}
+ uses: actions/upload-artifact@v4
+ with:
+ name: doc_tests_gpu_test_reports_${{ env.split_keys }}
+ path: /transformers/reports/doc_tests_gpu_${{ env.split_keys }}
docs/transformers/.github/workflows/model_jobs.yml ADDED
@@ -0,0 +1,142 @@
1
+ name: model jobs
2
+
3
+ on:
4
+ workflow_call:
5
+ inputs:
6
+ folder_slices:
7
+ required: true
8
+ type: string
9
+ machine_type:
10
+ required: true
11
+ type: string
12
+ slice_id:
13
+ required: true
14
+ type: number
15
+ runner:
16
+ required: true
17
+ type: string
18
+ docker:
19
+ required: true
20
+ type: string
21
+ report_name_prefix:
22
+ required: false
23
+ default: run_models_gpu
24
+ type: string
25
+
26
+ env:
27
+ HF_HOME: /mnt/cache
28
+ TRANSFORMERS_IS_CI: yes
29
+ OMP_NUM_THREADS: 8
30
+ MKL_NUM_THREADS: 8
31
+ RUN_SLOW: yes
32
+ # For gated repositories, we still need to agree to share information on the Hub repo. page in order to get access.
33
+ # This token is created under the bot `hf-transformers-bot`.
34
+ HF_HUB_READ_TOKEN: ${{ secrets.HF_HUB_READ_TOKEN }}
35
+ SIGOPT_API_TOKEN: ${{ secrets.SIGOPT_API_TOKEN }}
36
+ TF_FORCE_GPU_ALLOW_GROWTH: true
37
+ CUDA_VISIBLE_DEVICES: 0,1
38
+
39
+ jobs:
40
+ run_models_gpu:
41
+ name: " "
42
+ strategy:
43
+ max-parallel: 8
44
+ fail-fast: false
45
+ matrix:
46
+ folders: ${{ fromJson(inputs.folder_slices)[inputs.slice_id] }}
47
+ runs-on:
48
+ group: '${{ inputs.machine_type }}'
49
+ container:
50
+ image: ${{ inputs.docker }}
51
+ options: --gpus all --shm-size "16gb" --ipc host -v /mnt/cache/.cache/huggingface:/mnt/cache/
52
+ steps:
53
+ - name: Echo input and matrix info
54
+ shell: bash
55
+ run: |
56
+ echo "${{ inputs.folder_slices }}"
57
+ echo "${{ matrix.folders }}"
58
+ echo "${{ toJson(fromJson(inputs.folder_slices)[inputs.slice_id]) }}"
59
+
60
+ - name: Echo folder ${{ matrix.folders }}
61
+ shell: bash
62
+ # For folders like `models/bert`, set an env. var. (`matrix_folders`) to `models_bert`, which will be used to
63
+ # set the artifact folder names (because the character `/` is not allowed).
64
+ run: |
65
+ echo "${{ matrix.folders }}"
66
+ matrix_folders=${{ matrix.folders }}
67
+ matrix_folders=${matrix_folders/'models/'/'models_'}
68
+ echo "$matrix_folders"
69
+ echo "matrix_folders=$matrix_folders" >> $GITHUB_ENV
70
+
71
+ - name: Update clone
72
+ working-directory: /transformers
73
+ run: git fetch && git checkout ${{ github.sha }}
74
+
75
+ - name: Reinstall transformers in edit mode (remove the one installed during docker image build)
76
+ working-directory: /transformers
77
+ run: python3 -m pip uninstall -y transformers && python3 -m pip install -e .
78
+
79
+ - name: Update / Install some packages (for Past CI)
80
+ if: ${{ contains(inputs.docker, '-past-') }}
81
+ working-directory: /transformers
82
+ run: |
83
+ python3 -m pip install -U datasets
84
+
85
+ - name: Update / Install some packages (for Past CI)
86
+ if: ${{ contains(inputs.docker, '-past-') && contains(inputs.docker, '-pytorch-') }}
87
+ working-directory: /transformers
88
+ run: |
89
+ python3 -m pip install --no-cache-dir git+https://github.com/huggingface/accelerate@main#egg=accelerate
90
+
91
+ - name: NVIDIA-SMI
92
+ run: |
93
+ nvidia-smi
94
+
95
+ - name: Environment
96
+ working-directory: /transformers
97
+ run: |
98
+ python3 utils/print_env.py
99
+
100
+ - name: Show installed libraries and their versions
101
+ working-directory: /transformers
102
+ run: pip freeze
103
+
104
+ - name: Set `machine_type` for report and artifact names
105
+ working-directory: /transformers
106
+ shell: bash
107
+ run: |
108
+ echo "${{ inputs.machine_type }}"
109
+
110
+ if [ "${{ inputs.machine_type }}" = "aws-g4dn-2xlarge-cache" ]; then
111
+ machine_type=single-gpu
112
+ elif [ "${{ inputs.machine_type }}" = "aws-g4dn-12xlarge-cache" ]; then
113
+ machine_type=multi-gpu
114
+ else
115
+ machine_type=${{ inputs.machine_type }}
116
+ fi
117
+
118
+ echo "$machine_type"
119
+ echo "machine_type=$machine_type" >> $GITHUB_ENV
120
+
121
+ - name: Run all tests on GPU
122
+ working-directory: /transformers
123
+ run: python3 -m pytest -rsfE -v --make-reports=${{ env.machine_type }}_${{ inputs.report_name_prefix }}_${{ matrix.folders }}_test_reports tests/${{ matrix.folders }}
124
+
125
+ - name: Failure short reports
126
+ if: ${{ failure() }}
127
+ continue-on-error: true
128
+ run: cat /transformers/reports/${{ env.machine_type }}_${{ inputs.report_name_prefix }}_${{ matrix.folders }}_test_reports/failures_short.txt
129
+
130
+ - name: Run test
131
+ shell: bash
132
+ run: |
133
+ mkdir -p /transformers/reports/${{ env.machine_type }}_${{ inputs.report_name_prefix }}_${{ matrix.folders }}_test_reports
134
+ echo "hello" > /transformers/reports/${{ env.machine_type }}_${{ inputs.report_name_prefix }}_${{ matrix.folders }}_test_reports/hello.txt
135
+ echo "${{ env.machine_type }}_${{ inputs.report_name_prefix }}_${{ matrix.folders }}_test_reports"
136
+
137
+ - name: "Test suite reports artifacts: ${{ env.machine_type }}_${{ inputs.report_name_prefix }}_${{ env.matrix_folders }}_test_reports"
138
+ if: ${{ always() }}
139
+ uses: actions/upload-artifact@v4
140
+ with:
141
+ name: ${{ env.machine_type }}_${{ inputs.report_name_prefix }}_${{ env.matrix_folders }}_test_reports
142
+ path: /transformers/reports/${{ env.machine_type }}_${{ inputs.report_name_prefix }}_${{ matrix.folders }}_test_reports
docs/transformers/.github/workflows/model_jobs_amd.yml ADDED
@@ -0,0 +1,128 @@
1
+ name: model jobs
2
+
3
+ on:
4
+ workflow_call:
5
+ inputs:
6
+ folder_slices:
7
+ required: true
8
+ type: string
9
+ machine_type:
10
+ required: true
11
+ type: string
12
+ slice_id:
13
+ required: true
14
+ type: number
15
+ runner:
16
+ required: true
17
+ type: string
18
+ docker:
19
+ required: true
20
+ type: string
21
+
22
+ env:
23
+ HF_HOME: /mnt/cache
24
+ TRANSFORMERS_IS_CI: yes
25
+ OMP_NUM_THREADS: 8
26
+ MKL_NUM_THREADS: 8
27
+ RUN_SLOW: yes
28
+ # For gated repositories, we still need to agree to share information on the Hub repo. page in order to get access.
29
+ # This token is created under the bot `hf-transformers-bot`.
30
+ HF_HUB_READ_TOKEN: ${{ secrets.HF_HUB_READ_TOKEN }}
31
+ SIGOPT_API_TOKEN: ${{ secrets.SIGOPT_API_TOKEN }}
32
+ TF_FORCE_GPU_ALLOW_GROWTH: true
33
+ CUDA_VISIBLE_DEVICES: 0,1
34
+
35
+ jobs:
36
+ run_models_gpu:
37
+ name: " "
38
+ strategy:
39
+ max-parallel: 1 # For now, not to parallelize. Can change later if it works well.
40
+ fail-fast: false
41
+ matrix:
42
+ folders: ${{ fromJson(inputs.folder_slices)[inputs.slice_id] }}
43
+ runs-on: ['${{ inputs.machine_type }}', self-hosted, amd-gpu, '${{ inputs.runner }}']
44
+ container:
45
+ image: ${{ inputs.docker }}
46
+ options: --device /dev/kfd --device /dev/dri --env ROCR_VISIBLE_DEVICES --shm-size "16gb" --ipc host -v /mnt/cache/.cache/huggingface:/mnt/cache/
47
+ steps:
48
+ - name: Echo input and matrix info
49
+ shell: bash
50
+ run: |
51
+ echo "${{ inputs.folder_slices }}"
52
+ echo "${{ matrix.folders }}"
53
+ echo "${{ toJson(fromJson(inputs.folder_slices)[inputs.slice_id]) }}"
54
+
55
+ - name: Echo folder ${{ matrix.folders }}
56
+ shell: bash
57
+ # For folders like `models/bert`, set an env. var. (`matrix_folders`) to `models_bert`, which will be used to
58
+ # set the artifact folder names (because the character `/` is not allowed).
59
+ run: |
60
+ echo "${{ matrix.folders }}"
61
+ matrix_folders=${{ matrix.folders }}
62
+ matrix_folders=${matrix_folders/'models/'/'models_'}
63
+ echo "$matrix_folders"
64
+ echo "matrix_folders=$matrix_folders" >> $GITHUB_ENV
65
+
66
+ - name: Update clone
67
+ working-directory: /transformers
68
+ run: git fetch && git checkout ${{ github.sha }}
69
+
70
+ - name: Reinstall transformers in edit mode (remove the one installed during docker image build)
71
+ working-directory: /transformers
72
+ run: python3 -m pip uninstall -y transformers && python3 -m pip install -e .
73
+
74
+ - name: Update / Install some packages (for Past CI)
75
+ if: ${{ contains(inputs.docker, '-past-') }}
76
+ working-directory: /transformers
77
+ run: |
78
+ python3 -m pip install -U datasets
79
+
80
+ - name: Update / Install some packages (for Past CI)
81
+ if: ${{ contains(inputs.docker, '-past-') && contains(inputs.docker, '-pytorch-') }}
82
+ working-directory: /transformers
83
+ run: |
84
+ python3 -m pip install --no-cache-dir git+https://github.com/huggingface/accelerate@main#egg=accelerate
85
+
86
+ - name: ROCM-SMI
87
+ run: |
88
+ rocm-smi
89
+
90
+ - name: ROCM-INFO
91
+ run: |
92
+ rocminfo | grep "Agent" -A 14
93
+
94
+ - name: Show ROCR environment
95
+ run: |
96
+ echo "ROCR: $ROCR_VISIBLE_DEVICES"
97
+
98
+ - name: Environment
99
+ working-directory: /transformers
100
+ run: |
101
+ python3 utils/print_env.py
102
+
103
+ - name: Show installed libraries and their versions
104
+ working-directory: /transformers
105
+ run: pip freeze
106
+
107
+ - name: Run all tests on GPU
108
+ working-directory: /transformers
109
+ run: python3 -m pytest -rsfE -v --make-reports=${{ inputs.machine_type }}_run_models_gpu_${{ matrix.folders }}_test_reports tests/${{ matrix.folders }} -m "not not_device_test"
110
+
111
+ - name: Failure short reports
112
+ if: ${{ failure() }}
113
+ continue-on-error: true
114
+ run: cat /transformers/reports/${{ inputs.machine_type }}_run_models_gpu_${{ matrix.folders }}_test_reports/failures_short.txt
115
+
116
+ - name: Run test
117
+ shell: bash
118
+ run: |
119
+ mkdir -p /transformers/reports/${{ inputs.machine_type }}_run_models_gpu_${{ matrix.folders }}_test_reports
120
+ echo "hello" > /transformers/reports/${{ inputs.machine_type }}_run_models_gpu_${{ matrix.folders }}_test_reports/hello.txt
121
+ echo "${{ inputs.machine_type }}_run_models_gpu_${{ matrix.folders }}_test_reports"
122
+
123
+ - name: "Test suite reports artifacts: ${{ inputs.machine_type }}_run_models_gpu_${{ env.matrix_folders }}_test_reports"
124
+ if: ${{ always() }}
125
+ uses: actions/upload-artifact@v4
126
+ with:
127
+ name: ${{ inputs.machine_type }}_run_models_gpu_${{ env.matrix_folders }}_test_reports
128
+ path: /transformers/reports/${{ inputs.machine_type }}_run_models_gpu_${{ matrix.folders }}_test_reports
docs/transformers/.github/workflows/new_model_pr_merged_notification.yml ADDED
@@ -0,0 +1,68 @@
+ # Used to notify core maintainers about new model PR being merged
+ name: New model PR merged notification
+
+ on:
+ push:
+ branches:
+ - main
+ paths:
+ - 'src/transformers/models/*/modeling_*'
+
+ jobs:
+ notify_new_model:
+ name: Notify new model
+ runs-on: ubuntu-22.04
+ steps:
+ - uses: actions/checkout@v4
+ with:
+ fetch-depth: 0
+ - name: Check new model
+ shell: bash
+ run: |
+ python -m pip install gitpython
+ python -c 'from utils.pr_slow_ci_models import get_new_model; new_model = get_new_model(diff_with_last_commit=True); print(new_model)' | tee output.txt
+ echo "NEW_MODEL=$(tail -n 1 output.txt)" >> $GITHUB_ENV
+ echo "COMMIT_SHA=$(git log -1 --format=%H)" >> $GITHUB_ENV
+
+ - name: print commit sha
+ if: ${{ env.NEW_MODEL != ''}}
+ shell: bash
+ run: |
+ echo "$COMMIT_SHA"
+
+ - name: print new model
+ if: ${{ env.NEW_MODEL != ''}}
+ shell: bash
+ run: |
+ echo "$NEW_MODEL"
+
+ - name: Notify
+ if: ${{ env.NEW_MODEL != ''}}
+ uses: slackapi/slack-github-action@6c661ce58804a1a20f6dc5fbee7f0381b469e001
+ with:
+ # Slack channel id, channel name, or user id to post message.
+ # See also: https://api.slack.com/methods/chat.postMessage#channels
+ channel-id: transformers-new-model-notification
+ # For posting a rich message using Block Kit
+ payload: |
+ {
+ "blocks": [
+ {
+ "type": "header",
+ "text": {
+ "type": "plain_text",
+ "text": "New model!",
+ "emoji": true
+ }
+ },
+ {
+ "type": "section",
+ "text": {
+ "type": "mrkdwn",
+ "text": "<https://github.com/huggingface/transformers/commit/${{ env.COMMIT_SHA }}|New model: ${{ env.NEW_MODEL }}> GH_ArthurZucker, GH_lysandrejik, GH_ydshieh"
+ }
+ }
+ ]
+ }
+ env:
+ SLACK_BOT_TOKEN: ${{ secrets.SLACK_CIFEEDBACK_BOT_TOKEN }}
docs/transformers/.github/workflows/push-important-models.yml ADDED
@@ -0,0 +1,135 @@
1
+ name: Slow tests on important models (on Push - A10)
2
+
3
+ on:
4
+ push:
5
+ branches: [ main ]
6
+
7
+ env:
8
+ OUTPUT_SLACK_CHANNEL_ID: "C06L2SGMEEA"
9
+ HF_HUB_READ_TOKEN: ${{ secrets.HF_HUB_READ_TOKEN }}
10
+ HF_HOME: /mnt/cache
11
+ TRANSFORMERS_IS_CI: yes
12
+ OMP_NUM_THREADS: 8
13
+ MKL_NUM_THREADS: 8
14
+ RUN_SLOW: yes # For gated repositories, we still need to agree to share information on the Hub repo. page in order to get access. # This token is created under the bot `hf-transformers-bot`.
15
+ SIGOPT_API_TOKEN: ${{ secrets.SIGOPT_API_TOKEN }}
16
+ TF_FORCE_GPU_ALLOW_GROWTH: true
17
+
18
+ jobs:
19
+ get_modified_models:
20
+ name: "Get all modified files"
21
+ runs-on: ubuntu-latest
22
+ outputs:
23
+ matrix: ${{ steps.set-matrix.outputs.matrix }}
24
+ steps:
25
+ - name: Check out code
26
+ uses: actions/checkout@v4
27
+
28
+ - name: Get changed files
29
+ id: changed-files
30
+ uses: tj-actions/changed-files@1c8e6069583811afb28f97afeaf8e7da80c6be5c
31
+ with:
32
+ files: src/transformers/models/**
33
+
34
+ - name: Run step if only the files listed above change
35
+ if: steps.changed-files.outputs.any_changed == 'true'
36
+ id: set-matrix
37
+ env:
38
+ ALL_CHANGED_FILES: ${{ steps.changed-files.outputs.all_changed_files }}
39
+ run: |
40
+ model_arrays=()
41
+ for file in $ALL_CHANGED_FILES; do
42
+ model_path="${file#*models/}"
43
+ model_path="models/${model_path%%/*}"
44
+ if grep -qFx "$model_path" utils/important_models.txt; then
45
+ # Append the file to the matrix string
46
+ model_arrays+=("$model_path")
47
+ fi
48
+ done
49
+ matrix_string=$(printf '"%s", ' "${model_arrays[@]}" | sed 's/, $//')
50
+ echo "matrix=[$matrix_string]" >> $GITHUB_OUTPUT
51
+ test_modified_files:
52
+ needs: get_modified_models
53
+ name: Slow & FA2 tests
54
+ runs-on:
55
+ group: aws-g5-4xlarge-cache
56
+ container:
57
+ image: huggingface/transformers-all-latest-gpu
58
+ options: --gpus all --privileged --ipc host -v /mnt/cache/.cache/huggingface:/mnt/cache/
59
+ if: ${{ needs.get_modified_models.outputs.matrix != '[]' && needs.get_modified_models.outputs.matrix != '' && fromJson(needs.get_modified_models.outputs.matrix)[0] != null }}
60
+ strategy:
61
+ fail-fast: false
62
+ matrix:
63
+ model-name: ${{ fromJson(needs.get_modified_models.outputs.matrix) }}
64
+
65
+ steps:
66
+ - name: Check out code
67
+ uses: actions/checkout@v4
68
+
69
+ - name: Install locally transformers & other libs
70
+ run: |
71
+ apt install sudo
72
+ sudo -H pip install --upgrade pip
73
+ sudo -H pip uninstall -y transformers
74
+ sudo -H pip install -U -e ".[testing]"
75
+ MAX_JOBS=4 pip install flash-attn --no-build-isolation
76
+ pip install bitsandbytes
77
+
78
+ - name: NVIDIA-SMI
79
+ run: |
80
+ nvidia-smi
81
+
82
+ - name: Show installed libraries and their versions
83
+ run: pip freeze
84
+
85
+ - name: Run FA2 tests
86
+ id: run_fa2_tests
87
+ run:
88
+ pytest -rsfE -m "flash_attn_test" --make-reports=${{ matrix.model-name }}_fa2_tests/ tests/${{ matrix.model-name }}/test_modeling_*
89
+
90
+ - name: "Test suite reports artifacts: ${{ matrix.model-name }}_fa2_tests"
91
+ if: ${{ always() }}
92
+ uses: actions/upload-artifact@v4
93
+ with:
94
+ name: ${{ matrix.model-name }}_fa2_tests
95
+ path: /transformers/reports/${{ matrix.model-name }}_fa2_tests
96
+
97
+ - name: Post to Slack
98
+ if: always()
99
+ uses: huggingface/hf-workflows/.github/actions/post-slack@main
100
+ with:
101
+ slack_channel: ${{ env.OUTPUT_SLACK_CHANNEL_ID }}
102
+ title: 🤗 Results of the FA2 tests - ${{ matrix.model-name }}
103
+ status: ${{ steps.run_fa2_tests.conclusion}}
104
+ slack_token: ${{ secrets.CI_SLACK_BOT_TOKEN }}
105
+
106
+ - name: Run integration tests
107
+ id: run_integration_tests
108
+ if: always()
109
+ run:
110
+ pytest -rsfE -k "IntegrationTest" --make-reports=tests_integration_${{ matrix.model-name }} tests/${{ matrix.model-name }}/test_modeling_*
111
+
112
+ - name: "Test suite reports artifacts: tests_integration_${{ matrix.model-name }}"
113
+ if: ${{ always() }}
114
+ uses: actions/upload-artifact@v4
115
+ with:
116
+ name: tests_integration_${{ matrix.model-name }}
117
+ path: /transformers/reports/tests_integration_${{ matrix.model-name }}
118
+
119
+ - name: Post to Slack
120
+ if: always()
121
+ uses: huggingface/hf-workflows/.github/actions/post-slack@main
122
+ with:
123
+ slack_channel: ${{ env.OUTPUT_SLACK_CHANNEL_ID }}
124
+ title: 🤗 Results of the Integration tests - ${{ matrix.model-name }}
125
+ status: ${{ steps.run_integration_tests.conclusion}}
126
+ slack_token: ${{ secrets.CI_SLACK_BOT_TOKEN }}
127
+
128
+ - name: Tailscale # In order to be able to SSH when a test fails
129
+ if: ${{ runner.debug == '1'}}
130
+ uses: huggingface/tailscale-action@v1
131
+ with:
132
+ authkey: ${{ secrets.TAILSCALE_SSH_AUTHKEY }}
133
+ slackChannel: ${{ secrets.SLACK_CIFEEDBACK_CHANNEL }}
134
+ slackToken: ${{ secrets.SLACK_CIFEEDBACK_BOT_TOKEN }}
135
+ waitForSSH: true
docs/transformers/.github/workflows/release-conda.yml ADDED
@@ -0,0 +1,47 @@
+ name: Release - Conda
+
+ on:
+ push:
+ tags:
+ - v*
+ branches:
+ - conda_*
+
+ env:
+ ANACONDA_API_TOKEN: ${{ secrets.ANACONDA_API_TOKEN }}
+
+ jobs:
+ build_and_package:
+ runs-on: ubuntu-22.04
+ defaults:
+ run:
+ shell: bash -l {0}
+
+ steps:
+ - name: Checkout repository
+ uses: actions/checkout@v4
+
+ - name: Install miniconda
+ uses: conda-incubator/setup-miniconda@v2
+ with:
+ auto-update-conda: true
+ auto-activate-base: false
+ python-version: 3.8
+ activate-environment: "build-transformers"
+ channels: huggingface
+
+ - name: Setup conda env
+ run: |
+ conda install -c defaults anaconda-client conda-build
+
+ - name: Extract version
+ run: echo "TRANSFORMERS_VERSION=`python setup.py --version`" >> $GITHUB_ENV
+
+ - name: Build conda packages
+ run: |
+ conda info
+ conda list
+ conda-build .github/conda
+
+ - name: Upload to Anaconda
+ run: anaconda upload `conda-build .github/conda --output` --force
docs/transformers/.github/workflows/self-comment-ci.yml ADDED
@@ -0,0 +1,416 @@
1
+ name: PR comment GitHub CI
2
+
3
+ on:
4
+ issue_comment:
5
+ types:
6
+ - created
7
+ branches-ignore:
8
+ - main
9
+ concurrency:
10
+ group: ${{ github.workflow }}-${{ github.event.issue.number }}-${{ startsWith(github.event.comment.body, 'run-slow') || startsWith(github.event.comment.body, 'run slow') || startsWith(github.event.comment.body, 'run_slow') }}
11
+ cancel-in-progress: true
12
+ permissions: read-all
13
+
14
+ env:
15
+ HF_HOME: /mnt/cache
16
+ TRANSFORMERS_IS_CI: yes
17
+ OMP_NUM_THREADS: 8
18
+ MKL_NUM_THREADS: 8
19
+ RUN_SLOW: yes
20
+ # For gated repositories, we still need to agree to share information on the Hub repo. page in order to get access.
21
+ # This token is created under the bot `hf-transformers-bot`.
22
+ HF_HUB_READ_TOKEN: ${{ secrets.HF_HUB_READ_TOKEN }}
23
+ SIGOPT_API_TOKEN: ${{ secrets.SIGOPT_API_TOKEN }}
24
+ TF_FORCE_GPU_ALLOW_GROWTH: true
25
+ CUDA_VISIBLE_DEVICES: 0,1
26
+
27
+ jobs:
28
+ get-pr-number:
29
+ runs-on: ubuntu-22.04
30
+ name: Get PR number
31
+ # For security: only allow team members to run
32
+ if: ${{ github.event.issue.state == 'open' && contains(fromJSON('["ydshieh", "ArthurZucker", "zucchini-nlp", "qubvel", "molbap", "gante", "LysandreJik", "Cyrilvallez", "Rocketknight1", "SunMarc", "muellerzr", "eustlb", "MekkCyber"]'), github.actor) && (startsWith(github.event.comment.body, 'run-slow') || startsWith(github.event.comment.body, 'run slow') || startsWith(github.event.comment.body, 'run_slow')) }}
33
+ outputs:
34
+ PR_NUMBER: ${{ steps.set_pr_number.outputs.PR_NUMBER }}
35
+ steps:
36
+ - name: Get PR number
37
+ shell: bash
38
+ run: |
39
+ if [[ "${{ github.event.issue.number }}" != "" && "${{ github.event.issue.pull_request }}" != "" ]]; then
40
+ echo "PR_NUMBER=${{ github.event.issue.number }}" >> $GITHUB_ENV
41
+ else
42
+ echo "PR_NUMBER=" >> $GITHUB_ENV
43
+ fi
44
+
45
+ - name: Check PR number
46
+ shell: bash
47
+ run: |
48
+ echo "${{ env.PR_NUMBER }}"
49
+
50
+ - name: Set PR number
51
+ id: set_pr_number
52
+ run: echo "PR_NUMBER=${{ env.PR_NUMBER }}" >> "$GITHUB_OUTPUT"
53
+
54
+ get-sha:
55
+ runs-on: ubuntu-22.04
56
+ needs: get-pr-number
57
+ if: ${{ needs.get-pr-number.outputs.PR_NUMBER != ''}}
58
+ outputs:
59
+ PR_HEAD_SHA: ${{ steps.get_sha.outputs.PR_HEAD_SHA }}
60
+ PR_MERGE_SHA: ${{ steps.get_sha.outputs.PR_MERGE_SHA }}
61
+ steps:
62
+ - uses: actions/checkout@v4
63
+ with:
64
+ fetch-depth: "0"
65
+ ref: "refs/pull/${{needs.get-pr-number.outputs.PR_NUMBER}}/merge"
66
+
67
+ - name: Get SHA (and verify timestamps against the issue comment date)
68
+ id: get_sha
69
+ env:
70
+ PR_NUMBER: ${{ needs.get-pr-number.outputs.PR_NUMBER }}
71
+ COMMENT_DATE: ${{ github.event.comment.created_at }}
72
+ run: |
73
+ git fetch origin refs/pull/$PR_NUMBER/head:refs/remotes/pull/$PR_NUMBER/head
74
+ git checkout refs/remotes/pull/$PR_NUMBER/head
75
+ echo "PR_HEAD_SHA: $(git log -1 --format=%H)"
76
+ echo "PR_HEAD_SHA=$(git log -1 --format=%H)" >> "$GITHUB_OUTPUT"
77
+ git fetch origin refs/pull/$PR_NUMBER/merge:refs/remotes/pull/$PR_NUMBER/merge
78
+ git checkout refs/remotes/pull/$PR_NUMBER/merge
79
+ echo "PR_MERGE_SHA: $(git log -1 --format=%H)"
80
+ echo "PR_MERGE_SHA=$(git log -1 --format=%H)" >> "$GITHUB_OUTPUT"
81
+ PR_MERGE_COMMIT_TIMESTAMP=$(git log -1 --date=unix --format=%cd)
82
+ echo "PR_MERGE_COMMIT_TIMESTAMP: $PR_MERGE_COMMIT_TIMESTAMP"
83
+ COMMENT_TIMESTAMP=$(date -d "${COMMENT_DATE}" +"%s")
84
+ echo "COMMENT_DATE: $COMMENT_DATE"
85
+ echo "COMMENT_TIMESTAMP: $COMMENT_TIMESTAMP"
86
+ if [ $COMMENT_TIMESTAMP -le $PR_MERGE_COMMIT_TIMESTAMP ]; then
87
+ echo "Last commit on the pull request is newer than the issue comment triggering this run! Abort!";
88
+ exit -1;
89
+ fi
90
+
91
+ # use a python script to handle this complex logic
92
+ # case 1: `run-slow` (auto. infer with limited number of models, but in particular, new model)
93
+ # case 2: `run-slow model_1, model_2`
94
+ get-tests:
95
+ runs-on: ubuntu-22.04
96
+ needs: [get-pr-number, get-sha]
97
+ if: ${{ needs.get-pr-number.outputs.PR_NUMBER != ''}}
98
+ outputs:
99
+ models: ${{ steps.models_to_run.outputs.models }}
100
+ quantizations: ${{ steps.models_to_run.outputs.quantizations }}
101
+ steps:
102
+ - uses: actions/checkout@v4
103
+ with:
104
+ fetch-depth: "0"
105
+ ref: "refs/pull/${{needs.get-pr-number.outputs.PR_NUMBER}}/merge"
106
+
107
+ - name: Verify merge commit SHA
108
+ env:
109
+ VERIFIED_PR_MERGE_SHA: ${{ needs.get-sha.outputs.PR_MERGE_SHA }}
110
+ run: |
111
+ PR_MERGE_SHA=$(git log -1 --format=%H)
112
+ if [ $PR_MERGE_SHA != $VERIFIED_PR_MERGE_SHA ]; then
113
+ echo "The merged commit SHA is not the same as the verified one! Security issue detected, abort the workflow!";
114
+ exit -1;
115
+ fi
116
+
117
+ - name: Get models to test
118
+ env:
119
+ PR_COMMENT: ${{ github.event.comment.body }}
120
+ run: |
121
+ python -m pip install GitPython
122
+ python utils/pr_slow_ci_models.py --message "$PR_COMMENT" | tee output.txt
123
+ echo "models=$(tail -n 1 output.txt)" >> $GITHUB_ENV
124
+ python utils/pr_slow_ci_models.py --message "$PR_COMMENT" --quantization | tee output2.txt
125
+ echo "quantizations=$(tail -n 1 output2.txt)" >> $GITHUB_ENV
126
+
127
+ - name: Show models to test
128
+ id: models_to_run
129
+ run: |
130
+ echo "${{ env.models }}"
131
+ echo "models=${{ env.models }}" >> $GITHUB_ENV
132
+ echo "models=${{ env.models }}" >> $GITHUB_OUTPUT
133
+ echo "${{ env.quantizations }}"
134
+ echo "quantizations=${{ env.quantizations }}" >> $GITHUB_OUTPUT
135
+
136
+ reply_to_comment:
137
+ name: Reply to the comment
138
+ if: ${{ needs.get-tests.outputs.models != '[]' || needs.get-tests.outputs.quantizations != '[]' }}
139
+ needs: [get-pr-number, get-tests]
140
+ permissions:
141
+ pull-requests: write
142
+ runs-on: ubuntu-22.04
143
+ steps:
144
+ - name: Reply to the comment
145
+ env:
146
+ GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
147
+ MODELS: ${{ needs.get-tests.outputs.models }}
148
+ BODY: "This comment contains run-slow, running the specified jobs:\n\nmodels: ${{ needs.get-tests.outputs.models }}\nquantizations: ${{ needs.get-tests.outputs.quantizations }}"
149
+ run: |
150
+ gh api \
151
+ --method POST \
152
+ -H "Accept: application/vnd.github+json" \
153
+ -H "X-GitHub-Api-Version: 2022-11-28" \
154
+ repos/${{ github.repository }}/issues/${{ needs.get-pr-number.outputs.PR_NUMBER }}/comments \
155
+ -f "body=This comment contains run-slow, running the specified jobs: ${{ env.BODY }} ..."
156
+
157
+ create_run:
158
+ name: Create run
159
+ if: ${{ needs.get-tests.outputs.models != '[]' || needs.get-tests.outputs.quantizations != '[]' }}
160
+ needs: [get-sha, get-tests, reply_to_comment]
161
+ permissions:
162
+ statuses: write
163
+ runs-on: ubuntu-22.04
164
+ steps:
165
+ - name: Create Run
166
+ id: create_run
167
+ env:
168
+ GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
169
+ # Create a commit status (pending) for a run of this workflow. The status has to be updated later in `update_run_status`.
170
+ # See https://docs.github.com/en/rest/commits/statuses?apiVersion=2022-11-28#create-a-commit-status
171
+ GITHUB_RUN_URL: https://github.com/${{ github.repository }}/actions/runs/${{ github.run_id }}
172
+ run: |
173
+ gh api \
174
+ --method POST \
175
+ -H "Accept: application/vnd.github+json" \
176
+ -H "X-GitHub-Api-Version: 2022-11-28" \
177
+ repos/${{ github.repository }}/statuses/${{ needs.get-sha.outputs.PR_HEAD_SHA }} \
178
+ -f "target_url=$GITHUB_RUN_URL" -f "state=pending" -f "description=Slow CI job" -f "context=pytest/custom-tests"
179
+
180
+ run_models_gpu:
181
+ name: Run all tests for the model
182
+ if: ${{ needs.get-tests.outputs.models != '[]' }}
183
+ needs: [get-pr-number, get-sha, get-tests, create_run]
184
+ strategy:
185
+ fail-fast: false
186
+ matrix:
187
+ folders: ${{ fromJson(needs.get-tests.outputs.models) }}
188
+ machine_type: [aws-g4dn-2xlarge-cache, aws-g4dn-12xlarge-cache]
189
+ runs-on:
190
+ group: '${{ matrix.machine_type }}'
191
+ container:
192
+ image: huggingface/transformers-all-latest-gpu
193
+ options: --gpus all --shm-size "16gb" --ipc host -v /mnt/cache/.cache/huggingface:/mnt/cache/
194
+ steps:
195
+ - name: Echo input and matrix info
196
+ shell: bash
197
+ run: |
198
+ echo "${{ matrix.folders }}"
199
+
200
+ - name: Echo folder ${{ matrix.folders }}
201
+ shell: bash
202
+ # For folders like `models/bert`, set an env. var. (`matrix_folders`) to `models_bert`, which will be used to
203
+ # set the artifact folder names (because the character `/` is not allowed).
204
+ run: |
205
+ echo "${{ matrix.folders }}"
206
+ matrix_folders=${{ matrix.folders }}
207
+ matrix_folders=${matrix_folders/'models/'/'models_'}
208
+ echo "$matrix_folders"
209
+ echo "matrix_folders=$matrix_folders" >> $GITHUB_ENV
210
+
211
+ - name: Checkout to PR merge commit
212
+ working-directory: /transformers
213
+ run: |
214
+ git fetch origin refs/pull/${{ needs.get-pr-number.outputs.PR_NUMBER }}/merge:refs/remotes/pull/${{ needs.get-pr-number.outputs.PR_NUMBER }}/merge
215
+ git checkout refs/remotes/pull/${{ needs.get-pr-number.outputs.PR_NUMBER }}/merge
216
+ git log -1 --format=%H
217
+
218
+ - name: Verify merge commit SHA
219
+ env:
220
+ VERIFIED_PR_MERGE_SHA: ${{ needs.get-sha.outputs.PR_MERGE_SHA }}
221
+ working-directory: /transformers
222
+ run: |
223
+ PR_MERGE_SHA=$(git log -1 --format=%H)
224
+ if [ $PR_MERGE_SHA != $VERIFIED_PR_MERGE_SHA ]; then
225
+ echo "The merged commit SHA is not the same as the verified one! Security issue detected, abort the workflow!";
226
+ exit -1;
227
+ fi
228
+
229
+ - name: Reinstall transformers in edit mode (remove the one installed during docker image build)
230
+ working-directory: /transformers
231
+ run: python3 -m pip uninstall -y transformers && python3 -m pip install -e .
232
+
233
+ - name: NVIDIA-SMI
234
+ run: |
235
+ nvidia-smi
236
+
237
+ - name: Set `machine_type` for report and artifact names
238
+ working-directory: /transformers
239
+ shell: bash
240
+ run: |
241
+ echo "${{ matrix.machine_type }}"
242
+ if [ "${{ matrix.machine_type }}" = "aws-g4dn-2xlarge-cache" ]; then
243
+ machine_type=single-gpu
244
+ elif [ "${{ matrix.machine_type }}" = "aws-g4dn-12xlarge-cache" ]; then
245
+ machine_type=multi-gpu
246
+ else
247
+ machine_type=${{ matrix.machine_type }}
248
+ fi
249
+ echo "$machine_type"
250
+ echo "machine_type=$machine_type" >> $GITHUB_ENV
251
+
252
+ - name: Environment
253
+ working-directory: /transformers
254
+ run: |
255
+ python3 utils/print_env.py
256
+
257
+ - name: Show installed libraries and their versions
258
+ working-directory: /transformers
259
+ run: pip freeze
260
+
261
+ - name: Run all tests on GPU
262
+ working-directory: /transformers
263
+ run: |
264
+ export CUDA_VISIBLE_DEVICES="$(python3 utils/set_cuda_devices_for_ci.py --test_folder ${{ matrix.folders }})"
265
+ echo $CUDA_VISIBLE_DEVICES
266
+ python3 -m pytest -v -rsfE --make-reports=${{ env.machine_type }}_run_models_gpu_${{ matrix.folders }}_test_reports tests/${{ matrix.folders }}
267
+
268
+ - name: Failure short reports
269
+ if: ${{ failure() }}
270
+ continue-on-error: true
271
+ run: cat /transformers/reports/${{ env.machine_type }}_run_models_gpu_${{ matrix.folders }}_test_reports/failures_short.txt
272
+
273
+ - name: Make sure report directory exists
274
+ shell: bash
275
+ run: |
276
+ mkdir -p /transformers/reports/${{ env.machine_type }}_run_models_gpu_${{ matrix.folders }}_test_reports
277
+ echo "hello" > /transformers/reports/${{ env.machine_type }}_run_models_gpu_${{ matrix.folders }}_test_reports/hello.txt
278
+ echo "${{ env.machine_type }}_run_models_gpu_${{ matrix.folders }}_test_reports"
279
+
280
+ - name: "Test suite reports artifacts: ${{ env.machine_type }}_run_models_gpu_${{ env.matrix_folders }}_test_reports"
281
+ if: ${{ always() }}
282
+ uses: actions/upload-artifact@v4
283
+ with:
284
+ name: ${{ env.machine_type }}_run_models_gpu_${{ env.matrix_folders }}_test_reports
285
+ path: /transformers/reports/${{ env.machine_type }}_run_models_gpu_${{ matrix.folders }}_test_reports
286
+
287
+ run_quantization_torch_gpu:
288
+ name: Run all tests for a quantization
289
+ if: ${{ needs.get-tests.outputs.quantizations != '[]' }}
290
+ needs: [get-pr-number, get-sha, get-tests, create_run]
291
+ strategy:
292
+ fail-fast: false
293
+ matrix:
294
+ folders: ${{ fromJson(needs.get-tests.outputs.quantizations) }}
295
+ machine_type: [aws-g4dn-2xlarge-cache, aws-g4dn-12xlarge-cache]
296
+ runs-on:
297
+ group: '${{ matrix.machine_type }}'
298
+ container:
299
+ image: huggingface/transformers-quantization-latest-gpu
300
+ options: --gpus all --shm-size "16gb" --ipc host -v /mnt/cache/.cache/huggingface:/mnt/cache/
301
+ steps:
302
+ - name: Echo folder ${{ matrix.folders }}
303
+ shell: bash
304
+ run: |
305
+ echo "${{ matrix.folders }}"
306
+ matrix_folders=${{ matrix.folders }}
307
+ matrix_folders=${matrix_folders/'quantization/'/'quantization_'}
308
+ echo "$matrix_folders"
309
+ echo "matrix_folders=$matrix_folders" >> $GITHUB_ENV
310
+
311
+ - name: Checkout to PR merge commit
312
+ working-directory: /transformers
313
+ run: |
314
+ git fetch origin refs/pull/${{ needs.get-pr-number.outputs.PR_NUMBER }}/merge:refs/remotes/pull/${{ needs.get-pr-number.outputs.PR_NUMBER }}/merge
315
+ git checkout refs/remotes/pull/${{ needs.get-pr-number.outputs.PR_NUMBER }}/merge
316
+ git log -1 --format=%H
317
+
318
+ - name: Verify merge commit SHA
319
+ env:
320
+ VERIFIED_PR_MERGE_SHA: ${{ needs.get-sha.outputs.PR_MERGE_SHA }}
321
+ working-directory: /transformers
322
+ run: |
323
+ PR_MERGE_SHA=$(git log -1 --format=%H)
324
+ if [ $PR_MERGE_SHA != $VERIFIED_PR_MERGE_SHA ]; then
325
+ echo "The merged commit SHA is not the same as the verified one! Security issue detected, abort the workflow!";
326
+ exit -1;
327
+ fi
328
+
329
+ - name: Reinstall transformers in edit mode (remove the one installed during docker image build)
330
+ working-directory: /transformers
331
+ run: python3 -m pip uninstall -y transformers && python3 -m pip install -e .
332
+ - name: NVIDIA-SMI
333
+ run: |
334
+ nvidia-smi
335
+
336
+ - name: Set `machine_type` for report and artifact names
337
+ working-directory: /transformers
338
+ shell: bash
339
+ run: |
340
+ echo "${{ matrix.machine_type }}"
341
+ if [ "${{ matrix.machine_type }}" = "aws-g4dn-2xlarge-cache" ]; then
342
+ machine_type=single-gpu
343
+ elif [ "${{ matrix.machine_type }}" = "aws-g4dn-12xlarge-cache" ]; then
344
+ machine_type=multi-gpu
345
+ else
346
+ machine_type=${{ matrix.machine_type }}
347
+ fi
348
+ echo "$machine_type"
349
+ echo "machine_type=$machine_type" >> $GITHUB_ENV
350
+
351
+ - name: Environment
352
+ working-directory: /transformers
353
+ run: |
354
+ python3 utils/print_env.py
355
+
356
+ - name: Show installed libraries and their versions
357
+ working-directory: /transformers
358
+ run: pip freeze
359
+
360
+ - name: Run quantization tests on GPU
361
+ working-directory: /transformers
362
+ run: |
363
+ python3 -m pytest -v --make-reports=${{ env.machine_type }}_run_quantization_torch_gpu_${{ matrix.folders }}_test_reports tests/${{ matrix.folders }}
364
+
365
+ - name: Failure short reports
366
+ if: ${{ failure() }}
367
+ continue-on-error: true
368
+ run: cat /transformers/reports/${{ env.machine_type }}_run_quantization_torch_gpu_${{ matrix.folders }}_test_reports/failures_short.txt
369
+
370
+ - name: Make sure report directory exists
371
+ shell: bash
372
+ run: |
373
+ mkdir -p /transformers/reports/${{ env.machine_type }}_run_quantization_gpu_${{ matrix.folders }}_test_reports
374
+ echo "hello" > /transformers/reports/${{ env.machine_type }}_run_quantization_gpu_${{ matrix.folders }}_test_reports/hello.txt
375
+ echo "${{ env.machine_type }}_run_quantization_gpu_${{ matrix.folders }}_test_reports"
376
+
377
+ - name: "Test suite reports artifacts: ${{ env.machine_type }}_run_quantization_torch_gpu_${{ env.matrix_folders }}_test_reports"
378
+ if: ${{ always() }}
379
+ uses: actions/upload-artifact@v4
380
+ with:
381
+ name: ${{ env.machine_type }}_run_quantization_torch_gpu_${{ env.matrix_folders }}_test_reports
382
+ path: /transformers/reports/${{ env.machine_type }}_run_quantization_torch_gpu_${{ matrix.folders }}_test_reports
383
+
384
+ update_run_status:
385
+ name: Update Check Run Status
386
+ needs: [get-sha, create_run, run_models_gpu, run_quantization_torch_gpu]
387
+ permissions:
388
+ statuses: write
389
+ if: ${{ always() && needs.create_run.result == 'success' }}
390
+ runs-on: ubuntu-22.04
391
+ env:
392
+ GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
393
+ GITHUB_RUN_URL: https://github.com/${{ github.repository }}/actions/runs/${{ github.run_id }}
394
+ STATUS_OK: ${{ contains(fromJSON('["skipped", "success"]'), needs.run_models_gpu.result) && contains(fromJSON('["skipped", "success"]'), needs.run_quantization_torch_gpu.result) }}
395
+ steps:
396
+ - name: Get `run_models_gpu` job status
397
+ run: |
398
+ echo "${{ needs.run_models_gpu.result }}"
399
+ echo "${{ needs.run_quantization_torch_gpu.result }}"
400
+ echo $STATUS_OK
401
+ if [ "$STATUS_OK" = "true" ]; then
402
+ echo "STATUS=success" >> $GITHUB_ENV
403
+ else
404
+ echo "STATUS=failure" >> $GITHUB_ENV
405
+ fi
406
+
407
+ - name: Update PR commit statuses
408
+ run: |
409
+ echo "${{ needs.run_models_gpu.result }}"
410
+ echo "${{ env.STATUS }}"
411
+ gh api \
412
+ --method POST \
413
+ -H "Accept: application/vnd.github+json" \
414
+ -H "X-GitHub-Api-Version: 2022-11-28" \
415
+ repos/${{ github.repository }}/statuses/${{ needs.get-sha.outputs.PR_HEAD_SHA }} \
416
+ -f "target_url=$GITHUB_RUN_URL" -f "state=${{ env.STATUS }}" -f "description=Slow CI job" -f "context=pytest/custom-tests"
docs/transformers/.github/workflows/self-nightly-past-ci-caller.yml ADDED
@@ -0,0 +1,99 @@
+ name: Self-hosted runner (nightly-past-ci-caller)
+
+ on:
+ schedule:
+ - cron: "17 2,14 * * *"
+ push:
+ branches:
+ - run_past_ci*
+
+ jobs:
+ get_number:
+ name: Get number
+ runs-on: ubuntu-22.04
+ outputs:
+ run_number: ${{ steps.get_number.outputs.run_number }}
+ steps:
+ - name: Get number
+ id: get_number
+ run: |
+ echo "${{ github.run_number }}"
+ echo "$(python3 -c 'print(int(${{ github.run_number }}) % 10)')"
+ echo "run_number=$(python3 -c 'print(int(${{ github.run_number }}) % 10)')" >> $GITHUB_OUTPUT
+
+ run_past_ci_tensorflow_2-11:
+ name: TensorFlow 2.11
+ needs: get_number
+ if: needs.get_number.outputs.run_number == 3 && (cancelled() != true) && ((github.event_name == 'push') && startsWith(github.ref_name, 'run_past_ci'))
+ uses: ./.github/workflows/self-past-caller.yml
+ with:
+ framework: tensorflow
+ version: "2.11"
+ sha: ${{ github.sha }}
+ secrets: inherit
+
+ run_past_ci_tensorflow_2-10:
+ name: TensorFlow 2.10
+ needs: get_number
+ if: needs.get_number.outputs.run_number == 4 && (cancelled() != true) && ((github.event_name == 'push') && startsWith(github.ref_name, 'run_past_ci'))
+ uses: ./.github/workflows/self-past-caller.yml
+ with:
+ framework: tensorflow
+ version: "2.10"
+ sha: ${{ github.sha }}
+ secrets: inherit
+
+ run_past_ci_tensorflow_2-9:
+ name: TensorFlow 2.9
+ needs: get_number
+ if: needs.get_number.outputs.run_number == 5 && (cancelled() != true) && ((github.event_name == 'push') && startsWith(github.ref_name, 'run_past_ci'))
+ uses: ./.github/workflows/self-past-caller.yml
+ with:
+ framework: tensorflow
+ version: "2.9"
+ sha: ${{ github.sha }}
+ secrets: inherit
+
+ run_past_ci_tensorflow_2-8:
+ name: TensorFlow 2.8
+ needs: get_number
+ if: needs.get_number.outputs.run_number == 6 && (cancelled() != true) && ((github.event_name == 'push') && startsWith(github.ref_name, 'run_past_ci'))
+ uses: ./.github/workflows/self-past-caller.yml
+ with:
+ framework: tensorflow
+ version: "2.8"
+ sha: ${{ github.sha }}
+ secrets: inherit
+
+ run_past_ci_tensorflow_2-7:
+ name: TensorFlow 2.7
+ needs: get_number
+ if: needs.get_number.outputs.run_number == 7 && (cancelled() != true) && ((github.event_name == 'push') && startsWith(github.ref_name, 'run_past_ci'))
+ uses: ./.github/workflows/self-past-caller.yml
+ with:
+ framework: tensorflow
+ version: "2.7"
+ sha: ${{ github.sha }}
+ secrets: inherit
+
+ run_past_ci_tensorflow_2-6:
+ name: TensorFlow 2.6
+ needs: get_number
+ if: needs.get_number.outputs.run_number == 8 && (cancelled() != true) && ((github.event_name == 'push') && startsWith(github.ref_name, 'run_past_ci'))
+ uses: ./.github/workflows/self-past-caller.yml
+ with:
+ framework: tensorflow
+ version: "2.6"
+ sha: ${{ github.sha }}
+ secrets: inherit
+
+ run_past_ci_tensorflow_2-5:
+ name: TensorFlow 2.5
+ needs: get_number
+ if: needs.get_number.outputs.run_number == 9 && (cancelled() != true) && ((github.event_name == 'push') && startsWith(github.ref_name, 'run_past_ci'))
+ uses: ./.github/workflows/self-past-caller.yml
+ with:
+ framework: tensorflow
+ version: "2.5"
+ sha: ${{ github.sha }}
+ secrets: inherit
docs/transformers/.github/workflows/self-past-caller.yml ADDED
@@ -0,0 +1,40 @@
+ name: Self-hosted runner (past-ci)
+
+
+ on:
+ workflow_call:
+ inputs:
+ framework:
+ required: true
+ type: string
+ version:
+ required: true
+ type: string
+ # Use this to control the commit to test against
+ sha:
+ default: 'main'
+ required: false
+ type: string
+
+ jobs:
+ model-ci:
+ name: Model CI
+ uses: ./.github/workflows/self-scheduled.yml
+ with:
+ job: run_models_gpu
+ slack_report_channel: "#transformers-ci-past-future"
+ runner: past-ci
+ docker: huggingface/transformers-${{ inputs.framework }}-past-${{ inputs.version }}-gpu
+ ci_event: Past CI - ${{ inputs.framework }}-${{ inputs.version }}
+ secrets: inherit
+
+ deepspeed-ci:
+ name: DeepSpeed CI
+ uses: ./.github/workflows/self-scheduled.yml
+ with:
+ job: run_torch_cuda_extensions_gpu
+ slack_report_channel: "#transformers-ci-past-future"
+ runner: past-ci
+ docker: huggingface/transformers-${{ inputs.framework }}-past-${{ inputs.version }}-gpu
+ ci_event: Past CI - ${{ inputs.framework }}-${{ inputs.version }}
+ secrets: inherit
docs/transformers/.github/workflows/self-push-amd-mi210-caller.yml ADDED
@@ -0,0 +1,25 @@
+ name: Self-hosted runner (AMD mi210 CI caller)
+
+ on:
+ #workflow_run:
+ # workflows: ["Self-hosted runner (push-caller)"]
+ # branches: ["main"]
+ # types: [completed]
+ push:
+ branches:
+ - run_amd_push_ci_caller*
+ paths:
+ - "src/**"
+ - "tests/**"
+ - ".github/**"
+ - "templates/**"
+ - "utils/**"
+
+ jobs:
+ run_amd_ci:
+ name: AMD mi210
+ if: (cancelled() != true) && ((github.event_name == 'workflow_run') || ((github.event_name == 'push') && startsWith(github.ref_name, 'run_amd_push_ci_caller')))
+ uses: ./.github/workflows/self-push-amd.yml
+ with:
+ gpu_flavor: mi210
+ secrets: inherit
docs/transformers/.github/workflows/self-push-amd-mi300-caller.yml ADDED
@@ -0,0 +1,25 @@
+ name: Self-hosted runner (AMD mi300 CI caller)
+
+ on:
+ #workflow_run:
+ # workflows: ["Self-hosted runner (push-caller)"]
+ # branches: ["main"]
+ # types: [completed]
+ push:
+ branches:
+ - run_amd_push_ci_caller*
+ paths:
+ - "src/**"
+ - "tests/**"
+ - ".github/**"
+ - "templates/**"
+ - "utils/**"
+
+ jobs:
+ run_amd_ci:
+ name: AMD mi300
+ if: (cancelled() != true) && ((github.event_name == 'workflow_run') || ((github.event_name == 'push') && (startsWith(github.ref_name, 'run_amd_push_ci_caller') || startsWith(github.ref_name, 'mi300-ci'))))
+ uses: ./.github/workflows/self-push-amd.yml
+ with:
+ gpu_flavor: mi300
+ secrets: inherit
docs/transformers/.github/workflows/self-push-amd.yml ADDED
@@ -0,0 +1,334 @@
1
+ name: Self-hosted runner AMD GPU (push)
2
+
3
+ on:
4
+ workflow_call:
5
+ inputs:
6
+ gpu_flavor:
7
+ required: true
8
+ type: string
9
+
10
+ env:
11
+ HF_HOME: /mnt/cache
12
+ TRANSFORMERS_IS_CI: yes
13
+ OMP_NUM_THREADS: 8
14
+ MKL_NUM_THREADS: 8
15
+ PYTEST_TIMEOUT: 60
16
+ TF_FORCE_GPU_ALLOW_GROWTH: true
17
+ HF_HUB_READ_TOKEN: ${{ secrets.HF_HUB_READ_TOKEN }}
18
+
19
+ jobs:
20
+ check_runner_status:
21
+ name: Check Runner Status
22
+ runs-on: ubuntu-22.04
23
+ steps:
24
+ - name: Checkout transformers
25
+ uses: actions/checkout@v4
26
+ with:
27
+ fetch-depth: 2
28
+
29
+ - name: Check Runner Status
30
+ run: python utils/check_self_hosted_runner.py --target_runners amd-mi210-single-gpu-ci-runner-docker --token ${{ secrets.ACCESS_REPO_INFO_TOKEN }}
31
+
32
+ check_runners:
33
+ name: Check Runners
34
+ needs: check_runner_status
35
+ strategy:
36
+ matrix:
37
+ machine_type: [single-gpu, multi-gpu]
38
+ runs-on: [self-hosted, amd-gpu, '${{ matrix.machine_type }}', '${{ inputs.gpu_flavor }}']
39
+ container:
40
+ image: huggingface/transformers-pytorch-amd-gpu-push-ci # <--- We test only for PyTorch for now
41
+ options: --device /dev/kfd --device /dev/dri --env ROCR_VISIBLE_DEVICES --shm-size "16gb" --ipc host -v /mnt/cache/.cache/huggingface:/mnt/cache/
42
+ steps:
43
+ - name: ROCM-SMI
44
+ run: |
45
+ rocm-smi
46
+ - name: ROCM-INFO
47
+ run: |
48
+ rocminfo | grep "Agent" -A 14
49
+ - name: Show ROCR environment
50
+ run: |
51
+ echo "ROCR: $ROCR_VISIBLE_DEVICES"
52
+
53
+ setup_gpu:
54
+ name: Setup
55
+ needs: check_runners
56
+ strategy:
57
+ matrix:
58
+ machine_type: [single-gpu, multi-gpu]
59
+ runs-on: [self-hosted, amd-gpu, '${{ matrix.machine_type }}', '${{ inputs.gpu_flavor }}']
60
+ container:
61
+ image: huggingface/transformers-pytorch-amd-gpu-push-ci # <--- We test only for PyTorch for now
62
+ options: --device /dev/kfd --device /dev/dri --env ROCR_VISIBLE_DEVICES --shm-size "16gb" --ipc host -v /mnt/cache/.cache/huggingface:/mnt/cache/
63
+ outputs:
64
+ matrix: ${{ steps.set-matrix.outputs.matrix }}
65
+ test_map: ${{ steps.set-matrix.outputs.test_map }}
66
+ env:
67
+ # `CI_BRANCH_PUSH`: The branch name from the push event
68
+ # `CI_BRANCH_WORKFLOW_RUN`: The name of the branch on which this workflow is triggered by `workflow_run` event
69
+ # `CI_SHA_PUSH`: The commit SHA from the push event
70
+ # `CI_SHA_WORKFLOW_RUN`: The commit SHA that triggers this workflow by `workflow_run` event
71
+ CI_BRANCH_PUSH: ${{ github.event.ref }}
72
+ CI_BRANCH_WORKFLOW_RUN: ${{ github.event.workflow_run.head_branch }}
73
+ CI_SHA_PUSH: ${{ github.event.head_commit.id }}
74
+ CI_SHA_WORKFLOW_RUN: ${{ github.event.workflow_run.head_sha }}
75
+ steps:
76
+ # Necessary to get the correct branch name and commit SHA for `workflow_run` event
77
+ # We also take into account the `push` event (we might want to test some changes in a branch)
78
+ - name: Prepare custom environment variables
79
+ shell: bash
80
+ # `CI_BRANCH`: The non-empty branch name from the above two (one and only one of them is empty)
81
+ # `CI_SHA`: The non-empty commit SHA from the above two (one and only one of them is empty)
82
+ run: |
83
+ CI_BRANCH_PUSH=${CI_BRANCH_PUSH/'refs/heads/'/''}
84
+ echo $CI_BRANCH_PUSH
85
+ echo $CI_BRANCH_WORKFLOW_RUN
86
+ echo $CI_SHA_PUSH
87
+ echo $CI_SHA_WORKFLOW_RUN
88
+ [[ ! -z "$CI_BRANCH_PUSH" ]] && echo "CI_BRANCH=$CI_BRANCH_PUSH" >> $GITHUB_ENV || echo "CI_BRANCH=$CI_BRANCH_WORKFLOW_RUN" >> $GITHUB_ENV
89
+ [[ ! -z "$CI_SHA_PUSH" ]] && echo "CI_SHA=$CI_SHA_PUSH" >> $GITHUB_ENV || echo "CI_SHA=$CI_SHA_WORKFLOW_RUN" >> $GITHUB_ENV
90
+
91
+ - name: print environment variables
92
+ run: |
93
+ echo "env.CI_BRANCH = ${{ env.CI_BRANCH }}"
94
+ echo "env.CI_SHA = ${{ env.CI_SHA }}"
95
+
96
+ - name: Update clone using environment variables
97
+ working-directory: /transformers
98
+ run: |
99
+ echo "original branch = $(git branch --show-current)"
100
+ git fetch && git checkout ${{ env.CI_BRANCH }}
101
+ echo "updated branch = $(git branch --show-current)"
102
+ git checkout ${{ env.CI_SHA }}
103
+ echo "log = $(git log -n 1)"
104
+
105
+ - name: Cleanup
106
+ working-directory: /transformers
107
+ run: |
108
+ rm -rf tests/__pycache__
109
+ rm -rf tests/models/__pycache__
110
+ rm -rf reports
111
+
112
+ - name: Show installed libraries and their versions
113
+ working-directory: /transformers
114
+ run: pip freeze
115
+
116
+ - name: Fetch the tests to run
117
+ working-directory: /transformers
118
+ # TODO: add `git-python` in the docker images
119
+ run: |
120
+ pip install --upgrade git-python
121
+ python3 utils/tests_fetcher.py --diff_with_last_commit | tee test_preparation.txt
122
+
123
+ - name: Report fetched tests
124
+ uses: actions/upload-artifact@v4
125
+ with:
126
+ name: test_fetched
127
+ path: /transformers/test_preparation.txt
128
+
129
+ - id: set-matrix
130
+ name: Organize tests into models
131
+ working-directory: /transformers
132
+ # The `keys` is used as GitHub actions matrix for jobs, i.e. `models/bert`, `tokenization`, `pipeline`, etc.
133
+ # The `test_map` is used to get the actual identified test files under each key.
134
+ # If no test to run (so no `test_map.json` file), create a dummy map (empty matrix will fail)
135
+ run: |
136
+ if [ -f test_map.json ]; then
137
+ keys=$(python3 -c 'import json; fp = open("test_map.json"); test_map = json.load(fp); fp.close(); d = list(test_map.keys()); print(d)')
138
+ test_map=$(python3 -c 'import json; fp = open("test_map.json"); test_map = json.load(fp); fp.close(); print(test_map)')
139
+ else
140
+ keys=$(python3 -c 'keys = ["dummy"]; print(keys)')
141
+ test_map=$(python3 -c 'test_map = {"dummy": []}; print(test_map)')
142
+ fi
143
+ echo $keys
144
+ echo $test_map
145
+ echo "matrix=$keys" >> $GITHUB_OUTPUT
146
+ echo "test_map=$test_map" >> $GITHUB_OUTPUT
147
+
148
+ run_models_gpu:
149
+ name: Model tests
150
+ needs: setup_gpu
151
+ # `dummy` means there is no test to run
152
+ if: contains(fromJson(needs.setup_gpu.outputs.matrix), 'dummy') != true
153
+ strategy:
154
+ fail-fast: false
155
+ matrix:
156
+ folders: ${{ fromJson(needs.setup_gpu.outputs.matrix) }}
157
+ machine_type: [single-gpu, multi-gpu]
158
+ runs-on: [self-hosted, amd-gpu, '${{ matrix.machine_type }}', '${{ inputs.gpu_flavor }}']
159
+ container:
160
+ image: huggingface/transformers-pytorch-amd-gpu-push-ci # <--- We test only for PyTorch for now
161
+ options: --device /dev/kfd --device /dev/dri --env ROCR_VISIBLE_DEVICES --shm-size "16gb" --ipc host -v /mnt/cache/.cache/huggingface:/mnt/cache/
162
+ env:
163
+ # For the meaning of these environment variables, see the job `Setup`
164
+ CI_BRANCH_PUSH: ${{ github.event.ref }}
165
+ CI_BRANCH_WORKFLOW_RUN: ${{ github.event.workflow_run.head_branch }}
166
+ CI_SHA_PUSH: ${{ github.event.head_commit.id }}
167
+ CI_SHA_WORKFLOW_RUN: ${{ github.event.workflow_run.head_sha }}
168
+ steps:
169
+ # Necessary to get the correct branch name and commit SHA for `workflow_run` event
170
+ # We also take into account the `push` event (we might want to test some changes in a branch)
171
+ - name: Prepare custom environment variables
172
+ shell: bash
173
+ # For the meaning of these environment variables, see the job `Setup`
174
+ run: |
175
+ CI_BRANCH_PUSH=${CI_BRANCH_PUSH/'refs/heads/'/''}
176
+ echo $CI_BRANCH_PUSH
177
+ echo $CI_BRANCH_WORKFLOW_RUN
178
+ echo $CI_SHA_PUSH
179
+ echo $CI_SHA_WORKFLOW_RUN
180
+ [[ ! -z "$CI_BRANCH_PUSH" ]] && echo "CI_BRANCH=$CI_BRANCH_PUSH" >> $GITHUB_ENV || echo "CI_BRANCH=$CI_BRANCH_WORKFLOW_RUN" >> $GITHUB_ENV
181
+ [[ ! -z "$CI_SHA_PUSH" ]] && echo "CI_SHA=$CI_SHA_PUSH" >> $GITHUB_ENV || echo "CI_SHA=$CI_SHA_WORKFLOW_RUN" >> $GITHUB_ENV
182
+
183
+ - name: print environment variables
184
+ run: |
185
+ echo "env.CI_BRANCH = ${{ env.CI_BRANCH }}"
186
+ echo "env.CI_SHA = ${{ env.CI_SHA }}"
187
+
188
+ - name: Update clone using environment variables
189
+ working-directory: /transformers
190
+ run: |
191
+ echo "original branch = $(git branch --show-current)"
192
+ git fetch && git checkout ${{ env.CI_BRANCH }}
193
+ echo "updated branch = $(git branch --show-current)"
194
+ git checkout ${{ env.CI_SHA }}
195
+ echo "log = $(git log -n 1)"
196
+
197
+ - name: Reinstall transformers in edit mode (remove the one installed during docker image build)
198
+ working-directory: /transformers
199
+ run: python3 -m pip uninstall -y transformers && python3 -m pip install -e .
200
+
201
+ - name: Echo folder ${{ matrix.folders }}
202
+ shell: bash
203
+ # For folders like `models/bert`, set an env. var. (`matrix_folders`) to `models_bert`, which will be used to
204
+ # set the artifact folder names (because the character `/` is not allowed).
205
+ run: |
206
+ echo "${{ matrix.folders }}"
207
+ echo "${{ fromJson(needs.setup_gpu.outputs.test_map)[matrix.folders] }}"
208
+ matrix_folders=${{ matrix.folders }}
209
+ matrix_folders=${matrix_folders/'models/'/'models_'}
210
+ echo "$matrix_folders"
211
+ echo "matrix_folders=$matrix_folders" >> $GITHUB_ENV
212
+
213
+ - name: ROCM-SMI
214
+ run: |
215
+ rocm-smi
216
+ - name: ROCM-INFO
217
+ run: |
218
+ rocminfo | grep "Agent" -A 14
219
+ - name: Show ROCR environment
220
+ run: |
221
+ echo "ROCR: $ROCR_VISIBLE_DEVICES"
222
+
223
+ - name: Environment
224
+ working-directory: /transformers
225
+ run: |
226
+ python3 utils/print_env.py
227
+
228
+ - name: Show installed libraries and their versions
229
+ working-directory: /transformers
230
+ run: pip freeze
231
+
232
+ - name: Run all non-slow selected tests on GPU
233
+ working-directory: /transformers
234
+ run: |
235
+ python3 -m pytest -n 2 --dist=loadfile -v --make-reports=${{ matrix.machine_type }}_run_models_gpu_${{ matrix.folders }}_test_reports ${{ fromJson(needs.setup_gpu.outputs.test_map)[matrix.folders] }} -m "not not_device_test"
236
+
237
+ - name: Failure short reports
238
+ if: ${{ failure() }}
239
+ continue-on-error: true
240
+ run: cat /transformers/reports/${{ matrix.machine_type }}_run_models_gpu_${{ matrix.folders }}_test_reports/failures_short.txt
241
+
242
+ - name: "Test suite reports artifacts: ${{ matrix.machine_type }}_run_models_gpu_${{ env.matrix_folders }}_test_reports"
243
+ if: ${{ always() }}
244
+ uses: actions/upload-artifact@v4
245
+ with:
246
+ name: ${{ matrix.machine_type }}_run_models_gpu_${{ env.matrix_folders }}_test_reports
247
+ path: /transformers/reports/${{ matrix.machine_type }}_run_models_gpu_${{ matrix.folders }}_test_reports
248
+
249
+ send_results:
250
+ name: Send results to webhook
251
+ runs-on: ubuntu-22.04
252
+ if: always()
253
+ needs: [
254
+ check_runner_status,
255
+ check_runners,
256
+ setup_gpu,
257
+ run_models_gpu,
258
+ # run_tests_torch_cuda_extensions_single_gpu,
259
+ # run_tests_torch_cuda_extensions_multi_gpu
260
+ ]
261
+ env:
262
+ # For the meaning of these environment variables, see the job `Setup`
263
+ CI_BRANCH_PUSH: ${{ github.event.ref }}
264
+ CI_BRANCH_WORKFLOW_RUN: ${{ github.event.workflow_run.head_branch }}
265
+ CI_SHA_PUSH: ${{ github.event.head_commit.id }}
266
+ CI_SHA_WORKFLOW_RUN: ${{ github.event.workflow_run.head_sha }}
267
+ steps:
268
+ - name: Preliminary job status
269
+ shell: bash
270
+ # For the meaning of these environment variables, see the job `Setup`
271
+ run: |
272
+ echo "Runner availability: ${{ needs.check_runner_status.result }}"
273
+ echo "Setup status: ${{ needs.setup_gpu.result }}"
274
+ echo "Runner status: ${{ needs.check_runners.result }}"
275
+
276
+ # Necessary to get the correct branch name and commit SHA for `workflow_run` event
277
+ # We also take into account the `push` event (we might want to test some changes in a branch)
278
+ - name: Prepare custom environment variables
279
+ shell: bash
280
+ # For the meaning of these environment variables, see the job `Setup`
281
+ run: |
282
+ CI_BRANCH_PUSH=${CI_BRANCH_PUSH/'refs/heads/'/''}
283
+ echo $CI_BRANCH_PUSH
284
+ echo $CI_BRANCH_WORKFLOW_RUN
285
+ echo $CI_SHA_PUSH
286
+ echo $CI_SHA_WORKFLOW_RUN
287
+ [[ ! -z "$CI_BRANCH_PUSH" ]] && echo "CI_BRANCH=$CI_BRANCH_PUSH" >> $GITHUB_ENV || echo "CI_BRANCH=$CI_BRANCH_WORKFLOW_RUN" >> $GITHUB_ENV
288
+ [[ ! -z "$CI_SHA_PUSH" ]] && echo "CI_SHA=$CI_SHA_PUSH" >> $GITHUB_ENV || echo "CI_SHA=$CI_SHA_WORKFLOW_RUN" >> $GITHUB_ENV
289
+
290
+ - name: print environment variables
291
+ run: |
292
+ echo "env.CI_BRANCH = ${{ env.CI_BRANCH }}"
293
+ echo "env.CI_SHA = ${{ env.CI_SHA }}"
294
+
295
+ - uses: actions/checkout@v4
296
+ # To avoid failure when multiple commits are merged into `main` in a short period of time.
297
+ # Checking out to an old commit beyond the fetch depth will get an error `fatal: reference is not a tree: ...
298
+ # (Only required for `workflow_run` event, where we get the latest HEAD on `main` instead of the event commit)
299
+ with:
300
+ fetch-depth: 20
301
+
302
+ - name: Update clone using environment variables
303
+ run: |
304
+ echo "original branch = $(git branch --show-current)"
305
+ git fetch && git checkout ${{ env.CI_BRANCH }}
306
+ echo "updated branch = $(git branch --show-current)"
307
+ git checkout ${{ env.CI_SHA }}
308
+ echo "log = $(git log -n 1)"
309
+
310
+ - uses: actions/download-artifact@v4
311
+ - name: Send message to Slack
312
+ env:
313
+ CI_SLACK_BOT_TOKEN: ${{ secrets.CI_SLACK_BOT_TOKEN }}
314
+ CI_SLACK_CHANNEL_ID: ${{ secrets.CI_SLACK_CHANNEL_ID }}
315
+ CI_SLACK_CHANNEL_ID_DAILY: ${{ secrets.CI_SLACK_CHANNEL_ID_DAILY }}
316
+ CI_SLACK_CHANNEL_ID_AMD: ${{ secrets.CI_SLACK_CHANNEL_ID_AMD }}
317
+ CI_SLACK_CHANNEL_DUMMY_TESTS: ${{ secrets.CI_SLACK_CHANNEL_DUMMY_TESTS }}
318
+ CI_SLACK_REPORT_CHANNEL_ID: ${{ secrets.CI_SLACK_CHANNEL_ID_AMD }}
319
+ ACCESS_REPO_INFO_TOKEN: ${{ secrets.ACCESS_REPO_INFO_TOKEN }}
320
+ CI_EVENT: Push CI (AMD) - ${{ inputs.gpu_flavor }}
321
+ CI_TITLE_PUSH: ${{ github.event.head_commit.message }}
322
+ CI_TITLE_WORKFLOW_RUN: ${{ github.event.workflow_run.head_commit.message }}
323
+ CI_SHA: ${{ env.CI_SHA }}
324
+ RUNNER_STATUS: ${{ needs.check_runner_status.result }}
325
+ RUNNER_ENV_STATUS: ${{ needs.check_runners.result }}
326
+ SETUP_STATUS: ${{ needs.setup_gpu.result }}
327
+
328
+ # We pass `needs.setup_gpu.outputs.matrix` as the argument. A processing in `notification_service.py` to change
329
+ # `models/bert` to `models_bert` is required, as the artifact names use `_` instead of `/`.
330
+ run: |
331
+ pip install huggingface_hub
332
+ pip install slack_sdk
333
+ pip show slack_sdk
334
+ python utils/notification_service.py "${{ needs.setup_gpu.outputs.matrix }}"
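The "Organize tests into models" step above turns the `test_map.json` produced by `utils/tests_fetcher.py` into two job outputs: the matrix keys and the per-key test files, falling back to a `dummy` entry when nothing needs to run (an empty matrix would fail). A minimal Python sketch of that step, assuming a `test_map.json` in the working directory:

```python
import json
import os

# Sketch of the `set-matrix` step above: read test_map.json if it exists,
# otherwise fall back to a dummy map so the GitHub Actions matrix is never empty.
if os.path.isfile("test_map.json"):
    with open("test_map.json") as fp:
        test_map = json.load(fp)
else:
    test_map = {"dummy": []}

keys = list(test_map.keys())  # becomes the `matrix` output (e.g. "models/bert")
print(keys)
print(test_map)               # becomes the `test_map` output
# Downstream jobs skip entirely when the matrix contains only "dummy".
```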
docs/transformers/.github/workflows/self-push.yml ADDED
@@ -0,0 +1,652 @@
1
+ name: Self-hosted runner (push)
2
+
3
+ on:
4
+ workflow_run:
5
+ workflows: ["Self-hosted runner (push-caller)"]
6
+ branches: ["main"]
7
+ types: [completed]
8
+ push:
9
+ branches:
10
+ - ci_*
11
+ - ci-*
12
+ paths:
13
+ - "src/**"
14
+ - "tests/**"
15
+ - ".github/**"
16
+ - "templates/**"
17
+ - "utils/**"
18
+ repository_dispatch:
19
+
20
+ env:
21
+ HF_HOME: /mnt/cache
22
+ TRANSFORMERS_IS_CI: yes
23
+ OMP_NUM_THREADS: 8
24
+ MKL_NUM_THREADS: 8
25
+ PYTEST_TIMEOUT: 60
26
+ TF_FORCE_GPU_ALLOW_GROWTH: true
27
+ CUDA_VISIBLE_DEVICES: 0,1
28
+
29
+ jobs:
30
+ setup:
31
+ name: Setup
32
+ strategy:
33
+ matrix:
34
+ machine_type: [aws-g4dn-2xlarge-cache, aws-g4dn-12xlarge-cache]
35
+ runs-on:
36
+ group: '${{ matrix.machine_type }}'
37
+ container:
38
+ image: huggingface/transformers-all-latest-gpu-push-ci
39
+ options: --gpus 0 --shm-size "16gb" --ipc host -v /mnt/cache/.cache/huggingface:/mnt/cache/
40
+ outputs:
41
+ matrix: ${{ steps.set-matrix.outputs.matrix }}
42
+ test_map: ${{ steps.set-matrix.outputs.test_map }}
43
+ env:
44
+ # `CI_BRANCH_PUSH`: The branch name from the push event
45
+ # `CI_BRANCH_WORKFLOW_RUN`: The name of the branch on which this workflow is triggered by `workflow_run` event
46
+ # `CI_SHA_PUSH`: The commit SHA from the push event
47
+ # `CI_SHA_WORKFLOW_RUN`: The commit SHA that triggers this workflow by `workflow_run` event
48
+ CI_BRANCH_PUSH: ${{ github.event.ref }}
49
+ CI_BRANCH_WORKFLOW_RUN: ${{ github.event.workflow_run.head_branch }}
50
+ CI_SHA_PUSH: ${{ github.event.head_commit.id }}
51
+ CI_SHA_WORKFLOW_RUN: ${{ github.event.workflow_run.head_sha }}
52
+ steps:
53
+ # Necessary to get the correct branch name and commit SHA for `workflow_run` event
54
+ # We also take into account the `push` event (we might want to test some changes in a branch)
55
+ - name: Prepare custom environment variables
56
+ shell: bash
57
+ # `CI_BRANCH`: The non-empty branch name from the above two (one and only one of them is empty)
58
+ # `CI_SHA`: The non-empty commit SHA from the above two (one and only one of them is empty)
59
+ run: |
60
+ CI_BRANCH_PUSH=${CI_BRANCH_PUSH/'refs/heads/'/''}
61
+ echo $CI_BRANCH_PUSH
62
+ echo $CI_BRANCH_WORKFLOW_RUN
63
+ echo $CI_SHA_PUSH
64
+ echo $CI_SHA_WORKFLOW_RUN
65
+ [[ ! -z "$CI_BRANCH_PUSH" ]] && echo "CI_BRANCH=$CI_BRANCH_PUSH" >> $GITHUB_ENV || echo "CI_BRANCH=$CI_BRANCH_WORKFLOW_RUN" >> $GITHUB_ENV
66
+ [[ ! -z "$CI_SHA_PUSH" ]] && echo "CI_SHA=$CI_SHA_PUSH" >> $GITHUB_ENV || echo "CI_SHA=$CI_SHA_WORKFLOW_RUN" >> $GITHUB_ENV
67
+
68
+ - name: print environment variables
69
+ run: |
70
+ echo "env.CI_BRANCH = ${{ env.CI_BRANCH }}"
71
+ echo "env.CI_SHA = ${{ env.CI_SHA }}"
72
+
73
+ - name: Update clone using environment variables
74
+ working-directory: /transformers
75
+ run: |
76
+ echo "original branch = $(git branch --show-current)"
77
+ git fetch && git checkout ${{ env.CI_BRANCH }}
78
+ echo "updated branch = $(git branch --show-current)"
79
+ git checkout ${{ env.CI_SHA }}
80
+ echo "log = $(git log -n 1)"
81
+
82
+ - name: Cleanup
83
+ working-directory: /transformers
84
+ run: |
85
+ rm -rf tests/__pycache__
86
+ rm -rf tests/models/__pycache__
87
+ rm -rf reports
88
+
89
+ - name: Show installed libraries and their versions
90
+ working-directory: /transformers
91
+ run: pip freeze
92
+
93
+ - name: Fetch the tests to run
94
+ working-directory: /transformers
95
+ # TODO: add `git-python` in the docker images
96
+ run: |
97
+ pip install --upgrade git-python
98
+ python3 utils/tests_fetcher.py --diff_with_last_commit | tee test_preparation.txt
99
+
100
+ - name: Report fetched tests
101
+ uses: actions/upload-artifact@v4
102
+ with:
103
+ name: test_fetched
104
+ path: /transformers/test_preparation.txt
105
+
106
+ - id: set-matrix
107
+ name: Organize tests into models
108
+ working-directory: /transformers
109
+ # The `keys` is used as GitHub actions matrix for jobs, i.e. `models/bert`, `tokenization`, `pipeline`, etc.
110
+ # The `test_map` is used to get the actual identified test files under each key.
111
+ # If no test to run (so no `test_map.json` file), create a dummy map (empty matrix will fail)
112
+ run: |
113
+ if [ -f test_map.json ]; then
114
+ keys=$(python3 -c 'import json; fp = open("test_map.json"); test_map = json.load(fp); fp.close(); d = list(test_map.keys()); print(d)')
115
+ test_map=$(python3 -c 'import json; fp = open("test_map.json"); test_map = json.load(fp); fp.close(); print(test_map)')
116
+ else
117
+ keys=$(python3 -c 'keys = ["dummy"]; print(keys)')
118
+ test_map=$(python3 -c 'test_map = {"dummy": []}; print(test_map)')
119
+ fi
120
+ echo $keys
121
+ echo $test_map
122
+ echo "matrix=$keys" >> $GITHUB_OUTPUT
123
+ echo "test_map=$test_map" >> $GITHUB_OUTPUT
124
+
125
+ run_tests_single_gpu:
126
+ name: Model tests
127
+ needs: setup
128
+ # `dummy` means there is no test to run
129
+ if: contains(fromJson(needs.setup.outputs.matrix), 'dummy') != true
130
+ strategy:
131
+ fail-fast: false
132
+ matrix:
133
+ folders: ${{ fromJson(needs.setup.outputs.matrix) }}
134
+ machine_type: [aws-g4dn-2xlarge-cache]
135
+ runs-on:
136
+ group: '${{ matrix.machine_type }}'
137
+ container:
138
+ image: huggingface/transformers-all-latest-gpu-push-ci
139
+ options: --gpus 0 --shm-size "16gb" --ipc host -v /mnt/cache/.cache/huggingface:/mnt/cache/
140
+ env:
141
+ # For the meaning of these environment variables, see the job `Setup`
142
+ CI_BRANCH_PUSH: ${{ github.event.ref }}
143
+ CI_BRANCH_WORKFLOW_RUN: ${{ github.event.workflow_run.head_branch }}
144
+ CI_SHA_PUSH: ${{ github.event.head_commit.id }}
145
+ CI_SHA_WORKFLOW_RUN: ${{ github.event.workflow_run.head_sha }}
146
+ steps:
147
+ # Necessary to get the correct branch name and commit SHA for `workflow_run` event
148
+ # We also take into account the `push` event (we might want to test some changes in a branch)
149
+ - name: Prepare custom environment variables
150
+ shell: bash
151
+ # For the meaning of these environment variables, see the job `Setup`
152
+ run: |
153
+ CI_BRANCH_PUSH=${CI_BRANCH_PUSH/'refs/heads/'/''}
154
+ echo $CI_BRANCH_PUSH
155
+ echo $CI_BRANCH_WORKFLOW_RUN
156
+ echo $CI_SHA_PUSH
157
+ echo $CI_SHA_WORKFLOW_RUN
158
+ [[ ! -z "$CI_BRANCH_PUSH" ]] && echo "CI_BRANCH=$CI_BRANCH_PUSH" >> $GITHUB_ENV || echo "CI_BRANCH=$CI_BRANCH_WORKFLOW_RUN" >> $GITHUB_ENV
159
+ [[ ! -z "$CI_SHA_PUSH" ]] && echo "CI_SHA=$CI_SHA_PUSH" >> $GITHUB_ENV || echo "CI_SHA=$CI_SHA_WORKFLOW_RUN" >> $GITHUB_ENV
160
+
161
+ - name: print environment variables
162
+ run: |
163
+ echo "env.CI_BRANCH = ${{ env.CI_BRANCH }}"
164
+ echo "env.CI_SHA = ${{ env.CI_SHA }}"
165
+
166
+ - name: Set `machine_type` for report and artifact names
167
+ working-directory: /transformers
168
+ shell: bash
169
+ run: |
170
+ echo "${{ matrix.machine_type }}"
171
+
172
+ if [ "${{ matrix.machine_type }}" = "aws-g4dn-2xlarge-cache" ]; then
173
+ machine_type=single-gpu
174
+ elif [ "${{ matrix.machine_type }}" = "aws-g4dn-12xlarge-cache" ]; then
175
+ machine_type=multi-gpu
176
+ else
177
+ machine_type=${{ matrix.machine_type }}
178
+ fi
179
+
180
+ echo "$machine_type"
181
+ echo "machine_type=$machine_type" >> $GITHUB_ENV
182
+
183
+ - name: Update clone using environment variables
184
+ working-directory: /transformers
185
+ run: |
186
+ echo "original branch = $(git branch --show-current)"
187
+ git fetch && git checkout ${{ env.CI_BRANCH }}
188
+ echo "updated branch = $(git branch --show-current)"
189
+ git checkout ${{ env.CI_SHA }}
190
+ echo "log = $(git log -n 1)"
191
+
192
+ - name: Reinstall transformers in edit mode (remove the one installed during docker image build)
193
+ working-directory: /transformers
194
+ run: python3 -m pip uninstall -y transformers && python3 -m pip install -e .
195
+
196
+ - name: Echo folder ${{ matrix.folders }}
197
+ shell: bash
198
+ # For folders like `models/bert`, set an env. var. (`matrix_folders`) to `models_bert`, which will be used to
199
+ # set the artifact folder names (because the character `/` is not allowed).
200
+ run: |
201
+ echo "${{ matrix.folders }}"
202
+ echo "${{ fromJson(needs.setup.outputs.test_map)[matrix.folders] }}"
203
+ matrix_folders=${{ matrix.folders }}
204
+ matrix_folders=${matrix_folders/'models/'/'models_'}
205
+ echo "$matrix_folders"
206
+ echo "matrix_folders=$matrix_folders" >> $GITHUB_ENV
207
+
208
+ - name: NVIDIA-SMI
209
+ run: |
210
+ nvidia-smi
211
+
212
+ - name: Environment
213
+ working-directory: /transformers
214
+ run: |
215
+ python3 utils/print_env.py
216
+
217
+ - name: Show installed libraries and their versions
218
+ working-directory: /transformers
219
+ run: pip freeze
220
+
221
+ - name: Run all non-slow selected tests on GPU
222
+ working-directory: /transformers
223
+ run: |
224
+ python3 -m pytest -n 2 --dist=loadfile -v --make-reports=${{ env.machine_type }}_tests_gpu_${{ matrix.folders }} ${{ fromJson(needs.setup.outputs.test_map)[matrix.folders] }}
225
+
226
+ - name: Failure short reports
227
+ if: ${{ failure() }}
228
+ continue-on-error: true
229
+ run: cat /transformers/reports/${{ env.machine_type }}_tests_gpu_${{ matrix.folders }}/failures_short.txt
230
+
231
+ - name: "Test suite reports artifacts: ${{ env.machine_type }}_run_all_tests_gpu_${{ env.matrix_folders }}_test_reports"
232
+ if: ${{ always() }}
233
+ uses: actions/upload-artifact@v4
234
+ with:
235
+ name: ${{ env.machine_type }}_run_all_tests_gpu_${{ env.matrix_folders }}_test_reports
236
+ path: /transformers/reports/${{ env.machine_type }}_tests_gpu_${{ matrix.folders }}
237
+
238
+ run_tests_multi_gpu:
239
+ name: Model tests
240
+ needs: setup
241
+ # `dummy` means there is no test to run
242
+ if: contains(fromJson(needs.setup.outputs.matrix), 'dummy') != true
243
+ strategy:
244
+ fail-fast: false
245
+ matrix:
246
+ folders: ${{ fromJson(needs.setup.outputs.matrix) }}
247
+ machine_type: [aws-g4dn-12xlarge-cache]
248
+ runs-on:
249
+ group: '${{ matrix.machine_type }}'
250
+ container:
251
+ image: huggingface/transformers-all-latest-gpu-push-ci
252
+ options: --gpus all --shm-size "16gb" --ipc host -v /mnt/cache/.cache/huggingface:/mnt/cache/
253
+ env:
254
+ # For the meaning of these environment variables, see the job `Setup`
255
+ CI_BRANCH_PUSH: ${{ github.event.ref }}
256
+ CI_BRANCH_WORKFLOW_RUN: ${{ github.event.workflow_run.head_branch }}
257
+ CI_SHA_PUSH: ${{ github.event.head_commit.id }}
258
+ CI_SHA_WORKFLOW_RUN: ${{ github.event.workflow_run.head_sha }}
259
+ steps:
260
+ # Necessary to get the correct branch name and commit SHA for `workflow_run` event
261
+ # We also take into account the `push` event (we might want to test some changes in a branch)
262
+ - name: Prepare custom environment variables
263
+ shell: bash
264
+ # For the meaning of these environment variables, see the job `Setup`
265
+ run: |
266
+ CI_BRANCH_PUSH=${CI_BRANCH_PUSH/'refs/heads/'/''}
267
+ echo $CI_BRANCH_PUSH
268
+ echo $CI_BRANCH_WORKFLOW_RUN
269
+ echo $CI_SHA_PUSH
270
+ echo $CI_SHA_WORKFLOW_RUN
271
+ [[ ! -z "$CI_BRANCH_PUSH" ]] && echo "CI_BRANCH=$CI_BRANCH_PUSH" >> $GITHUB_ENV || echo "CI_BRANCH=$CI_BRANCH_WORKFLOW_RUN" >> $GITHUB_ENV
272
+ [[ ! -z "$CI_SHA_PUSH" ]] && echo "CI_SHA=$CI_SHA_PUSH" >> $GITHUB_ENV || echo "CI_SHA=$CI_SHA_WORKFLOW_RUN" >> $GITHUB_ENV
273
+
274
+ - name: print environment variables
275
+ run: |
276
+ echo "env.CI_BRANCH = ${{ env.CI_BRANCH }}"
277
+ echo "env.CI_SHA = ${{ env.CI_SHA }}"
278
+
279
+ - name: Set `machine_type` for report and artifact names
280
+ working-directory: /transformers
281
+ shell: bash
282
+ run: |
283
+ echo "${{ matrix.machine_type }}"
284
+
285
+ if [ "${{ matrix.machine_type }}" = "aws-g4dn-2xlarge-cache" ]; then
286
+ machine_type=single-gpu
287
+ elif [ "${{ matrix.machine_type }}" = "aws-g4dn-12xlarge-cache" ]; then
288
+ machine_type=multi-gpu
289
+ else
290
+ machine_type=${{ matrix.machine_type }}
291
+ fi
292
+
293
+ echo "$machine_type"
294
+ echo "machine_type=$machine_type" >> $GITHUB_ENV
295
+
296
+ - name: Update clone using environment variables
297
+ working-directory: /transformers
298
+ run: |
299
+ echo "original branch = $(git branch --show-current)"
300
+ git fetch && git checkout ${{ env.CI_BRANCH }}
301
+ echo "updated branch = $(git branch --show-current)"
302
+ git checkout ${{ env.CI_SHA }}
303
+ echo "log = $(git log -n 1)"
304
+
305
+ - name: Reinstall transformers in edit mode (remove the one installed during docker image build)
306
+ working-directory: /transformers
307
+ run: python3 -m pip uninstall -y transformers && python3 -m pip install -e .
308
+
309
+ - name: Echo folder ${{ matrix.folders }}
310
+ shell: bash
311
+ # For folders like `models/bert`, set an env. var. (`matrix_folders`) to `models_bert`, which will be used to
312
+ # set the artifact folder names (because the character `/` is not allowed).
313
+ run: |
314
+ echo "${{ matrix.folders }}"
315
+ echo "${{ fromJson(needs.setup.outputs.test_map)[matrix.folders] }}"
316
+ matrix_folders=${{ matrix.folders }}
317
+ matrix_folders=${matrix_folders/'models/'/'models_'}
318
+ echo "$matrix_folders"
319
+ echo "matrix_folders=$matrix_folders" >> $GITHUB_ENV
320
+
321
+ - name: NVIDIA-SMI
322
+ run: |
323
+ nvidia-smi
324
+
325
+ - name: Environment
326
+ working-directory: /transformers
327
+ run: |
328
+ python3 utils/print_env.py
329
+
330
+ - name: Show installed libraries and their versions
331
+ working-directory: /transformers
332
+ run: pip freeze
333
+
334
+ - name: Run all non-slow selected tests on GPU
335
+ env:
336
+ MKL_SERVICE_FORCE_INTEL: 1
337
+ working-directory: /transformers
338
+ run: |
339
+ python3 -m pytest -n 2 --dist=loadfile -v --make-reports=${{ env.machine_type }}_tests_gpu_${{ matrix.folders }} ${{ fromJson(needs.setup.outputs.test_map)[matrix.folders] }}
340
+
341
+ - name: Failure short reports
342
+ if: ${{ failure() }}
343
+ continue-on-error: true
344
+ run: cat /transformers/reports/${{ env.machine_type }}_tests_gpu_${{ matrix.folders }}/failures_short.txt
345
+
346
+ - name: "Test suite reports artifacts: ${{ env.machine_type }}_run_all_tests_gpu_${{ env.matrix_folders }}_test_reports"
347
+ if: ${{ always() }}
348
+ uses: actions/upload-artifact@v4
349
+ with:
350
+ name: ${{ env.machine_type }}_run_all_tests_gpu_${{ env.matrix_folders }}_test_reports
351
+ path: /transformers/reports/${{ env.machine_type }}_tests_gpu_${{ matrix.folders }}
352
+
353
+ run_tests_torch_cuda_extensions_single_gpu:
354
+ name: Torch CUDA extension tests
355
+ needs: setup
356
+ if: contains(fromJson(needs.setup.outputs.matrix), 'deepspeed') || contains(fromJson(needs.setup.outputs.matrix), 'extended')
357
+ strategy:
358
+ fail-fast: false
359
+ matrix:
360
+ machine_type: [aws-g4dn-2xlarge-cache]
361
+ runs-on:
362
+ group: '${{ matrix.machine_type }}'
363
+ container:
364
+ image: huggingface/transformers-pytorch-deepspeed-latest-gpu-push-ci
365
+ options: --gpus 0 --shm-size "16gb" --ipc host -v /mnt/cache/.cache/huggingface:/mnt/cache/
366
+ env:
367
+ # For the meaning of these environment variables, see the job `Setup`
368
+ CI_BRANCH_PUSH: ${{ github.event.ref }}
369
+ CI_BRANCH_WORKFLOW_RUN: ${{ github.event.workflow_run.head_branch }}
370
+ CI_SHA_PUSH: ${{ github.event.head_commit.id }}
371
+ CI_SHA_WORKFLOW_RUN: ${{ github.event.workflow_run.head_sha }}
372
+ steps:
373
+ # Necessary to get the correct branch name and commit SHA for `workflow_run` event
374
+ # We also take into account the `push` event (we might want to test some changes in a branch)
375
+ - name: Prepare custom environment variables
376
+ shell: bash
377
+ # For the meaning of these environment variables, see the job `Setup`
378
+ run: |
379
+ CI_BRANCH_PUSH=${CI_BRANCH_PUSH/'refs/heads/'/''}
380
+ echo $CI_BRANCH_PUSH
381
+ echo $CI_BRANCH_WORKFLOW_RUN
382
+ echo $CI_SHA_PUSH
383
+ echo $CI_SHA_WORKFLOW_RUN
384
+ [[ ! -z "$CI_BRANCH_PUSH" ]] && echo "CI_BRANCH=$CI_BRANCH_PUSH" >> $GITHUB_ENV || echo "CI_BRANCH=$CI_BRANCH_WORKFLOW_RUN" >> $GITHUB_ENV
385
+ [[ ! -z "$CI_SHA_PUSH" ]] && echo "CI_SHA=$CI_SHA_PUSH" >> $GITHUB_ENV || echo "CI_SHA=$CI_SHA_WORKFLOW_RUN" >> $GITHUB_ENV
386
+
387
+ - name: print environment variables
388
+ run: |
389
+ echo "env.CI_BRANCH = ${{ env.CI_BRANCH }}"
390
+ echo "env.CI_SHA = ${{ env.CI_SHA }}"
391
+
392
+ - name: Set `machine_type` for report and artifact names
393
+ working-directory: /workspace/transformers
394
+ shell: bash
395
+ run: |
396
+ echo "${{ matrix.machine_type }}"
397
+
398
+ if [ "${{ matrix.machine_type }}" = "aws-g4dn-2xlarge-cache" ]; then
399
+ machine_type=single-gpu
400
+ elif [ "${{ matrix.machine_type }}" = "aws-g4dn-12xlarge-cache" ]; then
401
+ machine_type=multi-gpu
402
+ else
403
+ machine_type=${{ matrix.machine_type }}
404
+ fi
405
+
406
+ echo "$machine_type"
407
+ echo "machine_type=$machine_type" >> $GITHUB_ENV
408
+
409
+ - name: Update clone using environment variables
410
+ working-directory: /workspace/transformers
411
+ run: |
412
+ echo "original branch = $(git branch --show-current)"
413
+ git fetch && git checkout ${{ env.CI_BRANCH }}
414
+ echo "updated branch = $(git branch --show-current)"
415
+ git checkout ${{ env.CI_SHA }}
416
+ echo "log = $(git log -n 1)"
417
+
418
+ - name: Reinstall transformers in edit mode (remove the one installed during docker image build)
419
+ working-directory: /workspace/transformers
420
+ run: python3 -m pip uninstall -y transformers && python3 -m pip install -e .
421
+
422
+ - name: Remove cached torch extensions
423
+ run: rm -rf /github/home/.cache/torch_extensions/
424
+
425
+ # To avoid unknown test failures
426
+ - name: Pre build DeepSpeed *again*
427
+ working-directory: /workspace
428
+ run: |
429
+ python3 -m pip uninstall -y deepspeed
430
+ DS_BUILD_CPU_ADAM=1 DS_BUILD_FUSED_ADAM=1 python3 -m pip install deepspeed --global-option="build_ext" --global-option="-j8" --no-cache -v --disable-pip-version-check
431
+
432
+ - name: NVIDIA-SMI
433
+ run: |
434
+ nvidia-smi
435
+
436
+ - name: Environment
437
+ working-directory: /workspace/transformers
438
+ run: |
439
+ python utils/print_env.py
440
+
441
+ - name: Show installed libraries and their versions
442
+ working-directory: /workspace/transformers
443
+ run: pip freeze
444
+
445
+ - name: Run all non-slow selected tests on GPU
446
+ working-directory: /workspace/transformers
447
+ # TODO: Here we pass all tests in the 2 folders for simplicity. It's better to pass only the identified tests.
448
+ run: |
449
+ python -m pytest -n 1 --dist=loadfile -v --make-reports=${{ env.machine_type }}_run_torch_cuda_extensions_gpu_test_reports tests/deepspeed tests/extended
450
+
451
+ - name: Failure short reports
452
+ if: ${{ failure() }}
453
+ continue-on-error: true
454
+ run: cat /workspace/transformers/reports/${{ env.machine_type }}_run_torch_cuda_extensions_gpu_test_reports/failures_short.txt
455
+
456
+ - name: "Test suite reports artifacts: ${{ env.machine_type }}_run_torch_cuda_extensions_gpu_test_reports"
457
+ if: ${{ always() }}
458
+ uses: actions/upload-artifact@v4
459
+ with:
460
+ name: ${{ env.machine_type }}_run_torch_cuda_extensions_gpu_test_reports
461
+ path: /workspace/transformers/reports/${{ env.machine_type }}_run_torch_cuda_extensions_gpu_test_reports
462
+
463
+ run_tests_torch_cuda_extensions_multi_gpu:
464
+ name: Torch CUDA extension tests
465
+ needs: setup
466
+ if: contains(fromJson(needs.setup.outputs.matrix), 'deepspeed') || contains(fromJson(needs.setup.outputs.matrix), 'extended')
467
+ strategy:
468
+ fail-fast: false
469
+ matrix:
470
+ machine_type: [aws-g4dn-12xlarge-cache]
471
+ runs-on:
472
+ group: '${{ matrix.machine_type }}'
473
+ container:
474
+ image: huggingface/transformers-pytorch-deepspeed-latest-gpu-push-ci
475
+ options: --gpus all --shm-size "16gb" --ipc host -v /mnt/cache/.cache/huggingface:/mnt/cache/
476
+ env:
477
+ # For the meaning of these environment variables, see the job `Setup`
478
+ CI_BRANCH_PUSH: ${{ github.event.ref }}
479
+ CI_BRANCH_WORKFLOW_RUN: ${{ github.event.workflow_run.head_branch }}
480
+ CI_SHA_PUSH: ${{ github.event.head_commit.id }}
481
+ CI_SHA_WORKFLOW_RUN: ${{ github.event.workflow_run.head_sha }}
482
+ steps:
483
+ # Necessary to get the correct branch name and commit SHA for `workflow_run` event
484
+ # We also take into account the `push` event (we might want to test some changes in a branch)
485
+ - name: Prepare custom environment variables
486
+ shell: bash
487
+ # For the meaning of these environment variables, see the job `Setup`
488
+ run: |
489
+ CI_BRANCH_PUSH=${CI_BRANCH_PUSH/'refs/heads/'/''}
490
+ echo $CI_BRANCH_PUSH
491
+ echo $CI_BRANCH_WORKFLOW_RUN
492
+ echo $CI_SHA_PUSH
493
+ echo $CI_SHA_WORKFLOW_RUN
494
+ [[ ! -z "$CI_BRANCH_PUSH" ]] && echo "CI_BRANCH=$CI_BRANCH_PUSH" >> $GITHUB_ENV || echo "CI_BRANCH=$CI_BRANCH_WORKFLOW_RUN" >> $GITHUB_ENV
495
+ [[ ! -z "$CI_SHA_PUSH" ]] && echo "CI_SHA=$CI_SHA_PUSH" >> $GITHUB_ENV || echo "CI_SHA=$CI_SHA_WORKFLOW_RUN" >> $GITHUB_ENV
496
+
497
+ - name: print environment variables
498
+ run: |
499
+ echo "env.CI_BRANCH = ${{ env.CI_BRANCH }}"
500
+ echo "env.CI_SHA = ${{ env.CI_SHA }}"
501
+
502
+ - name: Set `machine_type` for report and artifact names
503
+ working-directory: /workspace/transformers
504
+ shell: bash
505
+ run: |
506
+ echo "${{ matrix.machine_type }}"
507
+
508
+ if [ "${{ matrix.machine_type }}" = "aws-g4dn-2xlarge-cache" ]; then
509
+ machine_type=single-gpu
510
+ elif [ "${{ matrix.machine_type }}" = "aws-g4dn-12xlarge-cache" ]; then
511
+ machine_type=multi-gpu
512
+ else
513
+ machine_type=${{ matrix.machine_type }}
514
+ fi
515
+
516
+ echo "$machine_type"
517
+ echo "machine_type=$machine_type" >> $GITHUB_ENV
518
+
519
+ - name: Update clone using environment variables
520
+ working-directory: /workspace/transformers
521
+ run: |
522
+ echo "original branch = $(git branch --show-current)"
523
+ git fetch && git checkout ${{ env.CI_BRANCH }}
524
+ echo "updated branch = $(git branch --show-current)"
525
+ git checkout ${{ env.CI_SHA }}
526
+ echo "log = $(git log -n 1)"
527
+
528
+ - name: Reinstall transformers in edit mode (remove the one installed during docker image build)
529
+ working-directory: /workspace/transformers
530
+ run: python3 -m pip uninstall -y transformers && python3 -m pip install -e .
531
+
532
+ - name: Remove cached torch extensions
533
+ run: rm -rf /github/home/.cache/torch_extensions/
534
+
535
+ # To avoid unknown test failures
536
+ - name: Pre build DeepSpeed *again*
537
+ working-directory: /workspace
538
+ run: |
539
+ python3 -m pip uninstall -y deepspeed
540
+ DS_BUILD_CPU_ADAM=1 DS_BUILD_FUSED_ADAM=1 python3 -m pip install deepspeed --global-option="build_ext" --global-option="-j8" --no-cache -v --disable-pip-version-check
541
+
542
+ - name: NVIDIA-SMI
543
+ run: |
544
+ nvidia-smi
545
+
546
+ - name: Environment
547
+ working-directory: /workspace/transformers
548
+ run: |
549
+ python utils/print_env.py
550
+
551
+ - name: Show installed libraries and their versions
552
+ working-directory: /workspace/transformers
553
+ run: pip freeze
554
+
555
+ - name: Run all non-slow selected tests on GPU
556
+ working-directory: /workspace/transformers
557
+ # TODO: Here we pass all tests in the 2 folders for simplicity. It's better to pass only the identified tests.
558
+ run: |
559
+ python -m pytest -n 1 --dist=loadfile -v --make-reports=${{ env.machine_type }}_run_torch_cuda_extensions_gpu_test_reports tests/deepspeed tests/extended
560
+
561
+ - name: Failure short reports
562
+ if: ${{ failure() }}
563
+ continue-on-error: true
564
+ run: cat /workspace/transformers/reports/${{ env.machine_type }}_run_torch_cuda_extensions_gpu_test_reports/failures_short.txt
565
+
566
+ - name: "Test suite reports artifacts: ${{ env.machine_type }}_run_torch_cuda_extensions_gpu_test_reports"
567
+ if: ${{ always() }}
568
+ uses: actions/upload-artifact@v4
569
+ with:
570
+ name: ${{ env.machine_type }}_run_torch_cuda_extensions_gpu_test_reports
571
+ path: /workspace/transformers/reports/${{ env.machine_type }}_run_torch_cuda_extensions_gpu_test_reports
572
+
573
+ send_results:
574
+ name: Send results to webhook
575
+ runs-on: ubuntu-22.04
576
+ if: always()
577
+ needs: [
578
+ setup,
579
+ run_tests_single_gpu,
580
+ run_tests_multi_gpu,
581
+ run_tests_torch_cuda_extensions_single_gpu,
582
+ run_tests_torch_cuda_extensions_multi_gpu
583
+ ]
584
+ env:
585
+ # For the meaning of these environment variables, see the job `Setup`
586
+ CI_BRANCH_PUSH: ${{ github.event.ref }}
587
+ CI_BRANCH_WORKFLOW_RUN: ${{ github.event.workflow_run.head_branch }}
588
+ CI_SHA_PUSH: ${{ github.event.head_commit.id }}
589
+ CI_SHA_WORKFLOW_RUN: ${{ github.event.workflow_run.head_sha }}
590
+ steps:
591
+ - name: Preliminary job status
592
+ shell: bash
593
+ # For the meaning of these environment variables, see the job `Setup`
594
+ run: |
595
+ echo "Setup status: ${{ needs.setup.result }}"
596
+
597
+ # Necessary to get the correct branch name and commit SHA for `workflow_run` event
598
+ # We also take into account the `push` event (we might want to test some changes in a branch)
599
+ - name: Prepare custom environment variables
600
+ shell: bash
601
+ # For the meaning of these environment variables, see the job `Setup`
602
+ run: |
603
+ CI_BRANCH_PUSH=${CI_BRANCH_PUSH/'refs/heads/'/''}
604
+ echo $CI_BRANCH_PUSH
605
+ echo $CI_BRANCH_WORKFLOW_RUN
606
+ echo $CI_SHA_PUSH
607
+ echo $CI_SHA_WORKFLOW_RUN
608
+ [[ ! -z "$CI_BRANCH_PUSH" ]] && echo "CI_BRANCH=$CI_BRANCH_PUSH" >> $GITHUB_ENV || echo "CI_BRANCH=$CI_BRANCH_WORKFLOW_RUN" >> $GITHUB_ENV
609
+ [[ ! -z "$CI_SHA_PUSH" ]] && echo "CI_SHA=$CI_SHA_PUSH" >> $GITHUB_ENV || echo "CI_SHA=$CI_SHA_WORKFLOW_RUN" >> $GITHUB_ENV
610
+
611
+ - name: print environment variables
612
+ run: |
613
+ echo "env.CI_BRANCH = ${{ env.CI_BRANCH }}"
614
+ echo "env.CI_SHA = ${{ env.CI_SHA }}"
615
+
616
+ - uses: actions/checkout@v4
617
+ # To avoid failure when multiple commits are merged into `main` in a short period of time.
618
+ # Checking out to an old commit beyond the fetch depth will get an error `fatal: reference is not a tree: ...
619
+ # (Only required for `workflow_run` event, where we get the latest HEAD on `main` instead of the event commit)
620
+ with:
621
+ fetch-depth: 20
622
+
623
+ - name: Update clone using environment variables
624
+ run: |
625
+ echo "original branch = $(git branch --show-current)"
626
+ git fetch && git checkout ${{ env.CI_BRANCH }}
627
+ echo "updated branch = $(git branch --show-current)"
628
+ git checkout ${{ env.CI_SHA }}
629
+ echo "log = $(git log -n 1)"
630
+
631
+ - uses: actions/download-artifact@v4
632
+ - name: Send message to Slack
633
+ env:
634
+ CI_SLACK_BOT_TOKEN: ${{ secrets.CI_SLACK_BOT_TOKEN }}
635
+ CI_SLACK_CHANNEL_ID: ${{ secrets.CI_SLACK_CHANNEL_ID }}
636
+ CI_SLACK_CHANNEL_ID_DAILY: ${{ secrets.CI_SLACK_CHANNEL_ID_DAILY }}
637
+ CI_SLACK_CHANNEL_DUMMY_TESTS: ${{ secrets.CI_SLACK_CHANNEL_DUMMY_TESTS }}
638
+ CI_SLACK_REPORT_CHANNEL_ID: ${{ secrets.CI_SLACK_CHANNEL_ID }}
639
+ ACCESS_REPO_INFO_TOKEN: ${{ secrets.ACCESS_REPO_INFO_TOKEN }}
640
+ CI_EVENT: push
641
+ CI_TITLE_PUSH: ${{ github.event.head_commit.message }}
642
+ CI_TITLE_WORKFLOW_RUN: ${{ github.event.workflow_run.head_commit.message }}
643
+ CI_SHA: ${{ env.CI_SHA }}
644
+ SETUP_STATUS: ${{ needs.setup.result }}
645
+
646
+ # We pass `needs.setup.outputs.matrix` as the argument. A processing in `notification_service.py` to change
647
+ # `models/bert` to `models_bert` is required, as the artifact names use `_` instead of `/`.
648
+ run: |
649
+ pip install huggingface_hub
650
+ pip install slack_sdk
651
+ pip show slack_sdk
652
+ python utils/notification_service.py "${{ needs.setup.outputs.matrix }}"
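Two small conversions recur in the jobs above: the runner group name is normalized to `single-gpu` / `multi-gpu` for report names, and folder keys such as `models/bert` are rewritten to `models_bert` because `/` is not allowed in artifact names. A minimal Python sketch of both (the real steps are bash):

```python
# Sketch of the naming conventions used for report and artifact names above.
def normalize_machine_type(machine_type: str) -> str:
    if machine_type == "aws-g4dn-2xlarge-cache":
        return "single-gpu"
    if machine_type == "aws-g4dn-12xlarge-cache":
        return "multi-gpu"
    return machine_type

def artifact_folder(folder: str) -> str:
    # `/` is not allowed in artifact names, so `models/bert` -> `models_bert`.
    return folder.replace("models/", "models_")

name = (f"{normalize_machine_type('aws-g4dn-2xlarge-cache')}"
        f"_run_all_tests_gpu_{artifact_folder('models/bert')}_test_reports")
print(name)  # single-gpu_run_all_tests_gpu_models_bert_test_reports
```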
docs/transformers/.github/workflows/self-scheduled-amd-caller.yml ADDED
@@ -0,0 +1,14 @@
+ name: Self-hosted runner (AMD scheduled CI caller)
+
+ on:
+ schedule:
+ - cron: "17 2 * * *"
+
+ jobs:
+ run_scheduled_amd_ci:
+ name: Trigger Scheduled AMD CI
+ runs-on: ubuntu-22.04
+ if: ${{ always() }}
+ steps:
+ - name: Trigger scheduled AMD CI via workflow_run
+ run: echo "Trigger scheduled AMD CI via workflow_run"
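The scheduled caller above does nothing but complete on a cron schedule; the per-flavor workflows (such as the mi210 one that follows) listen for its completion via `workflow_run`. The schedule `"17 2 * * *"` means minute 17, hour 2, every day, i.e. 02:17 UTC daily. A tiny Python sketch of reading those fields:

```python
# Sketch only: unpack the five cron fields from the schedule used above.
minute, hour, day_of_month, month, day_of_week = "17 2 * * *".split()
print(f"runs daily at {hour.zfill(2)}:{minute} UTC")  # runs daily at 02:17 UTC
```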
docs/transformers/.github/workflows/self-scheduled-amd-mi210-caller.yml ADDED
@@ -0,0 +1,55 @@
+ name: Self-hosted runner (AMD mi210 scheduled CI caller)
+
+ on:
+ workflow_run:
+ workflows: ["Self-hosted runner (AMD scheduled CI caller)"]
+ branches: ["main"]
+ types: [completed]
+ push:
+ branches:
+ - run_amd_scheduled_ci_caller*
+
+ jobs:
+ model-ci:
+ name: Model CI
+ uses: huggingface/hf-workflows/.github/workflows/transformers_amd_ci_scheduled.yaml@main
+ with:
+ job: run_models_gpu
+ slack_report_channel: "#transformers-ci-daily-amd"
+ runner: mi210
+ docker: huggingface/transformers-pytorch-amd-gpu
+ ci_event: Scheduled CI (AMD) - mi210
+ secrets: inherit
+
+ torch-pipeline:
+ name: Torch pipeline CI
+ uses: huggingface/hf-workflows/.github/workflows/transformers_amd_ci_scheduled.yaml@main
+ with:
+ job: run_pipelines_torch_gpu
+ slack_report_channel: "#transformers-ci-daily-amd"
+ runner: mi210
+ docker: huggingface/transformers-pytorch-amd-gpu
+ ci_event: Scheduled CI (AMD) - mi210
+ secrets: inherit
+
+ example-ci:
+ name: Example CI
+ uses: huggingface/hf-workflows/.github/workflows/transformers_amd_ci_scheduled.yaml@main
+ with:
+ job: run_examples_gpu
+ slack_report_channel: "#transformers-ci-daily-amd"
+ runner: mi210
+ docker: huggingface/transformers-pytorch-amd-gpu
+ ci_event: Scheduled CI (AMD) - mi210
+ secrets: inherit
+
+ deepspeed-ci:
+ name: DeepSpeed CI
+ uses: huggingface/hf-workflows/.github/workflows/transformers_amd_ci_scheduled.yaml@main
+ with:
+ job: run_torch_cuda_extensions_gpu
+ slack_report_channel: "#transformers-ci-daily-amd"
+ runner: mi210
+ docker: huggingface/transformers-pytorch-deepspeed-amd-gpu
+ ci_event: Scheduled CI (AMD) - mi210
+ secrets: inherit
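Each job above calls the shared scheduled workflow with a different `job` input; in `self-scheduled.yml` below, the `setup` job then splits the model test folders into `NUM_SLICES` chunks (via `utils/split_model_tests.py`) and emits matching `slice_ids`, so model tests fan out across runners. A minimal Python sketch of that slicing idea, assuming a plain list of model test folders (the real script does its own discovery and splitting):

```python
# Sketch of the folder_slices / slice_ids outputs produced by the `setup` job
# of self-scheduled.yml below. `folders` is a stand-in example list.
NUM_SLICES = 2

def split_into_slices(folders: list[str], num_slices: int) -> list[list[str]]:
    return [folders[i::num_slices] for i in range(num_slices)]

folders = ["models/bert", "models/gpt2", "models/llama", "models/t5"]  # example only
folder_slices = split_into_slices(folders, NUM_SLICES)
slice_ids = list(range(NUM_SLICES))
print(folder_slices)  # [['models/bert', 'models/llama'], ['models/gpt2', 'models/t5']]
print(slice_ids)      # [0, 1]
```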
docs/transformers/.github/workflows/self-scheduled.yml ADDED
@@ -0,0 +1,598 @@
1
+ name: Self-hosted runner (scheduled)
2
+
3
+ # Note that each job's dependencies go into a corresponding docker file.
4
+ #
5
+ # For example for `run_torch_cuda_extensions_gpu` the docker image is
6
+ # `huggingface/transformers-pytorch-deepspeed-latest-gpu`, which can be found at
7
+ # `docker/transformers-pytorch-deepspeed-latest-gpu/Dockerfile`
8
+
9
+ on:
10
+ workflow_call:
11
+ inputs:
12
+ job:
13
+ required: true
14
+ type: string
15
+ slack_report_channel:
16
+ required: true
17
+ type: string
18
+ runner:
19
+ required: true
20
+ type: string
21
+ docker:
22
+ required: true
23
+ type: string
24
+ ci_event:
25
+ required: true
26
+ type: string
27
+ working-directory-prefix:
28
+ default: ''
29
+ required: false
30
+ type: string
31
+
32
+ env:
33
+ HF_HOME: /mnt/cache
34
+ TRANSFORMERS_IS_CI: yes
35
+ OMP_NUM_THREADS: 8
36
+ MKL_NUM_THREADS: 8
37
+ RUN_SLOW: yes
38
+ # For gated repositories, we still need to agree to share information on the Hub repo. page in order to get access.
39
+ # This token is created under the bot `hf-transformers-bot`.
40
+ HF_HUB_READ_TOKEN: ${{ secrets.HF_HUB_READ_TOKEN }}
41
+ SIGOPT_API_TOKEN: ${{ secrets.SIGOPT_API_TOKEN }}
42
+ TF_FORCE_GPU_ALLOW_GROWTH: true
43
+ CUDA_VISIBLE_DEVICES: 0,1
44
+ NUM_SLICES: 2
45
+
46
+ jobs:
47
+ setup:
48
+ if: contains(fromJSON('["run_models_gpu", "run_trainer_and_fsdp_gpu", "run_quantization_torch_gpu"]'), inputs.job)
49
+ name: Setup
50
+ strategy:
51
+ matrix:
52
+ machine_type: [aws-g4dn-2xlarge-cache, aws-g4dn-12xlarge-cache]
53
+ runs-on:
54
+ group: '${{ matrix.machine_type }}'
55
+ container:
56
+ image: huggingface/transformers-all-latest-gpu
57
+ options: --gpus 0 --shm-size "16gb" --ipc host -v /mnt/cache/.cache/huggingface:/mnt/cache/
58
+ outputs:
59
+ folder_slices: ${{ steps.set-matrix.outputs.folder_slices }}
60
+ slice_ids: ${{ steps.set-matrix.outputs.slice_ids }}
61
+ quantization_matrix: ${{ steps.set-matrix-quantization.outputs.quantization_matrix }}
62
+ steps:
63
+ - name: Update clone
64
+ working-directory: /transformers
65
+ run: |
66
+ git fetch && git checkout ${{ github.sha }}
67
+
68
+ - name: Cleanup
69
+ working-directory: /transformers
70
+ run: |
71
+ rm -rf tests/__pycache__
72
+ rm -rf tests/models/__pycache__
73
+ rm -rf reports
74
+
75
+ - name: Show installed libraries and their versions
76
+ working-directory: /transformers
77
+ run: pip freeze
78
+
79
+ - id: set-matrix
80
+ if: contains(fromJSON('["run_models_gpu", "run_trainer_and_fsdp_gpu"]'), inputs.job)
81
+ name: Identify models to test
82
+ working-directory: /transformers/tests
83
+ run: |
84
+ if [ "${{ inputs.job }}" = "run_models_gpu" ]; then
85
+ echo "folder_slices=$(python3 ../utils/split_model_tests.py --num_splits ${{ env.NUM_SLICES }})" >> $GITHUB_OUTPUT
86
+ echo "slice_ids=$(python3 -c 'd = list(range(${{ env.NUM_SLICES }})); print(d)')" >> $GITHUB_OUTPUT
87
+ elif [ "${{ inputs.job }}" = "run_trainer_and_fsdp_gpu" ]; then
88
+ echo "folder_slices=[['trainer'], ['fsdp']]" >> $GITHUB_OUTPUT
89
+ echo "slice_ids=[0, 1]" >> $GITHUB_OUTPUT
90
+ fi
91
+
92
+ - id: set-matrix-quantization
93
+ if: ${{ inputs.job == 'run_quantization_torch_gpu' }}
94
+ name: Identify quantization method to test
95
+ working-directory: /transformers/tests
96
+ run: |
97
+ echo "quantization_matrix=$(python3 -c 'import os; tests = os.getcwd(); quantization_tests = os.listdir(os.path.join(tests, "quantization")); d = sorted(list(filter(os.path.isdir, [f"quantization/{x}" for x in quantization_tests]))) ; print(d)')" >> $GITHUB_OUTPUT
98
+
99
+ - name: NVIDIA-SMI
100
+ run: |
101
+ nvidia-smi
102
+
103
+ run_models_gpu:
104
+ if: ${{ inputs.job == 'run_models_gpu' }}
105
+ name: " "
106
+ needs: setup
107
+ strategy:
108
+ fail-fast: false
109
+ matrix:
110
+ machine_type: [aws-g4dn-2xlarge-cache, aws-g4dn-12xlarge-cache]
111
+ slice_id: ${{ fromJSON(needs.setup.outputs.slice_ids) }}
112
+ uses: ./.github/workflows/model_jobs.yml
113
+ with:
114
+ folder_slices: ${{ needs.setup.outputs.folder_slices }}
115
+ machine_type: ${{ matrix.machine_type }}
116
+ slice_id: ${{ matrix.slice_id }}
117
+ runner: ${{ inputs.runner }}
118
+ docker: ${{ inputs.docker }}
119
+ secrets: inherit
120
+
121
+ run_trainer_and_fsdp_gpu:
122
+ if: ${{ inputs.job == 'run_trainer_and_fsdp_gpu' }}
123
+ name: " "
124
+ needs: setup
125
+ strategy:
126
+ fail-fast: false
127
+ matrix:
128
+ machine_type: [aws-g4dn-2xlarge-cache, aws-g4dn-12xlarge-cache]
129
+ slice_id: [0, 1]
130
+ uses: ./.github/workflows/model_jobs.yml
131
+ with:
132
+ folder_slices: ${{ needs.setup.outputs.folder_slices }}
133
+ machine_type: ${{ matrix.machine_type }}
134
+ slice_id: ${{ matrix.slice_id }}
135
+ runner: ${{ inputs.runner }}
136
+ docker: ${{ inputs.docker }}
137
+ report_name_prefix: run_trainer_and_fsdp_gpu
138
+ secrets: inherit
139
+
140
+ run_pipelines_torch_gpu:
141
+ if: ${{ inputs.job == 'run_pipelines_torch_gpu' }}
142
+ name: PyTorch pipelines
143
+ strategy:
144
+ fail-fast: false
145
+ matrix:
146
+ machine_type: [aws-g4dn-2xlarge-cache, aws-g4dn-12xlarge-cache]
147
+ runs-on:
148
+ group: '${{ matrix.machine_type }}'
149
+ container:
150
+ image: huggingface/transformers-pytorch-gpu
151
+ options: --gpus all --shm-size "16gb" --ipc host -v /mnt/cache/.cache/huggingface:/mnt/cache/
152
+ steps:
153
+ - name: Update clone
154
+ working-directory: /transformers
155
+ run: git fetch && git checkout ${{ github.sha }}
156
+
157
+ - name: Reinstall transformers in edit mode (remove the one installed during docker image build)
158
+ working-directory: /transformers
159
+ run: python3 -m pip uninstall -y transformers && python3 -m pip install -e .
160
+
161
+ - name: NVIDIA-SMI
162
+ run: |
163
+ nvidia-smi
164
+
165
+ - name: Environment
166
+ working-directory: /transformers
167
+ run: |
168
+ python3 utils/print_env.py
169
+
170
+ - name: Show installed libraries and their versions
171
+ working-directory: /transformers
172
+ run: pip freeze
173
+
174
+ - name: Set `machine_type` for report and artifact names
175
+ working-directory: /transformers
176
+ shell: bash
177
+ run: |
178
+ echo "${{ matrix.machine_type }}"
179
+
180
+ if [ "${{ matrix.machine_type }}" = "aws-g4dn-2xlarge-cache" ]; then
181
+ machine_type=single-gpu
182
+ elif [ "${{ matrix.machine_type }}" = "aws-g4dn-12xlarge-cache" ]; then
183
+ machine_type=multi-gpu
184
+ else
185
+ machine_type=${{ matrix.machine_type }}
186
+ fi
187
+
188
+ echo "$machine_type"
189
+ echo "machine_type=$machine_type" >> $GITHUB_ENV
190
+
191
+ - name: Run all pipeline tests on GPU
192
+ working-directory: /transformers
193
+ run: |
194
+ python3 -m pytest -n 1 -v --dist=loadfile --make-reports=${{ env.machine_type }}_run_pipelines_torch_gpu_test_reports tests/pipelines
195
+
196
+ - name: Failure short reports
197
+ if: ${{ failure() }}
198
+ continue-on-error: true
199
+ run: cat /transformers/reports/${{ env.machine_type }}_run_pipelines_torch_gpu_test_reports/failures_short.txt
200
+
201
+ - name: "Test suite reports artifacts: ${{ env.machine_type }}_run_pipelines_torch_gpu_test_reports"
202
+ if: ${{ always() }}
203
+ uses: actions/upload-artifact@v4
204
+ with:
205
+ name: ${{ env.machine_type }}_run_pipelines_torch_gpu_test_reports
206
+ path: /transformers/reports/${{ env.machine_type }}_run_pipelines_torch_gpu_test_reports
207
+
208
+ run_pipelines_tf_gpu:
209
+ if: ${{ inputs.job == 'run_pipelines_tf_gpu' }}
210
+ name: TensorFlow pipelines
211
+ strategy:
212
+ fail-fast: false
213
+ matrix:
214
+ machine_type: [aws-g4dn-2xlarge-cache, aws-g4dn-12xlarge-cache]
215
+ runs-on:
216
+ group: '${{ matrix.machine_type }}'
217
+ container:
218
+ image: huggingface/transformers-tensorflow-gpu
219
+ options: --gpus all --shm-size "16gb" --ipc host -v /mnt/cache/.cache/huggingface:/mnt/cache/
220
+ steps:
221
+ - name: Update clone
222
+ working-directory: /transformers
223
+ run: |
224
+ git fetch && git checkout ${{ github.sha }}
225
+
226
+ - name: Reinstall transformers in edit mode (remove the one installed during docker image build)
227
+ working-directory: /transformers
228
+ run: python3 -m pip uninstall -y transformers && python3 -m pip install -e .
229
+
230
+ - name: NVIDIA-SMI
231
+ run: |
232
+ nvidia-smi
233
+
234
+ - name: Environment
235
+ working-directory: /transformers
236
+ run: |
237
+ python3 utils/print_env.py
238
+
239
+ - name: Show installed libraries and their versions
240
+ working-directory: /transformers
241
+ run: pip freeze
242
+
243
+ - name: Set `machine_type` for report and artifact names
244
+ working-directory: /transformers
245
+ shell: bash
246
+ run: |
247
+ echo "${{ matrix.machine_type }}"
248
+
249
+ if [ "${{ matrix.machine_type }}" = "aws-g4dn-2xlarge-cache" ]; then
250
+ machine_type=single-gpu
251
+ elif [ "${{ matrix.machine_type }}" = "aws-g4dn-12xlarge-cache" ]; then
252
+ machine_type=multi-gpu
253
+ else
254
+ machine_type=${{ matrix.machine_type }}
255
+ fi
256
+
257
+ echo "$machine_type"
258
+ echo "machine_type=$machine_type" >> $GITHUB_ENV
259
+
260
+ - name: Run all pipeline tests on GPU
261
+ working-directory: /transformers
262
+ run: |
263
+ python3 -m pytest -n 1 -v --dist=loadfile --make-reports=${{ env.machine_type }}_run_pipelines_tf_gpu_test_reports tests/pipelines
264
+
265
+ - name: Failure short reports
266
+ if: ${{ always() }}
267
+ run: |
268
+ cat /transformers/reports/${{ env.machine_type }}_run_pipelines_tf_gpu_test_reports/failures_short.txt
269
+
270
+ - name: "Test suite reports artifacts: ${{ env.machine_type }}_run_pipelines_tf_gpu_test_reports"
271
+ if: ${{ always() }}
272
+ uses: actions/upload-artifact@v4
273
+ with:
274
+ name: ${{ env.machine_type }}_run_pipelines_tf_gpu_test_reports
275
+ path: /transformers/reports/${{ env.machine_type }}_run_pipelines_tf_gpu_test_reports
276
+
277
+ run_examples_gpu:
278
+ if: ${{ inputs.job == 'run_examples_gpu' }}
279
+ name: Examples directory
280
+ strategy:
281
+ fail-fast: false
282
+ matrix:
283
+ machine_type: [aws-g4dn-2xlarge-cache]
284
+ runs-on:
285
+ group: '${{ matrix.machine_type }}'
286
+ container:
287
+ image: huggingface/transformers-all-latest-gpu
288
+ options: --gpus 0 --shm-size "16gb" --ipc host -v /mnt/cache/.cache/huggingface:/mnt/cache/
289
+ steps:
290
+ - name: Update clone
291
+ working-directory: /transformers
292
+ run: git fetch && git checkout ${{ github.sha }}
293
+
294
+ - name: Reinstall transformers in edit mode (remove the one installed during docker image build)
295
+ working-directory: /transformers
296
+ run: python3 -m pip uninstall -y transformers && python3 -m pip install -e .
297
+
298
+ - name: NVIDIA-SMI
299
+ run: |
300
+ nvidia-smi
301
+
302
+ - name: Environment
303
+ working-directory: /transformers
304
+ run: |
305
+ python3 utils/print_env.py
306
+
307
+ - name: Show installed libraries and their versions
308
+ working-directory: /transformers
309
+ run: pip freeze
310
+
311
+ - name: Set `machine_type` for report and artifact names
312
+ working-directory: /transformers
313
+ shell: bash
314
+ run: |
315
+ echo "${{ matrix.machine_type }}"
316
+
317
+ if [ "${{ matrix.machine_type }}" = "aws-g4dn-2xlarge-cache" ]; then
318
+ machine_type=single-gpu
319
+ elif [ "${{ matrix.machine_type }}" = "aws-g4dn-12xlarge-cache" ]; then
320
+ machine_type=multi-gpu
321
+ else
322
+ machine_type=${{ matrix.machine_type }}
323
+ fi
324
+
325
+ echo "$machine_type"
326
+ echo "machine_type=$machine_type" >> $GITHUB_ENV
327
+
328
+ - name: Run examples tests on GPU
329
+ working-directory: /transformers
330
+ run: |
331
+ pip install -r examples/pytorch/_tests_requirements.txt
332
+ python3 -m pytest -v --make-reports=${{ env.machine_type }}_run_examples_gpu_test_reports examples/pytorch
333
+
334
+ - name: Failure short reports
335
+ if: ${{ failure() }}
336
+ continue-on-error: true
337
+ run: cat /transformers/reports/${{ env.machine_type }}_run_examples_gpu_test_reports/failures_short.txt
338
+
339
+ - name: "Test suite reports artifacts: ${{ env.machine_type }}_run_examples_gpu_test_reports"
340
+ if: ${{ always() }}
341
+ uses: actions/upload-artifact@v4
342
+ with:
343
+ name: ${{ env.machine_type }}_run_examples_gpu_test_reports
344
+ path: /transformers/reports/${{ env.machine_type }}_run_examples_gpu_test_reports
345
+
346
+ run_torch_cuda_extensions_gpu:
347
+ if: ${{ inputs.job == 'run_torch_cuda_extensions_gpu' }}
348
+ name: Torch CUDA extension tests
349
+ strategy:
350
+ fail-fast: false
351
+ matrix:
352
+ machine_type: [aws-g4dn-2xlarge-cache, aws-g4dn-12xlarge-cache]
353
+ runs-on:
354
+ group: '${{ matrix.machine_type }}'
355
+ container:
356
+ image: ${{ inputs.docker }}
357
+ options: --gpus all --shm-size "16gb" --ipc host -v /mnt/cache/.cache/huggingface:/mnt/cache/
358
+ steps:
359
+ - name: Update clone
360
+ working-directory: ${{ inputs.working-directory-prefix }}/transformers
361
+ run: git fetch && git checkout ${{ github.sha }}
362
+
363
+ - name: Reinstall transformers in edit mode (remove the one installed during docker image build)
364
+ working-directory: ${{ inputs.working-directory-prefix }}/transformers
365
+ run: python3 -m pip uninstall -y transformers && python3 -m pip install -e .
366
+
367
+ - name: Update / Install some packages (for Past CI)
368
+ if: ${{ contains(inputs.docker, '-past-') && contains(inputs.docker, '-pytorch-') }}
369
+ working-directory: ${{ inputs.working-directory-prefix }}/transformers
370
+ run: |
371
+ python3 -m pip install -U datasets
372
+ python3 -m pip install --no-cache-dir git+https://github.com/huggingface/accelerate@main#egg=accelerate
373
+
374
+ - name: Remove cached torch extensions
375
+ run: rm -rf /github/home/.cache/torch_extensions/
376
+
377
+ # To avoid unknown test failures
378
+ - name: Pre build DeepSpeed *again* (for daily CI)
379
+ if: ${{ contains(inputs.ci_event, 'Daily CI') }}
380
+ working-directory: ${{ inputs.working-directory-prefix }}/
381
+ run: |
382
+ python3 -m pip uninstall -y deepspeed
383
+ DS_DISABLE_NINJA=1 DS_BUILD_CPU_ADAM=1 DS_BUILD_FUSED_ADAM=1 python3 -m pip install deepspeed --global-option="build_ext" --global-option="-j8" --no-cache -v --disable-pip-version-check
384
+
385
+ # To avoid unknown test failures
386
+ - name: Pre build DeepSpeed *again* (for nightly & Past CI)
387
+ if: ${{ contains(inputs.ci_event, 'Nightly CI') || contains(inputs.ci_event, 'Past CI') }}
388
+ working-directory: ${{ inputs.working-directory-prefix }}/
389
+ run: |
390
+ python3 -m pip uninstall -y deepspeed
391
+ rm -rf DeepSpeed
392
+ git clone https://github.com/deepspeedai/DeepSpeed && cd DeepSpeed && rm -rf build
393
+ DS_BUILD_CPU_ADAM=1 DS_BUILD_FUSED_ADAM=1 python3 -m pip install . --global-option="build_ext" --global-option="-j8" --no-cache -v --disable-pip-version-check
394
+
395
+ - name: NVIDIA-SMI
396
+ run: |
397
+ nvidia-smi
398
+
399
+ - name: Environment
400
+ working-directory: ${{ inputs.working-directory-prefix }}/transformers
401
+ run: |
402
+ python3 utils/print_env.py
403
+
404
+ - name: Show installed libraries and their versions
405
+ working-directory: ${{ inputs.working-directory-prefix }}/transformers
406
+ run: pip freeze
407
+
408
+ - name: Set `machine_type` for report and artifact names
409
+ working-directory: ${{ inputs.working-directory-prefix }}/transformers
410
+ shell: bash
411
+ run: |
412
+ echo "${{ matrix.machine_type }}"
413
+
414
+ if [ "${{ matrix.machine_type }}" = "aws-g4dn-2xlarge-cache" ]; then
415
+ machine_type=single-gpu
416
+ elif [ "${{ matrix.machine_type }}" = "aws-g4dn-12xlarge-cache" ]; then
417
+ machine_type=multi-gpu
418
+ else
419
+ machine_type=${{ matrix.machine_type }}
420
+ fi
421
+
422
+ echo "$machine_type"
423
+ echo "machine_type=$machine_type" >> $GITHUB_ENV
424
+
425
+ - name: Run all tests on GPU
426
+ working-directory: ${{ inputs.working-directory-prefix }}/transformers
427
+ run: |
428
+ python3 -m pytest -v --make-reports=${{ env.machine_type }}_run_torch_cuda_extensions_gpu_test_reports tests/deepspeed tests/extended
429
+
430
+ - name: Failure short reports
431
+ if: ${{ failure() }}
432
+ continue-on-error: true
433
+ run: cat ${{ inputs.working-directory-prefix }}/transformers/reports/${{ env.machine_type }}_run_torch_cuda_extensions_gpu_test_reports/failures_short.txt
434
+
435
+ - name: "Test suite reports artifacts: ${{ env.machine_type }}_run_torch_cuda_extensions_gpu_test_reports"
436
+ if: ${{ always() }}
437
+ uses: actions/upload-artifact@v4
438
+ with:
439
+ name: ${{ env.machine_type }}_run_torch_cuda_extensions_gpu_test_reports
440
+ path: ${{ inputs.working-directory-prefix }}/transformers/reports/${{ env.machine_type }}_run_torch_cuda_extensions_gpu_test_reports
441
+
442
+ run_quantization_torch_gpu:
443
+ if: ${{ inputs.job == 'run_quantization_torch_gpu' }}
444
+ name: " "
445
+ needs: setup
446
+ strategy:
447
+ max-parallel: 4
448
+ fail-fast: false
449
+ matrix:
450
+ folders: ${{ fromJson(needs.setup.outputs.quantization_matrix) }}
451
+ machine_type: [aws-g4dn-2xlarge-cache, aws-g4dn-12xlarge-cache]
452
+ runs-on:
453
+ group: '${{ matrix.machine_type }}'
454
+ container:
455
+ image: huggingface/transformers-quantization-latest-gpu
456
+ options: --gpus all --shm-size "16gb" --ipc host -v /mnt/cache/.cache/huggingface:/mnt/cache/
457
+ steps:
458
+ - name: Echo folder ${{ matrix.folders }}
459
+ shell: bash
460
+ run: |
461
+ echo "${{ matrix.folders }}"
462
+ matrix_folders=${{ matrix.folders }}
463
+ matrix_folders=${matrix_folders/'quantization/'/'quantization_'}
464
+ echo "$matrix_folders"
465
+ echo "matrix_folders=$matrix_folders" >> $GITHUB_ENV
466
+
467
+ - name: Update clone
468
+ working-directory: /transformers
469
+ run: git fetch && git checkout ${{ github.sha }}
470
+
471
+ - name: Reinstall transformers in edit mode (remove the one installed during docker image build)
472
+ working-directory: /transformers
473
+ run: python3 -m pip uninstall -y transformers && python3 -m pip install -e .
474
+
475
+ - name: NVIDIA-SMI
476
+ run: |
477
+ nvidia-smi
478
+
479
+ - name: Environment
480
+ working-directory: /transformers
481
+ run: |
482
+ python3 utils/print_env.py
483
+
484
+ - name: Show installed libraries and their versions
485
+ working-directory: /transformers
486
+ run: pip freeze
487
+
488
+ - name: Set `machine_type` for report and artifact names
489
+ working-directory: /transformers
490
+ shell: bash
491
+ run: |
492
+ echo "${{ matrix.machine_type }}"
493
+
494
+ if [ "${{ matrix.machine_type }}" = "aws-g4dn-2xlarge-cache" ]; then
495
+ machine_type=single-gpu
496
+ elif [ "${{ matrix.machine_type }}" = "aws-g4dn-12xlarge-cache" ]; then
497
+ machine_type=multi-gpu
498
+ else
499
+ machine_type=${{ matrix.machine_type }}
500
+ fi
501
+
502
+ echo "$machine_type"
503
+ echo "machine_type=$machine_type" >> $GITHUB_ENV
504
+
505
+ - name: Run quantization tests on GPU
506
+ working-directory: /transformers
507
+ run: |
508
+ python3 -m pytest -v --make-reports=${{ env.machine_type }}_run_quantization_torch_gpu_${{ matrix.folders }}_test_reports tests/${{ matrix.folders }}
509
+
510
+ - name: Failure short reports
511
+ if: ${{ failure() }}
512
+ continue-on-error: true
513
+ run: cat /transformers/reports/${{ env.machine_type }}_run_quantization_torch_gpu_${{ matrix.folders }}_test_reports/failures_short.txt
514
+
515
+ - name: "Test suite reports artifacts: ${{ env.machine_type }}_run_quantization_torch_gpu_${{ env.matrix_folders }}_test_reports"
516
+ if: ${{ always() }}
517
+ uses: actions/upload-artifact@v4
518
+ with:
519
+ name: ${{ env.machine_type }}_run_quantization_torch_gpu_${{ env.matrix_folders }}_test_reports
520
+ path: /transformers/reports/${{ env.machine_type }}_run_quantization_torch_gpu_${{ matrix.folders }}_test_reports
521
+
522
+ run_extract_warnings:
523
+ # Let's only do this for the job `run_models_gpu` to simplify the (already complex) logic.
524
+ if: ${{ always() && inputs.job == 'run_models_gpu' }}
525
+ name: Extract warnings in CI artifacts
526
+ runs-on: ubuntu-22.04
527
+ needs: [setup, run_models_gpu]
528
+ steps:
529
+ - name: Checkout transformers
530
+ uses: actions/checkout@v4
531
+ with:
532
+ fetch-depth: 2
533
+
534
+ - name: Install transformers
535
+ run: pip install transformers
536
+
537
+ - name: Show installed libraries and their versions
538
+ run: pip freeze
539
+
540
+ - name: Create output directory
541
+ run: mkdir warnings_in_ci
542
+
543
+ - uses: actions/download-artifact@v4
544
+ with:
545
+ path: warnings_in_ci
546
+
547
+ - name: Show artifacts
548
+ run: echo "$(python3 -c 'import os; d = os.listdir(); print(d)')"
549
+ working-directory: warnings_in_ci
550
+
551
+ - name: Extract warnings in CI artifacts
552
+ run: |
553
+ python3 utils/extract_warnings.py --workflow_run_id ${{ github.run_id }} --output_dir warnings_in_ci --token ${{ secrets.ACCESS_REPO_INFO_TOKEN }} --from_gh
554
+ echo "$(python3 -c 'import os; import json; fp = open("warnings_in_ci/selected_warnings.json"); d = json.load(fp); d = "\n".join(d) ;print(d)')"
555
+
556
+ - name: Upload artifact
557
+ if: ${{ always() }}
558
+ uses: actions/upload-artifact@v4
559
+ with:
560
+ name: warnings_in_ci
561
+ path: warnings_in_ci/selected_warnings.json
562
+
563
+ send_results:
564
+ name: Slack Report
565
+ needs: [
566
+ setup,
567
+ run_models_gpu,
568
+ run_trainer_and_fsdp_gpu,
569
+ run_pipelines_torch_gpu,
570
+ run_pipelines_tf_gpu,
571
+ run_examples_gpu,
572
+ run_torch_cuda_extensions_gpu,
573
+ run_quantization_torch_gpu,
574
+ run_extract_warnings
575
+ ]
576
+ if: ${{ always() }}
577
+ uses: ./.github/workflows/slack-report.yml
578
+ with:
579
+ job: ${{ inputs.job }}
580
+ # This would be `skipped` if `setup` is skipped.
581
+ setup_status: ${{ needs.setup.result }}
582
+ slack_report_channel: ${{ inputs.slack_report_channel }}
583
+ # This would be an empty string if `setup` is skipped.
584
+ folder_slices: ${{ needs.setup.outputs.folder_slices }}
585
+ quantization_matrix: ${{ needs.setup.outputs.quantization_matrix }}
586
+ ci_event: ${{ inputs.ci_event }}
587
+
588
+ secrets: inherit
589
+
590
+ check_new_model_failures:
591
+ if: ${{ always() && inputs.ci_event == 'Daily CI' && inputs.job == 'run_models_gpu' && needs.send_results.result == 'success' }}
592
+ name: Check new model failures
593
+ needs: send_results
594
+ uses: ./.github/workflows/check_failed_model_tests.yml
595
+ with:
596
+ docker: ${{ inputs.docker }}
597
+ start_sha: ${{ github.sha }}
598
+ secrets: inherit
docs/transformers/.github/workflows/ssh-runner.yml ADDED
@@ -0,0 +1,113 @@
1
+ name: SSH into our runners
2
+
3
+ on:
4
+ workflow_dispatch:
5
+ inputs:
6
+ runner_type:
7
+ description: 'Type of runner to test (a10 or t4)'
8
+ required: true
9
+ docker_image:
10
+ description: 'Name of the Docker image'
11
+ required: true
12
+ num_gpus:
13
+ description: 'Number of GPUs to use (`single` or `multi`)'
14
+ required: true
15
+
16
+ env:
17
+ HF_HUB_READ_TOKEN: ${{ secrets.HF_HUB_READ_TOKEN }}
18
+ HF_HOME: /mnt/cache
19
+ TRANSFORMERS_IS_CI: yes
20
+ OMP_NUM_THREADS: 8
21
+ MKL_NUM_THREADS: 8
22
+ RUN_SLOW: yes # For gated repositories, we still need to agree to share information on the Hub repo. page in order to get access. # This token is created under the bot `hf-transformers-bot`.
23
+ SIGOPT_API_TOKEN: ${{ secrets.SIGOPT_API_TOKEN }}
24
+ TF_FORCE_GPU_ALLOW_GROWTH: true
25
+ CUDA_VISIBLE_DEVICES: 0,1
26
+
27
+ jobs:
28
+ get_runner:
29
+ name: "Get runner to use"
30
+ runs-on: ubuntu-22.04
31
+ outputs:
32
+ RUNNER: ${{ steps.set_runner.outputs.RUNNER }}
33
+ steps:
34
+ - name: Get runner to use
35
+ shell: bash
36
+ run: |
37
+ if [[ "${{ github.event.inputs.num_gpus }}" == "single" && "${{ github.event.inputs.runner_type }}" == "t4" ]]; then
38
+ echo "RUNNER=aws-g4dn-2xlarge-cache" >> $GITHUB_ENV
39
+ elif [[ "${{ github.event.inputs.num_gpus }}" == "multi" && "${{ github.event.inputs.runner_type }}" == "t4" ]]; then
40
+ echo "RUNNER=aws-g4dn-12xlarge-cache" >> $GITHUB_ENV
41
+ elif [[ "${{ github.event.inputs.num_gpus }}" == "single" && "${{ github.event.inputs.runner_type }}" == "a10" ]]; then
42
+ echo "RUNNER=aws-g5-4xlarge-cache" >> $GITHUB_ENV
43
+ elif [[ "${{ github.event.inputs.num_gpus }}" == "multi" && "${{ github.event.inputs.runner_type }}" == "a10" ]]; then
44
+ echo "RUNNER=aws-g5-12xlarge-cache" >> $GITHUB_ENV
45
+ else
46
+ echo "RUNNER=" >> $GITHUB_ENV
47
+ fi
48
+
49
+ - name: Set runner to use
50
+ id: set_runner
51
+ run: |
52
+ echo ${{ env.RUNNER }}
53
+ echo "RUNNER=${{ env.RUNNER }}" >> $GITHUB_OUTPUT
54
+
55
+ ssh_runner:
56
+ name: "SSH"
57
+ needs: get_runner
58
+ runs-on:
59
+ group: ${{ needs.get_runner.outputs.RUNNER }}
60
+ container:
61
+ image: ${{ github.event.inputs.docker_image }}
62
+ options: --gpus all --privileged --ipc host -v /mnt/cache/.cache/huggingface:/mnt/cache/
63
+
64
+ steps:
65
+ - name: Update clone
66
+ working-directory: /transformers
67
+ run: |
68
+ git fetch && git checkout ${{ github.sha }}
69
+
70
+ - name: Cleanup
71
+ working-directory: /transformers
72
+ run: |
73
+ rm -rf tests/__pycache__
74
+ rm -rf tests/models/__pycache__
75
+ rm -rf reports
76
+
77
+ - name: Show installed libraries and their versions
78
+ working-directory: /transformers
79
+ run: pip freeze
80
+
81
+ - name: NVIDIA-SMI
82
+ run: |
83
+ nvidia-smi
84
+
85
+ - name: Store Slack infos
86
+ #because the SSH can be enabled dynamically if the workflow failed, so we need to store slack infos to be able to retrieve them during the waitforssh step
87
+ shell: bash
88
+ run: |
89
+ echo "${{ github.actor }}"
90
+ github_actor=${{ github.actor }}
91
+ github_actor=${github_actor/'-'/'_'}
92
+ echo "$github_actor"
93
+ echo "github_actor=$github_actor" >> $GITHUB_ENV
94
+
95
+ - name: Store Slack infos
96
+ #because the SSH can be enabled dynamically if the workflow failed, so we need to store slack infos to be able to retrieve them during the waitforssh step
97
+ shell: bash
98
+ run: |
99
+ echo "${{ env.github_actor }}"
100
+ if [ "${{ secrets[format('{0}_{1}', env.github_actor, 'SLACK_ID')] }}" != "" ]; then
101
+ echo "SLACKCHANNEL=${{ secrets[format('{0}_{1}', env.github_actor, 'SLACK_ID')] }}" >> $GITHUB_ENV
102
+ else
103
+ echo "SLACKCHANNEL=${{ secrets.SLACK_CIFEEDBACK_CHANNEL }}" >> $GITHUB_ENV
104
+ fi
105
+
106
+ - name: Tailscale # In order to be able to SSH when a test fails
107
+ uses: huggingface/tailscale-action@main
108
+ with:
109
+ authkey: ${{ secrets.TAILSCALE_SSH_AUTHKEY }}
110
+ slackChannel: ${{ env.SLACKCHANNEL }}
111
+ slackToken: ${{ secrets.SLACK_CIFEEDBACK_BOT_TOKEN }}
112
+ waitForSSH: true
113
+ sshTimeout: 15m
docs/transformers/.github/workflows/stale.yml ADDED
@@ -0,0 +1,29 @@
1
+ name: Stale Bot
2
+
3
+ on:
4
+ schedule:
5
+ - cron: "0 8 * * *"
6
+
7
+ jobs:
8
+ close_stale_issues:
9
+ name: Close Stale Issues
10
+ if: github.repository == 'huggingface/transformers'
11
+ runs-on: ubuntu-22.04
12
+ permissions:
13
+ issues: write
14
+ env:
15
+ GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
16
+ steps:
17
+ - uses: actions/checkout@v4
18
+
19
+ - name: Setup Python
20
+ uses: actions/setup-python@v5
21
+ with:
22
+ python-version: 3.8
23
+
24
+ - name: Install requirements
25
+ run: |
26
+ pip install PyGithub
27
+ - name: Close stale issues
28
+ run: |
29
+ python scripts/stale.py
docs/transformers/.github/workflows/trufflehog.yml ADDED
@@ -0,0 +1,20 @@
1
+ on:
2
+ push:
3
+
4
+ name: Secret Leaks
5
+
6
+ permissions:
7
+ contents: read
8
+
9
+ jobs:
10
+ trufflehog:
11
+ runs-on: ubuntu-latest
12
+ steps:
13
+ - name: Checkout code
14
+ uses: actions/checkout@v4
15
+ with:
16
+ fetch-depth: 0
17
+ - name: Secret Scanning
18
+ uses: trufflesecurity/trufflehog@main
19
+ with:
20
+ extra_args: --results=verified,unknown
docs/transformers/.github/workflows/update_metdata.yml ADDED
@@ -0,0 +1,27 @@
1
+ name: Update Transformers metadata
2
+
3
+ on:
4
+ push:
5
+ branches:
6
+ - main
7
+ - update_transformers_metadata*
8
+
9
+ jobs:
10
+ build_and_package:
11
+ runs-on: ubuntu-22.04
12
+ defaults:
13
+ run:
14
+ shell: bash -l {0}
15
+
16
+ steps:
17
+ - uses: actions/checkout@v4
18
+
19
+ - name: Setup environment
20
+ run: |
21
+ pip install --upgrade pip
22
+ pip install datasets pandas
23
+ pip install .[torch,tf,flax]
24
+
25
+ - name: Update metadata
26
+ run: |
27
+ python utils/update_metadata.py --token ${{ secrets.LYSANDRE_HF_TOKEN }} --commit_sha ${{ github.sha }}
docs/transformers/.github/workflows/upload_pr_documentation.yml ADDED
@@ -0,0 +1,16 @@
1
+ name: Upload PR Documentation
2
+
3
+ on:
4
+ workflow_run:
5
+ workflows: ["Build PR Documentation"]
6
+ types:
7
+ - completed
8
+
9
+ jobs:
10
+ build:
11
+ uses: huggingface/doc-builder/.github/workflows/upload_pr_documentation.yml@main
12
+ with:
13
+ package_name: transformers
14
+ secrets:
15
+ hf_token: ${{ secrets.HF_DOC_BUILD_PUSH }}
16
+ comment_bot_token: ${{ secrets.COMMENT_BOT_TOKEN }}
docs/transformers/README.md ADDED
@@ -0,0 +1,322 @@
1
+ <!---
2
+ Copyright 2020 The HuggingFace Team. All rights reserved.
3
+
4
+ Licensed under the Apache License, Version 2.0 (the "License");
5
+ you may not use this file except in compliance with the License.
6
+ You may obtain a copy of the License at
7
+
8
+ http://www.apache.org/licenses/LICENSE-2.0
9
+
10
+ Unless required by applicable law or agreed to in writing, software
11
+ distributed under the License is distributed on an "AS IS" BASIS,
12
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
13
+ See the License for the specific language governing permissions and
14
+ limitations under the License.
15
+ -->
16
+
17
+ <p align="center">
18
+ <picture>
19
+ <source media="(prefers-color-scheme: dark)" srcset="https://huggingface.co/datasets/huggingface/documentation-images/raw/main/transformers-logo-dark.svg">
20
+ <source media="(prefers-color-scheme: light)" srcset="https://huggingface.co/datasets/huggingface/documentation-images/raw/main/transformers-logo-light.svg">
21
+ <img alt="Hugging Face Transformers Library" src="https://huggingface.co/datasets/huggingface/documentation-images/raw/main/transformers-logo-light.svg" width="352" height="59" style="max-width: 100%;">
22
+ </picture>
23
+ <br/>
24
+ <br/>
25
+ </p>
26
+
27
+ <p align="center">
28
+ <a href="https://huggingface.com/models"><img alt="Checkpoints on Hub" src="https://img.shields.io/endpoint?url=https://huggingface.co/api/shields/models&color=brightgreen"></a>
29
+ <a href="https://circleci.com/gh/huggingface/transformers"><img alt="Build" src="https://img.shields.io/circleci/build/github/huggingface/transformers/main"></a>
30
+ <a href="https://github.com/huggingface/transformers/blob/main/LICENSE"><img alt="GitHub" src="https://img.shields.io/github/license/huggingface/transformers.svg?color=blue"></a>
31
+ <a href="https://huggingface.co/docs/transformers/index"><img alt="Documentation" src="https://img.shields.io/website/http/huggingface.co/docs/transformers/index.svg?down_color=red&down_message=offline&up_message=online"></a>
32
+ <a href="https://github.com/huggingface/transformers/releases"><img alt="GitHub release" src="https://img.shields.io/github/release/huggingface/transformers.svg"></a>
33
+ <a href="https://github.com/huggingface/transformers/blob/main/CODE_OF_CONDUCT.md"><img alt="Contributor Covenant" src="https://img.shields.io/badge/Contributor%20Covenant-v2.0%20adopted-ff69b4.svg"></a>
34
+ <a href="https://zenodo.org/badge/latestdoi/155220641"><img src="https://zenodo.org/badge/155220641.svg" alt="DOI"></a>
35
+ </p>
36
+
37
+ <h4 align="center">
38
+ <p>
39
+ <b>English</b> |
40
+ <a href="https://github.com/huggingface/transformers/blob/main/i18n/README_zh-hans.md">简体中文</a> |
41
+ <a href="https://github.com/huggingface/transformers/blob/main/i18n/README_zh-hant.md">繁體中文</a> |
42
+ <a href="https://github.com/huggingface/transformers/blob/main/i18n/README_ko.md">한국어</a> |
43
+ <a href="https://github.com/huggingface/transformers/blob/main/i18n/README_es.md">Español</a> |
44
+ <a href="https://github.com/huggingface/transformers/blob/main/i18n/README_ja.md">日本語</a> |
45
+ <a href="https://github.com/huggingface/transformers/blob/main/i18n/README_hd.md">हिन्दी</a> |
46
+ <a href="https://github.com/huggingface/transformers/blob/main/i18n/README_ru.md">Русский</a> |
47
+ <a href="https://github.com/huggingface/transformers/blob/main/i18n/README_pt-br.md">Рortuguês</a> |
48
+ <a href="https://github.com/huggingface/transformers/blob/main/i18n/README_te.md">తెలుగు</a> |
49
+ <a href="https://github.com/huggingface/transformers/blob/main/i18n/README_fr.md">Français</a> |
50
+ <a href="https://github.com/huggingface/transformers/blob/main/i18n/README_de.md">Deutsch</a> |
51
+ <a href="https://github.com/huggingface/transformers/blob/main/i18n/README_vi.md">Tiếng Việt</a> |
52
+ <a href="https://github.com/huggingface/transformers/blob/main/i18n/README_ar.md">العربية</a> |
53
+ <a href="https://github.com/huggingface/transformers/blob/main/i18n/README_ur.md">اردو</a> |
54
+ </p>
55
+ </h4>
56
+
57
+ <h3 align="center">
58
+ <p>State-of-the-art pretrained models for inference and training</p>
59
+ </h3>
60
+
61
+ <h3 align="center">
62
+ <a href="https://hf.co/course"><img src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/course_banner.png"></a>
63
+ </h3>
64
+
65
+ Transformers is a library of pretrained text, computer vision, audio, video, and multimodal models for inference and training. Use Transformers to fine-tune models on your data, build inference applications, and power generative AI use cases across multiple modalities.
66
+
67
+ There are over 500K Transformers [model checkpoints](https://huggingface.co/models?library=transformers&sort=trending) on the [Hugging Face Hub](https://huggingface.com/models) you can use.
68
+
69
+ Explore the [Hub](https://huggingface.com/) today to find a model and use Transformers to help you get started right away.
70
+
71
+ ## Installation
72
+
73
+ Transformers works with Python 3.9+, [PyTorch](https://pytorch.org/get-started/locally/) 2.1+, [TensorFlow](https://www.tensorflow.org/install/pip) 2.6+, and [Flax](https://flax.readthedocs.io/en/latest/) 0.4.1+.
74
+
75
+ Create and activate a virtual environment with [venv](https://docs.python.org/3/library/venv.html) or [uv](https://docs.astral.sh/uv/), a fast Rust-based Python package and project manager.
76
+
77
+ ```shell
78
+ # venv
79
+ python -m venv .my-env
80
+ source .my-env/bin/activate
81
+
82
+ # uv
83
+ uv venv .my-env
84
+ source .my-env/bin/activate
85
+ ```
86
+
87
+ Install Transformers in your virtual environment.
88
+
89
+ ```shell
90
+ # pip
91
+ pip install transformers
92
+
93
+ # uv
94
+ uv pip install transformers
95
+ ```
96
+
97
+ Install Transformers from source if you want the latest changes in the library or are interested in contributing. However, the *latest* version may not be stable. Feel free to open an [issue](https://github.com/huggingface/transformers/issues) if you encounter an error.
98
+
99
+ ```shell
100
+ git clone https://github.com/huggingface/transformers.git
101
+ cd transformers
102
+ pip install .
103
+ ```
104
+
105
+ ## Quickstart
106
+
107
+ Get started with Transformers right away with the [Pipeline](https://huggingface.co/docs/transformers/pipeline_tutorial) API. The `Pipeline` is a high-level inference class that supports text, audio, vision, and multimodal tasks. It handles preprocessing the input and returns the appropriate output.
108
+
109
+ Instantiate a pipeline and specify the model to use for text generation. The model is downloaded and cached so you can easily reuse it. Finally, pass some text to prompt the model.
110
+
111
+ ```py
112
+ from transformers import pipeline
113
+
114
+ pipeline = pipeline(task="text-generation", model="Qwen/Qwen2.5-1.5B")
115
+ pipeline("the secret to baking a really good cake is ")
116
+ [{'generated_text': 'the secret to baking a really good cake is 1) to use the right ingredients and 2) to follow the recipe exactly. the recipe for the cake is as follows: 1 cup of sugar, 1 cup of flour, 1 cup of milk, 1 cup of butter, 1 cup of eggs, 1 cup of chocolate chips. if you want to make 2 cakes, how much sugar do you need? To make 2 cakes, you will need 2 cups of sugar.'}]
117
+ ```
118
+
119
+ To chat with a model, the usage pattern is the same. The only difference is you need to construct a chat history (the input to `Pipeline`) between you and the system.
120
+
121
+ > [!TIP]
122
+ > You can also chat with a model directly from the command line.
123
+ > ```shell
124
+ > transformers-cli chat --model_name_or_path Qwen/Qwen2.5-0.5B-Instruct
125
+ > ```
126
+
127
+ ```py
128
+ import torch
129
+ from transformers import pipeline
130
+
131
+ chat = [
132
+ {"role": "system", "content": "You are a sassy, wise-cracking robot as imagined by Hollywood circa 1986."},
133
+ {"role": "user", "content": "Hey, can you tell me any fun things to do in New York?"}
134
+ ]
135
+
136
+ pipeline = pipeline(task="text-generation", model="meta-llama/Meta-Llama-3-8B-Instruct", torch_dtype=torch.bfloat16, device_map="auto")
137
+ response = pipeline(chat, max_new_tokens=512)
138
+ print(response[0]["generated_text"][-1]["content"])
139
+ ```
140
+
141
+ Expand the examples below to see how `Pipeline` works for different modalities and tasks.
142
+
143
+ <details>
144
+ <summary>Automatic speech recognition</summary>
145
+
146
+ ```py
147
+ from transformers import pipeline
148
+
149
+ pipeline = pipeline(task="automatic-speech-recognition", model="openai/whisper-large-v3")
150
+ pipeline("https://huggingface.co/datasets/Narsil/asr_dummy/resolve/main/mlk.flac")
151
+ {'text': ' I have a dream that one day this nation will rise up and live out the true meaning of its creed.'}
152
+ ```
153
+
154
+ </details>
155
+
156
+ <details>
157
+ <summary>Image classification</summary>
158
+
159
+ <h3 align="center">
160
+ <a><img src="https://huggingface.co/datasets/Narsil/image_dummy/raw/main/parrots.png"></a>
161
+ </h3>
162
+
163
+ ```py
164
+ from transformers import pipeline
165
+
166
+ pipeline = pipeline(task="image-classification", model="facebook/dinov2-small-imagenet1k-1-layer")
167
+ pipeline("https://huggingface.co/datasets/Narsil/image_dummy/raw/main/parrots.png")
168
+ [{'label': 'macaw', 'score': 0.997848391532898},
169
+ {'label': 'sulphur-crested cockatoo, Kakatoe galerita, Cacatua galerita',
170
+ 'score': 0.0016551691805943847},
171
+ {'label': 'lorikeet', 'score': 0.00018523589824326336},
172
+ {'label': 'African grey, African gray, Psittacus erithacus',
173
+ 'score': 7.85409429227002e-05},
174
+ {'label': 'quail', 'score': 5.502637941390276e-05}]
175
+ ```
176
+
177
+ </details>
178
+
179
+ <details>
180
+ <summary>Visual question answering</summary>
181
+
182
+
183
+ <h3 align="center">
184
+ <a><img src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/transformers/tasks/idefics-few-shot.jpg"></a>
185
+ </h3>
186
+
187
+ ```py
188
+ from transformers import pipeline
189
+
190
+ pipeline = pipeline(task="visual-question-answering", model="Salesforce/blip-vqa-base")
191
+ pipeline(
192
+ image="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/transformers/tasks/idefics-few-shot.jpg",
193
+ question="What is in the image?",
194
+ )
195
+ [{'answer': 'statue of liberty'}]
196
+ ```
197
+
198
+ </details>
199
+
200
+ ## Why should I use Transformers?
201
+
202
+ 1. Easy-to-use state-of-the-art models:
203
+ - High performance on natural language understanding & generation, computer vision, audio, video, and multimodal tasks.
204
+ - Low barrier to entry for researchers, engineers, and developers.
205
+ - Few user-facing abstractions with just three classes to learn.
206
+ - A unified API for using all our pretrained models.
207
+
208
+ 1. Lower compute costs, smaller carbon footprint:
209
+ - Share trained models instead of training from scratch.
210
+ - Reduce compute time and production costs.
211
+ - Dozens of model architectures with 1M+ pretrained checkpoints across all modalities.
212
+
213
+ 1. Choose the right framework for every part of a model's lifetime:
214
+ - Train state-of-the-art models in 3 lines of code.
215
+ - Move a single model between PyTorch/JAX/TF2.0 frameworks at will.
216
+ - Pick the right framework for training, evaluation, and production.
217
+
218
+ 1. Easily customize a model or an example to your needs:
219
+ - We provide examples for each architecture to reproduce the results published by its original authors.
220
+ - Model internals are exposed as consistently as possible.
221
+ - Model files can be used independently of the library for quick experiments.
222
+
223
+ <a target="_blank" href="https://huggingface.co/enterprise">
224
+ <img alt="Hugging Face Enterprise Hub" src="https://github.com/user-attachments/assets/247fb16d-d251-4583-96c4-d3d76dda4925">
225
+ </a><br>
226
+
227
+ ## Why shouldn't I use Transformers?
228
+
229
+ - This library is not a modular toolbox of building blocks for neural nets. The code in the model files is not refactored with additional abstractions on purpose, so that researchers can quickly iterate on each of the models without diving into additional abstractions/files.
230
+ - The training API is optimized to work with PyTorch models provided by Transformers. For generic machine learning loops, you should use another library like [Accelerate](https://huggingface.co/docs/accelerate).
231
+ - The [example scripts](https://github.com/huggingface/transformers/tree/main/examples) are only *examples*. They may not necessarily work out-of-the-box on your specific use case and you'll need to adapt the code for it to work.
232
+
233
+ ## 100 projects using Transformers
234
+
235
+ Transformers is more than a toolkit to use pretrained models; it's a community of projects built around it and the
236
+ Hugging Face Hub. We want Transformers to enable developers, researchers, students, professors, engineers, and anyone
237
+ else to build their dream projects.
238
+
239
+ To celebrate Transformers reaching 100,000 stars, we wanted to put the spotlight on the
240
+ community with the [awesome-transformers](./awesome-transformers.md) page which lists 100
241
+ incredible projects built with Transformers.
242
+
243
+ If you own or use a project that you believe should be part of the list, please open a PR to add it!
244
+
245
+ ## Example models
246
+
247
+ You can test most of our models directly on their [Hub model pages](https://huggingface.co/models).
248
+
249
+ Expand each modality below to see a few example models for various use cases.
250
+
251
+ <details>
252
+ <summary>Audio</summary>
253
+
254
+ - Audio classification with [Whisper](https://huggingface.co/openai/whisper-large-v3-turbo)
255
+ - Automatic speech recognition with [Moonshine](https://huggingface.co/UsefulSensors/moonshine)
256
+ - Keyword spotting with [Wav2Vec2](https://huggingface.co/superb/wav2vec2-base-superb-ks)
257
+ - Speech to speech generation with [Moshi](https://huggingface.co/kyutai/moshiko-pytorch-bf16)
258
+ - Text to audio with [MusicGen](https://huggingface.co/facebook/musicgen-large)
259
+ - Text to speech with [Bark](https://huggingface.co/suno/bark)
260
+
261
+ </details>
262
+
263
+ <details>
264
+ <summary>Computer vision</summary>
265
+
266
+ - Automatic mask generation with [SAM](https://huggingface.co/facebook/sam-vit-base)
267
+ - Depth estimation with [DepthPro](https://huggingface.co/apple/DepthPro-hf)
268
+ - Image classification with [DINO v2](https://huggingface.co/facebook/dinov2-base)
269
+ - Keypoint detection with [SuperGlue](https://huggingface.co/magic-leap-community/superglue_outdoor)
270
+ - Keypoint matching with [SuperGlue](https://huggingface.co/magic-leap-community/superglue)
271
+ - Object detection with [RT-DETRv2](https://huggingface.co/PekingU/rtdetr_v2_r50vd)
272
+ - Pose Estimation with [VitPose](https://huggingface.co/usyd-community/vitpose-base-simple)
273
+ - Universal segmentation with [OneFormer](https://huggingface.co/shi-labs/oneformer_ade20k_swin_large)
274
+ - Video classification with [VideoMAE](https://huggingface.co/MCG-NJU/videomae-large)
275
+
276
+ </details>
277
+
278
+ <details>
279
+ <summary>Multimodal</summary>
280
+
281
+ - Audio or text to text with [Qwen2-Audio](https://huggingface.co/Qwen/Qwen2-Audio-7B)
282
+ - Document question answering with [LayoutLMv3](https://huggingface.co/microsoft/layoutlmv3-base)
283
+ - Image or text to text with [Qwen-VL](https://huggingface.co/Qwen/Qwen2.5-VL-3B-Instruct)
284
+ - Image captioning with [BLIP-2](https://huggingface.co/Salesforce/blip2-opt-2.7b)
285
+ - OCR-based document understanding with [GOT-OCR2](https://huggingface.co/stepfun-ai/GOT-OCR-2.0-hf)
286
+ - Table question answering with [TAPAS](https://huggingface.co/google/tapas-base)
287
+ - Unified multimodal understanding and generation with [Emu3](https://huggingface.co/BAAI/Emu3-Gen)
288
+ - Vision to text with [Llava-OneVision](https://huggingface.co/llava-hf/llava-onevision-qwen2-0.5b-ov-hf)
289
+ - Visual question answering with [Llava](https://huggingface.co/llava-hf/llava-1.5-7b-hf)
290
+ - Visual referring expression segmentation with [Kosmos-2](https://huggingface.co/microsoft/kosmos-2-patch14-224)
291
+
292
+ </details>
293
+
294
+ <details>
295
+ <summary>NLP</summary>
296
+
297
+ - Masked word completion with [ModernBERT](https://huggingface.co/answerdotai/ModernBERT-base)
298
+ - Named entity recognition with [Gemma](https://huggingface.co/google/gemma-2-2b)
299
+ - Question answering with [Mixtral](https://huggingface.co/mistralai/Mixtral-8x7B-v0.1)
300
+ - Summarization with [BART](https://huggingface.co/facebook/bart-large-cnn)
301
+ - Translation with [T5](https://huggingface.co/google-t5/t5-base)
302
+ - Text generation with [Llama](https://huggingface.co/meta-llama/Llama-3.2-1B)
303
+ - Text classification with [Qwen](https://huggingface.co/Qwen/Qwen2.5-0.5B)
304
+
305
+ </details>
306
+
307
+ ## Citation
308
+
309
+ We now have a [paper](https://www.aclweb.org/anthology/2020.emnlp-demos.6/) you can cite for the 🤗 Transformers library:
310
+ ```bibtex
311
+ @inproceedings{wolf-etal-2020-transformers,
312
+ title = "Transformers: State-of-the-Art Natural Language Processing",
313
+ author = "Thomas Wolf and Lysandre Debut and Victor Sanh and Julien Chaumond and Clement Delangue and Anthony Moi and Pierric Cistac and Tim Rault and Rémi Louf and Morgan Funtowicz and Joe Davison and Sam Shleifer and Patrick von Platen and Clara Ma and Yacine Jernite and Julien Plu and Canwen Xu and Teven Le Scao and Sylvain Gugger and Mariama Drame and Quentin Lhoest and Alexander M. Rush",
314
+ booktitle = "Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing: System Demonstrations",
315
+ month = oct,
316
+ year = "2020",
317
+ address = "Online",
318
+ publisher = "Association for Computational Linguistics",
319
+ url = "https://www.aclweb.org/anthology/2020.emnlp-demos.6",
320
+ pages = "38--45"
321
+ }
322
+ ```
docs/transformers/benchmark/README.md ADDED
@@ -0,0 +1,49 @@
1
+ # Benchmarks
2
+
3
+ You might want to add new benchmarks.
4
+
5
+ You will need to define a Python function named `run_benchmark` in your Python file, and the file must be located in this `benchmark/` directory.
6
+
7
+ The expected function signature is the following:
8
+
9
+ ```py
10
+ def run_benchmark(logger: Logger, branch: str, commit_id: str, commit_msg: str, num_tokens_to_generate=100):
11
+ ```
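+
+ For illustration, a minimal skeleton that satisfies this contract might look like the sketch below. The body is hypothetical and only logs the run context; a real benchmark would load a model, generate `num_tokens_to_generate` tokens, and record measurements as described in the next section.
+
+ ```py
+ from logging import Logger
+
+
+ def run_benchmark(logger: Logger, branch: str, commit_id: str, commit_msg: str, num_tokens_to_generate=100):
+     # Hypothetical minimal body: just log the run context.
+     # The entrypoint discovers this function in every .py file in benchmark/ and calls it.
+     logger.info(f"running benchmark on {branch}@{commit_id}: {commit_msg}")
+     # ... load a model, generate `num_tokens_to_generate` tokens, and record metrics here ...
+ ```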
12
+
13
+ ## Writing metrics to the database
14
+
15
+ `MetricsRecorder` is thread-safe with respect to Python [`Thread`](https://docs.python.org/3/library/threading.html#threading.Thread)s. This means you can take the device readings in a background thread without blocking the main thread that executes the model measurements.
16
+
17
+ See [`llama.py`](./llama.py) for an example of this in practice.
18
+
19
+ ```py
20
+ from benchmarks_entrypoint import MetricsRecorder
21
+ import psycopg2
22
+
23
+ def run_benchmark(logger: Logger, branch: str, commit_id: str, commit_msg: str, num_tokens_to_generate=100):
24
+ metrics_recorder = MetricsRecorder(psycopg2.connect("dbname=metrics"), logger, branch, commit_id, commit_msg)
25
+ benchmark_id = metrics_recorder.initialise_benchmark({"gpu_name": gpu_name, "model_id": model_id})
26
+ # To collect device measurements
27
+ metrics_recorder.collect_device_measurements(
28
+ benchmark_id, cpu_util, mem_megabytes, gpu_util, gpu_mem_megabytes
29
+ )
30
+ # To collect your model measurements
31
+ metrics_recorder.collect_model_measurements(
32
+ benchmark_id,
33
+ {
34
+ "model_load_time": model_load_time,
35
+ "first_eager_forward_pass_time_secs": first_eager_fwd_pass_time,
36
+ "second_eager_forward_pass_time_secs": second_eager_fwd_pass_time,
37
+ "first_eager_generate_time_secs": first_eager_generate_time,
38
+ "second_eager_generate_time_secs": second_eager_generate_time,
39
+ "time_to_first_token_secs": time_to_first_token,
40
+ "time_to_second_token_secs": time_to_second_token,
41
+ "time_to_third_token_secs": time_to_third_token,
42
+ "time_to_next_token_mean_secs": mean_time_to_next_token,
43
+ "first_compile_generate_time_secs": first_compile_generate_time,
44
+ "second_compile_generate_time_secs": second_compile_generate_time,
45
+ "third_compile_generate_time_secs": third_compile_generate_time,
46
+ "fourth_compile_generate_time_secs": fourth_compile_generate_time,
47
+ },
48
+ )
49
+ ```
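+
+ Below is a minimal sketch of that background-thread pattern, assuming `metrics_recorder` and `benchmark_id` were created as in the snippet above, and using placeholder utilisation values instead of real `psutil`/GPU readings:
+
+ ```py
+ import threading
+ import time
+
+
+ def monitor_device(metrics_recorder, benchmark_id, stop_event):
+     # Poll device utilisation until the main thread signals stop.
+     while not stop_event.is_set():
+         # Placeholder values; a real benchmark would read CPU/GPU utilisation here.
+         metrics_recorder.collect_device_measurements(benchmark_id, 10.0, 1024, 50.0, 2048)
+         time.sleep(0.01)
+
+
+ stop_event = threading.Event()
+ monitor = threading.Thread(target=monitor_device, args=(metrics_recorder, benchmark_id, stop_event))
+ monitor.start()
+ # ... run the model measurements on the main thread here ...
+ stop_event.set()
+ monitor.join()
+ ```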
docs/transformers/benchmark/__init__.py ADDED
File without changes
docs/transformers/benchmark/benchmark.py ADDED
@@ -0,0 +1,326 @@
1
+ # Copyright 2024 The HuggingFace Team. All rights reserved.
2
+ #
3
+ # Licensed under the Apache License, Version 2.0 (the "License");
4
+ # you may not use this file except in compliance with the License.
5
+ # You may obtain a copy of the License at
6
+ #
7
+ # http://www.apache.org/licenses/LICENSE-2.0
8
+ #
9
+ # Unless required by applicable law or agreed to in writing, software
10
+ # distributed under the License is distributed on an "AS IS" BASIS,
11
+ # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
12
+ # See the License for the specific language governing permissions and
13
+ # limitations under the License.
14
+
15
+ """
16
+ Run benchmark using the `optimum-benchmark` library with some customization in `transformers`.
17
+
18
+ Assume we are under `transformers` root directory: (make sure the commits are valid commits)
19
+ ```bash
20
+ python benchmark/benchmark.py --config-dir benchmark/config --config-name generation --commit=9b9c7f03da625b13643e99205c691fe046461724 --metrics=decode.latency.mean,per_token.latency.mean,per_token.throughput.value backend.model=google/gemma-2b benchmark.input_shapes.sequence_length=5,7 benchmark.input_shapes.batch_size=1,2 --multirun
21
+ ```
22
+ """
23
+
24
+ import argparse
25
+ import glob
26
+ import json
27
+ import os.path
28
+ import re
29
+ import tempfile
30
+ from contextlib import contextmanager
31
+ from pathlib import Path
32
+
33
+ from git import Repo
34
+
35
+ from huggingface_hub import HfApi
36
+
37
+ from optimum_benchmark import Benchmark
38
+ from optimum_benchmark_wrapper import main
39
+
40
+
41
+ PATH_TO_REPO = Path(__file__).parent.parent.resolve()
42
+
43
+
44
+ @contextmanager
45
+ def checkout_commit(repo: Repo, commit_id: str):
46
+ """
47
+ Context manager that checks out a given commit when entered, but gets back to the reference it was at on exit.
48
+ Args:
49
+ repo (`git.Repo`): A git repository (for instance the Transformers repo).
50
+ commit_id (`str`): The commit reference to checkout inside the context manager.
51
+ """
52
+ current_head = repo.head.commit if repo.head.is_detached else repo.head.ref
53
+
54
+ try:
55
+ repo.git.checkout(commit_id)
56
+ yield
57
+
58
+ finally:
59
+ repo.git.checkout(current_head)
60
+
61
+
62
+ def summarize(run_dir, metrics, expand_metrics=False):
63
+ """Produce a summary for each optimum-benchmark launched job's output directory found in `run_dir`.
64
+
65
+ Each summary's format is as follows (for `expand_metrics=False`):
66
+ ```
67
+ {
68
+ "model": "google/gemma-2b",
69
+ "commit": "3cd6ed22e4d49219f300f5055e71e3929aba20d7",
70
+ "config": "benchmark.input_shapes.batch_size=1,benchmark.input_shapes.sequence_length=5",
71
+ "metrics": {
72
+ "decode.latency.mean": 1.624666809082031,
73
+ "per_token.latency.mean": 0.012843788806628804,
74
+ "per_token.throughput.value": 77.85864553330948
75
+ }
76
+ }
77
+ ```
78
+ """
79
+ reports = glob.glob(os.path.join(run_dir, "**/benchmark_report.json"), recursive=True)
80
+ report_dirs = [str(Path(report).parent) for report in reports]
81
+
82
+ summaries = []
83
+ for report_dir in report_dirs:
84
+ commit = re.search(r"/commit=([^/]+)", report_dir).groups()[0]
85
+
86
+ if not os.path.isfile(os.path.join(report_dir, "benchmark.json")):
87
+ continue
88
+ benchmark = Benchmark.from_json(os.path.join(report_dir, "benchmark.json"))
89
+ report = benchmark.report
90
+
91
+ model = benchmark.config.backend["model"]
92
+
93
+ # This looks like `benchmark.input_shapes.batch_size=1,benchmark.input_shapes.sequence_length=5`.
94
+ # (we rely on the usage of hydra's `${hydra.job.override_dirname}`.)
95
+ benchmark_name = re.sub(f"backend.model={model},*", "", report_dir)
96
+ benchmark_name = str(Path(benchmark_name).parts[-1])
97
+ if benchmark_name.startswith("commit="):
98
+ benchmark_name = benchmark.config.name
99
+
100
+ metrics_values = {}
101
+ # post-processing of report: show a few selected/important metrics
102
+ for metric in metrics:
103
+ keys = metric.split(".")
104
+ value = report.to_dict()
105
+ current = metrics_values
106
+ for key in keys:
107
+ # Avoid KeyError when a user's specified metric has a typo.
108
+ # TODO: Give warnings.
109
+ if key not in value:
110
+ continue
111
+ value = value[key]
112
+
113
+ if expand_metrics:
114
+ if isinstance(value, dict):
115
+ if key not in current:
116
+ current[key] = {}
117
+ current = current[key]
118
+ else:
119
+ current[key] = value
120
+
121
+ if not expand_metrics:
122
+ metrics_values[metric] = value
123
+
124
+ # show some config information
125
+ print(f"model: {model}")
126
+ print(f"commit: {commit}")
127
+ print(f"config: {benchmark_name}")
128
+ if len(metrics_values) > 0:
129
+ print("metrics:")
130
+ if expand_metrics:
131
+ print(metrics_values)
132
+ else:
133
+ for metric, value in metrics_values.items():
134
+ print(f" - {metric}: {value}")
135
+ print("-" * 80)
136
+
137
+ summary = {
138
+ "model": model,
139
+ "commit": commit,
140
+ "config": benchmark_name,
141
+ "metrics": metrics_values,
142
+ }
143
+ summaries.append(summary)
144
+
145
+ with open(os.path.join(report_dir, "summary.json"), "w") as fp:
146
+ json.dump(summary, fp, indent=4)
147
+
148
+ return summaries
149
+
150
+
151
+ def combine_summaries(summaries):
152
+ """Combine a list of summary obtained from the function `summarize`.
153
+
154
+ The combined summary's format is as follows:
155
+ ```
156
+ "google/gemma-2b": {
157
+ "benchmark.input_shapes.batch_size=1,benchmark.input_shapes.sequence_length=5": {
158
+ "3cd6ed22e4d49219f300f5055e71e3929aba20d7": {
159
+ "metrics": {"decode.latency.mean": 1.624666809082031}
160
+ },
161
+ "c97ee28b117c0abe8e08891f402065e4df6d72aa": {
162
+ "metrics": {"decode.latency.mean": 1.6278163452148438}
163
+ }
164
+ },
165
+ "benchmark.input_shapes.batch_size=2,benchmark.input_shapes.sequence_length=5": {
166
+ "3cd6ed22e4d49219f300f5055e71e3929aba20d7": {
167
+ "metrics": {"decode.latency.mean": 1.6947791748046876}
168
+ },
169
+ "c97ee28b117c0abe8e08891f402065e4df6d72aa": {
170
+ "metrics": {
171
+ "decode.latency.mean": 1.6980519409179688}
172
+ }
173
+ }
174
+ }
175
+ ```
176
+ """
177
+ combined = {}
178
+ for summary in summaries:
179
+ model = summary["model"]
180
+ config = summary["config"]
181
+ commit = summary["commit"]
182
+
183
+ if model not in combined:
184
+ combined[model] = {}
185
+
186
+ if config not in combined[model]:
187
+ combined[model][config] = {}
188
+
189
+ if commit not in combined[model][config]:
190
+ combined[model][config][commit] = {"metrics": summary["metrics"]}
191
+
192
+ with open(os.path.join(exp_run_dir, "summary.json"), "w") as fp:
193
+ json.dump(combined, fp, indent=4)
194
+
195
+ print(json.dumps(combined, indent=4))
196
+
197
+ return combined
198
+
199
+
200
+ if __name__ == "__main__":
201
+
202
+ def list_str(values):
203
+ return values.split(",")
204
+
205
+ parser = argparse.ArgumentParser()
206
+
207
+ parser.add_argument("--config-dir", type=str, required=True, help="The path to the config directory.")
208
+ parser.add_argument("--config-name", type=str, required=True, help="The config name.")
209
+
210
+ # arguments specific to this wrapper for our own customization
211
+ parser.add_argument("--ensure_empty", type=bool, default=True, help="If to create a temporary directory.")
212
+ parser.add_argument(
213
+ "--commit",
214
+ type=list_str,
215
+ default="",
216
+ help="Comma-separated list of branch names and/or commit sha values on which the benchmark will run. If `diff` is specified, it will run on both the current head and the `main` branch.",
217
+ )
218
+ parser.add_argument("--metrics", type=str, help="The metrics to be included in the summary.")
219
+
220
+ parser.add_argument("--repo_id", type=str, default=None, help="The repository to which the file will be uploaded.")
221
+ parser.add_argument("--path_in_repo", type=str, default=None, help="Relative filepath in the repo.")
222
+ parser.add_argument("--token", type=str, default=None, help="A valid user access token (string).")
223
+
224
+ args, optimum_benchmark_args = parser.parse_known_args()
225
+
226
+ repo = Repo(PATH_TO_REPO)
227
+
228
+ metrics = [
229
+ "prefill.latency.mean",
230
+ "prefill.throughput.value",
231
+ "decode.latency.mean",
232
+ "decode.throughput.value",
233
+ "per_token.latency.mean",
234
+ "per_token.throughput.value",
235
+ ]
236
+ if args.metrics is not None:
237
+ metrics = args.metrics.split(",")
238
+
239
+ # Get `backend.model` in a hacky way: We want to control the experiment flow manually.
240
+ models = [""]
241
+ for idx, arg in enumerate(optimum_benchmark_args):
242
+ if arg.startswith("backend.model="):
243
+ models = arg[len("backend.model=") :]
244
+ models = models.split(",")
245
+ break
246
+ optimum_benchmark_args = [arg for arg in optimum_benchmark_args if not arg.startswith("backend.model=")]
247
+
248
+ # Get the commit(s)
249
+ current_head = str(repo.head.commit) if repo.head.is_detached else str(repo.head.ref)
250
+ commits = [x for x in args.commit if x != ""]
251
+ if len(commits) == 0:
252
+ commits = [current_head]
253
+ elif len(commits) == 1 and commits[0] == "diff":
254
+ # compare to `main`
255
+ commits = ["main", current_head]
256
+
257
+ # Get the specified run directory
258
+ run_dir_arg_idx, run_dir = -1, None
259
+ sweep_dir_arg_idx, sweep_dir = -1, None
260
+ for idx, arg in enumerate(optimum_benchmark_args):
261
+ if arg.startswith("hydra.run.dir="):
262
+ run_dir = arg[len("hydra.run.dir=") :]
263
+ run_dir_arg_idx = idx
264
+ elif arg.startswith("hydra.sweep.dir="):
265
+ sweep_dir = arg[len("hydra.sweep.dir=") :]
266
+ sweep_dir_arg_idx = idx
267
+ exp_run_dir, arg_dix, arg_name = (
268
+ (sweep_dir, sweep_dir_arg_idx, "hydra.sweep.dir")
269
+ if "--multirun" in optimum_benchmark_args
270
+ else (run_dir, run_dir_arg_idx, "hydra.run.dir")
271
+ )
272
+
273
+ # TODO: not hardcoded
274
+ if exp_run_dir is None and args.ensure_empty:
275
+ exp_run_dir = "_benchmark"
276
+
277
+ if args.ensure_empty:
278
+ os.makedirs(exp_run_dir, exist_ok=True)
279
+ exp_run_dir = tempfile.mkdtemp(dir=exp_run_dir)
280
+
281
+ run_summaries = []
282
+ for commit in commits:
283
+ with checkout_commit(repo, commit):
284
+ commit = str(repo.head.commit)
285
+
286
+ commit_run_dir = exp_run_dir
287
+ if exp_run_dir is not None:
288
+ commit_run_dir = os.path.join(exp_run_dir, rf"commit\={commit}")
289
+
290
+ print(f"Run benchmark on commit: {commit}")
291
+
292
+ for model in models:
293
+ model_arg = [f"backend.model={model}"] if model != "" else []
294
+ dir_args = []
295
+ if commit_run_dir is not None:
296
+ if arg_dix > -1:
297
+ optimum_benchmark_args[arg_dix] = f"{arg_name}={commit_run_dir}"
298
+ else:
299
+ dir_args = [
300
+ f"hydra.sweep.dir={commit_run_dir}",
301
+ f"hydra.run.dir={commit_run_dir}/" + "${hydra.job.override_dirname}",
302
+ ]
303
+ main(args.config_dir, args.config_name, model_arg + dir_args + optimum_benchmark_args)
304
+
305
+ if commit_run_dir is not None:
306
+ # Need to remove the `\` character
307
+ summaries = summarize(commit_run_dir.replace("\\", ""), metrics)
308
+ run_summaries.extend(summaries)
309
+
310
+ # aggregate the information across the commits
311
+ if exp_run_dir is not None:
312
+ with open(os.path.join(exp_run_dir, "summaries.json"), "w") as fp:
313
+ json.dump(run_summaries, fp, indent=4)
314
+
315
+ combined_summary = combine_summaries(run_summaries)
316
+
317
+ if args.repo_id is not None and args.path_in_repo is not None:
318
+ # Upload to Hub
319
+ api = HfApi()
320
+ api.upload_folder(
321
+ folder_path=exp_run_dir,
322
+ path_in_repo=args.path_in_repo,
323
+ repo_id=args.repo_id,
324
+ repo_type="dataset",
325
+ token=args.token,
326
+ )
docs/transformers/benchmark/benchmarks_entrypoint.py ADDED
@@ -0,0 +1,143 @@
1
+ import argparse
2
+ import importlib.util
+ import logging
+ import os
+ from typing import Dict
+ import sys
+
+ from psycopg2.extras import Json
+ from psycopg2.extensions import register_adapter
+
+
+ register_adapter(dict, Json)
+
+
+ class ImportModuleException(Exception):
+     pass
+
+
+ class MetricsRecorder:
+     def __init__(self, connection, logger: logging.Logger, branch: str, commit_id: str, commit_msg: str):
+         self.conn = connection
+         self.conn.autocommit = True
+         self.logger = logger
+         self.branch = branch
+         self.commit_id = commit_id
+         self.commit_msg = commit_msg
+
+     def initialise_benchmark(self, metadata: Dict[str, str]) -> int:
+         """
+         Creates a new benchmark and returns its benchmark id.
+         """
+         # metadata is expected to carry keys such as gpu_name and model_id
+         with self.conn.cursor() as cur:
+             cur.execute(
+                 "INSERT INTO benchmarks (branch, commit_id, commit_message, metadata) VALUES (%s, %s, %s, %s) RETURNING benchmark_id",
+                 (self.branch, self.commit_id, self.commit_msg, metadata),
+             )
+             benchmark_id = cur.fetchone()[0]
+             self.logger.debug(f"initialised benchmark #{benchmark_id}")
+             return benchmark_id
+
+     def collect_device_measurements(self, benchmark_id: int, cpu_util, mem_megabytes, gpu_util, gpu_mem_megabytes):
+         """
+         Collect device metrics such as CPU and GPU usage. These are "static": the function takes a fixed set of arguments.
+         """
+         with self.conn.cursor() as cur:
+             cur.execute(
+                 "INSERT INTO device_measurements (benchmark_id, cpu_util, mem_megabytes, gpu_util, gpu_mem_megabytes) VALUES (%s, %s, %s, %s, %s)",
+                 (benchmark_id, cpu_util, mem_megabytes, gpu_util, gpu_mem_megabytes),
+             )
+         self.logger.debug(
+             f"inserted device measurements for benchmark #{benchmark_id} [CPU util: {cpu_util}, mem MBs: {mem_megabytes}, GPU util: {gpu_util}, GPU mem MBs: {gpu_mem_megabytes}]"
+         )
+
+     def collect_model_measurements(self, benchmark_id: int, measurements: Dict[str, float]):
+         with self.conn.cursor() as cur:
+             cur.execute(
+                 """
+                 INSERT INTO model_measurements (
+                     benchmark_id,
+                     measurements
+                 ) VALUES (%s, %s)
+                 """,
+                 (
+                     benchmark_id,
+                     measurements,
+                 ),
+             )
+         self.logger.debug(f"inserted model measurements for benchmark #{benchmark_id}: {measurements}")
+
+     def close(self):
+         self.conn.close()
+
+
+ logger = logging.getLogger(__name__)
+ logger.setLevel(logging.INFO)
+
+ handler = logging.StreamHandler(sys.stdout)
+ handler.setLevel(logging.INFO)
+ formatter = logging.Formatter("[%(levelname)s - %(asctime)s] %(message)s")
+ handler.setFormatter(formatter)
+ logger.addHandler(handler)
+
+
+ def parse_arguments():
+     """
+     Parse command line arguments for the benchmarking CLI.
+     """
+     parser = argparse.ArgumentParser(description="CLI for benchmarking the huggingface/transformers repository.")
+
+     parser.add_argument(
+         "branch",
+         type=str,
+         help="The branch name on which the benchmarking is performed.",
+     )
+
+     parser.add_argument(
+         "commit_id",
+         type=str,
+         help="The commit hash on which the benchmarking is performed.",
+     )
+
+     parser.add_argument(
+         "commit_msg",
+         type=str,
+         help="The commit message associated with the commit, truncated to 70 characters.",
+     )
+
+     args = parser.parse_args()
+
+     return args.branch, args.commit_id, args.commit_msg
+
+
+ def import_from_path(module_name, file_path):
+     try:
+         spec = importlib.util.spec_from_file_location(module_name, file_path)
+         module = importlib.util.module_from_spec(spec)
+         sys.modules[module_name] = module
+         spec.loader.exec_module(module)
+         return module
+     except Exception as e:
+         raise ImportModuleException(f"failed to load python module: {e}")
+
+
+ if __name__ == "__main__":
+     benchmarks_folder_path = os.path.dirname(os.path.realpath(__file__))
+
+     branch, commit_id, commit_msg = parse_arguments()
+
+     for entry in os.scandir(benchmarks_folder_path):
+         try:
+             if not entry.name.endswith(".py"):
+                 continue
+             if entry.path == __file__:
+                 continue
+             logger.debug(f"loading: {entry.name}")
+             module = import_from_path(entry.name.split(".")[0], entry.path)
+             logger.info(f"running benchmarks in: {entry.name}")
+             module.run_benchmark(logger, branch, commit_id, commit_msg)
+         except ImportModuleException as e:
+             logger.error(e)
+         except Exception as e:
+             logger.error(f"error running benchmarks for {entry.name}: {e}")
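Note: the entrypoint above only discovers sibling `*.py` files and calls their `run_benchmark(logger, branch, commit_id, commit_msg)` function; each module decides what to measure and records it through `MetricsRecorder`. A minimal sketch of such a module follows. The psycopg2 connection string and the numeric values are illustrative assumptions; the measurement key mirrors the ones queried by the Grafana dashboard further below.

import logging

import psycopg2

from benchmarks_entrypoint import MetricsRecorder


def run_benchmark(logger: logging.Logger, branch: str, commit_id: str, commit_msg: str):
    # Hypothetical DSN; point it at the postgres instance initialised by init_db.sql.
    conn = psycopg2.connect("dbname=metrics user=transformers")
    recorder = MetricsRecorder(conn, logger, branch, commit_id, commit_msg)
    try:
        benchmark_id = recorder.initialise_benchmark(
            {"gpu_name": "NVIDIA A10G", "model_id": "meta-llama/Llama-2-7b-hf"}
        )
        # ... run the actual workload here, sampling utilization periodically ...
        recorder.collect_device_measurements(benchmark_id, 12.5, 2048, 87.0, 15000)
        # Model-level timings are stored as a single JSON blob per benchmark run.
        recorder.collect_model_measurements(
            benchmark_id, {"first_eager_forward_pass_time_secs": 0.42}
        )
    finally:
        recorder.close()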
docs/transformers/benchmark/config/generation.yaml ADDED
@@ -0,0 +1,57 @@
+ defaults:
+   - benchmark # inheriting benchmark schema
+   - scenario: inference
+   - launcher: process
+   - backend: pytorch
+   - _self_ # for hydra 1.1 compatibility
+
+ name: pytorch_generate
+
+ launcher:
+   start_method: spawn
+   device_isolation: true
+   device_isolation_action: warn
+
+ backend:
+   device: cuda
+   device_ids: 0
+   no_weights: true
+   model: meta-llama/Llama-2-7b-hf
+   cache_implementation: static
+   torch_compile: true
+   torch_dtype: float16
+   torch_compile_config:
+     backend: inductor
+     mode: reduce-overhead
+     fullgraph: true
+
+ scenario:
+   input_shapes:
+     batch_size: 1
+     sequence_length: 7
+   generate_kwargs:
+     max_new_tokens: 128
+     min_new_tokens: 128
+     do_sample: false
+   memory: true
+   latency: true
+   iterations: 2
+   duration: 0
+
+
+ # hydra/cli specific settings
+ hydra:
+   run:
+     # where to store run results
+     dir: runs/${name}
+   job:
+     # change working directory to the run directory
+     chdir: true
+     env_set:
+       # set environment variable OVERRIDE_BENCHMARKS to 1
+       # to not skip benchmarks that have been run before
+       OVERRIDE_BENCHMARKS: 1
+       LOG_LEVEL: WARN
+   sweep:
+     dir: multirun
+     subdir: ${hydra.job.override_dirname}
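Note: this is a Hydra-composed benchmark configuration: the `defaults` list inherits the benchmark, scenario, launcher and backend schemas, and `_self_` then applies the overrides defined in this file. As a rough sanity check, the raw file (without the composed defaults, which only Hydra resolves) can be inspected with OmegaConf; a sketch, assuming `omegaconf` is installed and the file is saved locally as `generation.yaml`:

from omegaconf import OmegaConf

cfg = OmegaConf.load("generation.yaml")
print(cfg.name)                                      # pytorch_generate
print(cfg.backend.model)                             # meta-llama/Llama-2-7b-hf
print(cfg.scenario.generate_kwargs.max_new_tokens)   # 128
# Interpolations defined in the same file resolve on access:
print(cfg.hydra.run.dir)                             # runs/pytorch_generate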
docs/transformers/benchmark/default.yml ADDED
@@ -0,0 +1,10 @@
+ apiVersion: 1
+
+ providers:
+   - name: 'Transformers Benchmarks'
+     orgId: 1
+     type: file
+     updateIntervalSeconds: 10
+     allowUiUpdates: true
+     options:
+       path: /etc/grafana/dashboards
docs/transformers/benchmark/grafana_dashboard.json ADDED
@@ -0,0 +1,2375 @@
1
+ {
2
+ "annotations": {
3
+ "list": [
4
+ {
5
+ "builtIn": 1,
6
+ "datasource": {
7
+ "type": "grafana",
8
+ "uid": "-- Grafana --"
9
+ },
10
+ "enable": true,
11
+ "hide": true,
12
+ "iconColor": "rgba(0, 211, 255, 1)",
13
+ "name": "Annotations & Alerts",
14
+ "type": "dashboard"
15
+ }
16
+ ]
17
+ },
18
+ "editable": true,
19
+ "fiscalYearStartMonth": 0,
20
+ "graphTooltip": 0,
21
+ "id": 1,
22
+ "links": [
23
+ {
24
+ "asDropdown": false,
25
+ "icon": "external link",
26
+ "includeVars": false,
27
+ "keepTime": false,
28
+ "tags": [],
29
+ "targetBlank": false,
30
+ "title": "Go to data",
31
+ "tooltip": "Go to data",
32
+ "type": "link",
33
+ "url": "http://transformers-benchmarks.hf.co/d/fdz33iyzln9c0a/transformers-benchmarks?orgId=1&from=${StartTime}&to=${EndTime}"
34
+ }
35
+ ],
36
+ "liveNow": true,
37
+ "panels": [
38
+ {
39
+ "datasource": {
40
+ "default": true,
41
+ "type": "grafana-postgresql-datasource",
42
+ "uid": "be28nkzirtb0gd"
43
+ },
44
+ "fieldConfig": {
45
+ "defaults": {
46
+ "color": {
47
+ "mode": "thresholds"
48
+ },
49
+ "custom": {
50
+ "align": "left",
51
+ "cellOptions": {
52
+ "type": "auto"
53
+ },
54
+ "inspect": false
55
+ },
56
+ "mappings": [],
57
+ "thresholds": {
58
+ "mode": "absolute",
59
+ "steps": [
60
+ {
61
+ "color": "green",
62
+ "value": null
63
+ },
64
+ {
65
+ "color": "red",
66
+ "value": 80
67
+ }
68
+ ]
69
+ }
70
+ },
71
+ "overrides": [
72
+ {
73
+ "matcher": {
74
+ "id": "byName",
75
+ "options": "gpu_name"
76
+ },
77
+ "properties": [
78
+ {
79
+ "id": "custom.width",
80
+ "value": 202
81
+ }
82
+ ]
83
+ },
84
+ {
85
+ "matcher": {
86
+ "id": "byName",
87
+ "options": "left"
88
+ },
89
+ "properties": [
90
+ {
91
+ "id": "custom.width",
92
+ "value": 407
93
+ }
94
+ ]
95
+ },
96
+ {
97
+ "matcher": {
98
+ "id": "byName",
99
+ "options": "commit_message"
100
+ },
101
+ "properties": [
102
+ {
103
+ "id": "custom.width",
104
+ "value": 524
105
+ }
106
+ ]
107
+ },
108
+ {
109
+ "matcher": {
110
+ "id": "byName",
111
+ "options": "commit_id"
112
+ },
113
+ "properties": [
114
+ {
115
+ "id": "custom.width",
116
+ "value": 353
117
+ }
118
+ ]
119
+ },
120
+ {
121
+ "matcher": {
122
+ "id": "byName",
123
+ "options": "model_id"
124
+ },
125
+ "properties": [
126
+ {
127
+ "id": "custom.width",
128
+ "value": 216
129
+ }
130
+ ]
131
+ }
132
+ ]
133
+ },
134
+ "gridPos": {
135
+ "h": 6,
136
+ "w": 24,
137
+ "x": 0,
138
+ "y": 0
139
+ },
140
+ "id": 5,
141
+ "options": {
142
+ "cellHeight": "sm",
143
+ "footer": {
144
+ "countRows": false,
145
+ "fields": "",
146
+ "reducer": [
147
+ "sum"
148
+ ],
149
+ "show": false
150
+ },
151
+ "showHeader": true,
152
+ "sortBy": []
153
+ },
154
+ "pluginVersion": "11.2.2",
155
+ "targets": [
156
+ {
157
+ "datasource": {
158
+ "default": true,
159
+ "type": "grafana-postgresql-datasource",
160
+ "uid": "be28nkzirtb0gd"
161
+ },
162
+ "editorMode": "code",
163
+ "format": "table",
164
+ "rawQuery": true,
165
+ "rawSql": "SELECT commit_id, commit_message, metadata->>'gpu_name' as gpu_name, metadata->>'model_id' as model_id, created_at AS date FROM benchmarks WHERE branch = '${branch}' AND metadata->>'gpu_name' = '${gpu_name}' ORDER BY benchmark_id DESC LIMIT ${last_n_commits};",
166
+ "refId": "A",
167
+ "sql": {
168
+ "columns": [
169
+ {
170
+ "parameters": [
171
+ {
172
+ "name": "commit_id",
173
+ "type": "functionParameter"
174
+ }
175
+ ],
176
+ "type": "function"
177
+ },
178
+ {
179
+ "parameters": [
180
+ {
181
+ "name": "gpu_name",
182
+ "type": "functionParameter"
183
+ }
184
+ ],
185
+ "type": "function"
186
+ }
187
+ ],
188
+ "groupBy": [
189
+ {
190
+ "property": {
191
+ "type": "string"
192
+ },
193
+ "type": "groupBy"
194
+ }
195
+ ],
196
+ "limit": 50,
197
+ "whereJsonTree": {
198
+ "children1": [
199
+ {
200
+ "id": "baaa8aaa-89ab-4cde-b012-31922f96de3f",
201
+ "properties": {
202
+ "field": "commit_id",
203
+ "fieldSrc": "field",
204
+ "operator": "equal",
205
+ "value": [
206
+ "${commit}"
207
+ ],
208
+ "valueError": [
209
+ null
210
+ ],
211
+ "valueSrc": [
212
+ "value"
213
+ ],
214
+ "valueType": [
215
+ "text"
216
+ ]
217
+ },
218
+ "type": "rule"
219
+ }
220
+ ],
221
+ "id": "bab88a98-0123-4456-b89a-b1922f7d4f11",
222
+ "type": "group"
223
+ },
224
+ "whereString": "commit_id = '${commit}'"
225
+ },
226
+ "table": "benchmarks"
227
+ }
228
+ ],
229
+ "transparent": true,
230
+ "type": "table"
231
+ },
232
+ {
233
+ "collapsed": false,
234
+ "gridPos": {
235
+ "h": 1,
236
+ "w": 24,
237
+ "x": 0,
238
+ "y": 6
239
+ },
240
+ "id": 13,
241
+ "panels": [],
242
+ "title": "Eager Forward Pass",
243
+ "type": "row"
244
+ },
245
+ {
246
+ "datasource": {
247
+ "default": true,
248
+ "type": "grafana-postgresql-datasource",
249
+ "uid": "be28nkzirtb0gd"
250
+ },
251
+ "fieldConfig": {
252
+ "defaults": {
253
+ "color": {
254
+ "mode": "continuous-YlBl"
255
+ },
256
+ "custom": {
257
+ "axisBorderShow": false,
258
+ "axisCenteredZero": false,
259
+ "axisColorMode": "text",
260
+ "axisLabel": "",
261
+ "axisPlacement": "auto",
262
+ "fillOpacity": 80,
263
+ "gradientMode": "scheme",
264
+ "hideFrom": {
265
+ "legend": false,
266
+ "tooltip": false,
267
+ "viz": false
268
+ },
269
+ "lineWidth": 0,
270
+ "scaleDistribution": {
271
+ "type": "linear"
272
+ },
273
+ "thresholdsStyle": {
274
+ "mode": "off"
275
+ }
276
+ },
277
+ "mappings": [],
278
+ "thresholds": {
279
+ "mode": "absolute",
280
+ "steps": [
281
+ {
282
+ "color": "green",
283
+ "value": null
284
+ }
285
+ ]
286
+ },
287
+ "unit": "s"
288
+ },
289
+ "overrides": []
290
+ },
291
+ "gridPos": {
292
+ "h": 11,
293
+ "w": 12,
294
+ "x": 0,
295
+ "y": 7
296
+ },
297
+ "id": 7,
298
+ "options": {
299
+ "barRadius": 0.05,
300
+ "barWidth": 0.8,
301
+ "fullHighlight": false,
302
+ "groupWidth": 0.7,
303
+ "legend": {
304
+ "calcs": [],
305
+ "displayMode": "list",
306
+ "placement": "bottom",
307
+ "showLegend": false
308
+ },
309
+ "orientation": "auto",
310
+ "showValue": "auto",
311
+ "stacking": "none",
312
+ "tooltip": {
313
+ "mode": "single",
314
+ "sort": "none"
315
+ },
316
+ "xTickLabelRotation": 0,
317
+ "xTickLabelSpacing": 0
318
+ },
319
+ "pluginVersion": "11.2.2",
320
+ "targets": [
321
+ {
322
+ "datasource": {
323
+ "default": true,
324
+ "type": "grafana-postgresql-datasource",
325
+ "uid": "be28nkzirtb0gd"
326
+ },
327
+ "editorMode": "code",
328
+ "format": "table",
329
+ "rawQuery": true,
330
+ "rawSql": "SELECT CAST(m.measurements->'first_eager_forward_pass_time_secs' AS double precision) AS first_eager_forward_pass_time_secs, left(b.commit_id, 7), m.time FROM benchmarks as b JOIN model_measurements AS m ON b.benchmark_id = m.benchmark_id WHERE b.branch = '${branch}' AND b.metadata->>'gpu_name' = '${gpu_name}' ORDER BY b.benchmark_id DESC LIMIT ${last_n_commits};",
331
+ "refId": "A",
332
+ "sql": {
333
+ "columns": [
334
+ {
335
+ "parameters": [],
336
+ "type": "function"
337
+ }
338
+ ],
339
+ "groupBy": [
340
+ {
341
+ "property": {
342
+ "type": "string"
343
+ },
344
+ "type": "groupBy"
345
+ }
346
+ ],
347
+ "limit": 50
348
+ }
349
+ }
350
+ ],
351
+ "title": "First eager forward pass",
352
+ "transformations": [
353
+ {
354
+ "id": "sortBy",
355
+ "options": {
356
+ "fields": {},
357
+ "sort": [
358
+ {
359
+ "field": "time"
360
+ }
361
+ ]
362
+ }
363
+ }
364
+ ],
365
+ "transparent": true,
366
+ "type": "barchart"
367
+ },
368
+ {
369
+ "datasource": {
370
+ "default": true,
371
+ "type": "grafana-postgresql-datasource",
372
+ "uid": "be28nkzirtb0gd"
373
+ },
374
+ "fieldConfig": {
375
+ "defaults": {
376
+ "color": {
377
+ "mode": "continuous-YlBl"
378
+ },
379
+ "custom": {
380
+ "axisBorderShow": false,
381
+ "axisCenteredZero": false,
382
+ "axisColorMode": "text",
383
+ "axisLabel": "",
384
+ "axisPlacement": "auto",
385
+ "fillOpacity": 80,
386
+ "gradientMode": "scheme",
387
+ "hideFrom": {
388
+ "legend": false,
389
+ "tooltip": false,
390
+ "viz": false
391
+ },
392
+ "lineWidth": 0,
393
+ "scaleDistribution": {
394
+ "type": "linear"
395
+ },
396
+ "thresholdsStyle": {
397
+ "mode": "off"
398
+ }
399
+ },
400
+ "mappings": [],
401
+ "thresholds": {
402
+ "mode": "absolute",
403
+ "steps": [
404
+ {
405
+ "color": "green",
406
+ "value": null
407
+ },
408
+ {
409
+ "color": "red",
410
+ "value": 80
411
+ }
412
+ ]
413
+ },
414
+ "unit": "s"
415
+ },
416
+ "overrides": []
417
+ },
418
+ "gridPos": {
419
+ "h": 11,
420
+ "w": 12,
421
+ "x": 12,
422
+ "y": 7
423
+ },
424
+ "id": 9,
425
+ "options": {
426
+ "barRadius": 0.05,
427
+ "barWidth": 0.8,
428
+ "fullHighlight": false,
429
+ "groupWidth": 0.7,
430
+ "legend": {
431
+ "calcs": [],
432
+ "displayMode": "list",
433
+ "placement": "bottom",
434
+ "showLegend": false
435
+ },
436
+ "orientation": "auto",
437
+ "showValue": "auto",
438
+ "stacking": "none",
439
+ "tooltip": {
440
+ "mode": "single",
441
+ "sort": "none"
442
+ },
443
+ "xTickLabelRotation": 0,
444
+ "xTickLabelSpacing": 0
445
+ },
446
+ "targets": [
447
+ {
448
+ "datasource": {
449
+ "default": true,
450
+ "type": "grafana-postgresql-datasource",
451
+ "uid": "be28nkzirtb0gd"
452
+ },
453
+ "editorMode": "code",
454
+ "format": "table",
455
+ "rawQuery": true,
456
+ "rawSql": "SELECT CAST(m.measurements->'second_eager_forward_pass_time_secs' AS double precision) AS second_eager_forward_pass_time_secs, left(b.commit_id, 7), m.time FROM benchmarks as b JOIN model_measurements AS m ON b.benchmark_id = m.benchmark_id WHERE b.branch = '${branch}' AND b.metadata->>'gpu_name' = '${gpu_name}' ORDER BY b.benchmark_id DESC LIMIT ${last_n_commits};",
457
+ "refId": "A",
458
+ "sql": {
459
+ "columns": [
460
+ {
461
+ "parameters": [],
462
+ "type": "function"
463
+ }
464
+ ],
465
+ "groupBy": [
466
+ {
467
+ "property": {
468
+ "type": "string"
469
+ },
470
+ "type": "groupBy"
471
+ }
472
+ ],
473
+ "limit": 50
474
+ }
475
+ }
476
+ ],
477
+ "title": "Second eager forward pass",
478
+ "transformations": [
479
+ {
480
+ "id": "sortBy",
481
+ "options": {
482
+ "fields": {},
483
+ "sort": [
484
+ {
485
+ "field": "time"
486
+ }
487
+ ]
488
+ }
489
+ }
490
+ ],
491
+ "transparent": true,
492
+ "type": "barchart"
493
+ },
494
+ {
495
+ "collapsed": false,
496
+ "gridPos": {
497
+ "h": 1,
498
+ "w": 24,
499
+ "x": 0,
500
+ "y": 18
501
+ },
502
+ "id": 16,
503
+ "panels": [],
504
+ "title": "Time to next token",
505
+ "type": "row"
506
+ },
507
+ {
508
+ "datasource": {
509
+ "default": true,
510
+ "type": "grafana-postgresql-datasource",
511
+ "uid": "be28nkzirtb0gd"
512
+ },
513
+ "fieldConfig": {
514
+ "defaults": {
515
+ "color": {
516
+ "mode": "continuous-YlBl"
517
+ },
518
+ "custom": {
519
+ "axisBorderShow": false,
520
+ "axisCenteredZero": false,
521
+ "axisColorMode": "text",
522
+ "axisLabel": "",
523
+ "axisPlacement": "auto",
524
+ "fillOpacity": 80,
525
+ "gradientMode": "scheme",
526
+ "hideFrom": {
527
+ "legend": false,
528
+ "tooltip": false,
529
+ "viz": false
530
+ },
531
+ "lineWidth": 0,
532
+ "scaleDistribution": {
533
+ "type": "linear"
534
+ },
535
+ "thresholdsStyle": {
536
+ "mode": "off"
537
+ }
538
+ },
539
+ "mappings": [],
540
+ "thresholds": {
541
+ "mode": "absolute",
542
+ "steps": [
543
+ {
544
+ "color": "green",
545
+ "value": null
546
+ }
547
+ ]
548
+ },
549
+ "unit": "s"
550
+ },
551
+ "overrides": []
552
+ },
553
+ "gridPos": {
554
+ "h": 11,
555
+ "w": 12,
556
+ "x": 0,
557
+ "y": 19
558
+ },
559
+ "id": 17,
560
+ "options": {
561
+ "barRadius": 0.05,
562
+ "barWidth": 0.8,
563
+ "fullHighlight": false,
564
+ "groupWidth": 0.7,
565
+ "legend": {
566
+ "calcs": [],
567
+ "displayMode": "list",
568
+ "placement": "bottom",
569
+ "showLegend": false
570
+ },
571
+ "orientation": "auto",
572
+ "showValue": "always",
573
+ "stacking": "none",
574
+ "tooltip": {
575
+ "mode": "single",
576
+ "sort": "none"
577
+ },
578
+ "xTickLabelRotation": 0,
579
+ "xTickLabelSpacing": 0
580
+ },
581
+ "targets": [
582
+ {
583
+ "datasource": {
584
+ "default": true,
585
+ "type": "grafana-postgresql-datasource",
586
+ "uid": "be28nkzirtb0gd"
587
+ },
588
+ "editorMode": "code",
589
+ "format": "table",
590
+ "rawQuery": true,
591
+ "rawSql": "SELECT CAST(m.measurements->'time_to_first_token_secs' AS double precision) AS time_to_first_token_secs, left(b.commit_id, 7), m.time FROM benchmarks as b JOIN model_measurements AS m ON b.benchmark_id = m.benchmark_id WHERE b.branch = '${branch}' AND b.metadata->>'gpu_name' = '${gpu_name}' ORDER BY b.benchmark_id DESC LIMIT ${last_n_commits};",
592
+ "refId": "A",
593
+ "sql": {
594
+ "columns": [
595
+ {
596
+ "parameters": [],
597
+ "type": "function"
598
+ }
599
+ ],
600
+ "groupBy": [
601
+ {
602
+ "property": {
603
+ "type": "string"
604
+ },
605
+ "type": "groupBy"
606
+ }
607
+ ],
608
+ "limit": 50
609
+ }
610
+ }
611
+ ],
612
+ "title": "Time to first token",
613
+ "transformations": [
614
+ {
615
+ "id": "sortBy",
616
+ "options": {
617
+ "fields": {},
618
+ "sort": [
619
+ {
620
+ "field": "time"
621
+ }
622
+ ]
623
+ }
624
+ }
625
+ ],
626
+ "transparent": true,
627
+ "type": "barchart"
628
+ },
629
+ {
630
+ "datasource": {
631
+ "default": true,
632
+ "type": "grafana-postgresql-datasource",
633
+ "uid": "be28nkzirtb0gd"
634
+ },
635
+ "fieldConfig": {
636
+ "defaults": {
637
+ "color": {
638
+ "mode": "continuous-YlBl"
639
+ },
640
+ "custom": {
641
+ "axisBorderShow": false,
642
+ "axisCenteredZero": false,
643
+ "axisColorMode": "text",
644
+ "axisLabel": "",
645
+ "axisPlacement": "auto",
646
+ "fillOpacity": 80,
647
+ "gradientMode": "scheme",
648
+ "hideFrom": {
649
+ "legend": false,
650
+ "tooltip": false,
651
+ "viz": false
652
+ },
653
+ "lineWidth": 0,
654
+ "scaleDistribution": {
655
+ "type": "linear"
656
+ },
657
+ "thresholdsStyle": {
658
+ "mode": "off"
659
+ }
660
+ },
661
+ "mappings": [],
662
+ "thresholds": {
663
+ "mode": "absolute",
664
+ "steps": [
665
+ {
666
+ "color": "green",
667
+ "value": null
668
+ }
669
+ ]
670
+ },
671
+ "unit": "s"
672
+ },
673
+ "overrides": []
674
+ },
675
+ "gridPos": {
676
+ "h": 11,
677
+ "w": 12,
678
+ "x": 12,
679
+ "y": 19
680
+ },
681
+ "id": 18,
682
+ "options": {
683
+ "barRadius": 0.05,
684
+ "barWidth": 0.8,
685
+ "fullHighlight": false,
686
+ "groupWidth": 0.7,
687
+ "legend": {
688
+ "calcs": [],
689
+ "displayMode": "list",
690
+ "placement": "bottom",
691
+ "showLegend": false
692
+ },
693
+ "orientation": "auto",
694
+ "showValue": "always",
695
+ "stacking": "none",
696
+ "tooltip": {
697
+ "mode": "single",
698
+ "sort": "none"
699
+ },
700
+ "xTickLabelRotation": 0,
701
+ "xTickLabelSpacing": 0
702
+ },
703
+ "targets": [
704
+ {
705
+ "datasource": {
706
+ "default": true,
707
+ "type": "grafana-postgresql-datasource",
708
+ "uid": "be28nkzirtb0gd"
709
+ },
710
+ "editorMode": "code",
711
+ "format": "table",
712
+ "rawQuery": true,
713
+ "rawSql": "SELECT CAST(m.measurements->'time_to_second_token_secs' AS double precision) AS time_to_second_token_secs, left(b.commit_id, 7), m.time FROM benchmarks as b JOIN model_measurements AS m ON b.benchmark_id = m.benchmark_id WHERE b.branch = '${branch}' AND b.metadata->>'gpu_name' = '${gpu_name}' ORDER BY b.benchmark_id DESC LIMIT ${last_n_commits};",
714
+ "refId": "A",
715
+ "sql": {
716
+ "columns": [
717
+ {
718
+ "parameters": [],
719
+ "type": "function"
720
+ }
721
+ ],
722
+ "groupBy": [
723
+ {
724
+ "property": {
725
+ "type": "string"
726
+ },
727
+ "type": "groupBy"
728
+ }
729
+ ],
730
+ "limit": 50
731
+ }
732
+ }
733
+ ],
734
+ "title": "Time to second token",
735
+ "transformations": [
736
+ {
737
+ "id": "sortBy",
738
+ "options": {
739
+ "fields": {},
740
+ "sort": [
741
+ {
742
+ "field": "time"
743
+ }
744
+ ]
745
+ }
746
+ }
747
+ ],
748
+ "transparent": true,
749
+ "type": "barchart"
750
+ },
751
+ {
752
+ "datasource": {
753
+ "default": true,
754
+ "type": "grafana-postgresql-datasource",
755
+ "uid": "be28nkzirtb0gd"
756
+ },
757
+ "fieldConfig": {
758
+ "defaults": {
759
+ "color": {
760
+ "mode": "continuous-YlBl"
761
+ },
762
+ "custom": {
763
+ "axisBorderShow": false,
764
+ "axisCenteredZero": false,
765
+ "axisColorMode": "text",
766
+ "axisLabel": "",
767
+ "axisPlacement": "auto",
768
+ "fillOpacity": 80,
769
+ "gradientMode": "scheme",
770
+ "hideFrom": {
771
+ "legend": false,
772
+ "tooltip": false,
773
+ "viz": false
774
+ },
775
+ "lineWidth": 0,
776
+ "scaleDistribution": {
777
+ "type": "linear"
778
+ },
779
+ "thresholdsStyle": {
780
+ "mode": "off"
781
+ }
782
+ },
783
+ "mappings": [],
784
+ "thresholds": {
785
+ "mode": "absolute",
786
+ "steps": [
787
+ {
788
+ "color": "green",
789
+ "value": null
790
+ }
791
+ ]
792
+ },
793
+ "unit": "s"
794
+ },
795
+ "overrides": []
796
+ },
797
+ "gridPos": {
798
+ "h": 11,
799
+ "w": 12,
800
+ "x": 0,
801
+ "y": 30
802
+ },
803
+ "id": 19,
804
+ "options": {
805
+ "barRadius": 0.05,
806
+ "barWidth": 0.8,
807
+ "fullHighlight": false,
808
+ "groupWidth": 0.7,
809
+ "legend": {
810
+ "calcs": [],
811
+ "displayMode": "list",
812
+ "placement": "bottom",
813
+ "showLegend": false
814
+ },
815
+ "orientation": "auto",
816
+ "showValue": "always",
817
+ "stacking": "none",
818
+ "tooltip": {
819
+ "mode": "single",
820
+ "sort": "none"
821
+ },
822
+ "xTickLabelRotation": 0,
823
+ "xTickLabelSpacing": 0
824
+ },
825
+ "targets": [
826
+ {
827
+ "datasource": {
828
+ "default": true,
829
+ "type": "grafana-postgresql-datasource",
830
+ "uid": "be28nkzirtb0gd"
831
+ },
832
+ "editorMode": "code",
833
+ "format": "table",
834
+ "rawQuery": true,
835
+ "rawSql": "SELECT CAST(m.measurements->'time_to_third_token_secs' AS double precision) AS time_to_third_token_secs, left(b.commit_id, 7), m.time FROM benchmarks as b JOIN model_measurements AS m ON b.benchmark_id = m.benchmark_id WHERE b.branch = '${branch}' AND b.metadata->>'gpu_name' = '${gpu_name}' ORDER BY b.benchmark_id DESC LIMIT ${last_n_commits};",
836
+ "refId": "A",
837
+ "sql": {
838
+ "columns": [
839
+ {
840
+ "parameters": [],
841
+ "type": "function"
842
+ }
843
+ ],
844
+ "groupBy": [
845
+ {
846
+ "property": {
847
+ "type": "string"
848
+ },
849
+ "type": "groupBy"
850
+ }
851
+ ],
852
+ "limit": 50
853
+ }
854
+ }
855
+ ],
856
+ "title": "Time to third token",
857
+ "transformations": [
858
+ {
859
+ "id": "sortBy",
860
+ "options": {
861
+ "fields": {},
862
+ "sort": [
863
+ {
864
+ "field": "time"
865
+ }
866
+ ]
867
+ }
868
+ }
869
+ ],
870
+ "transparent": true,
871
+ "type": "barchart"
872
+ },
873
+ {
874
+ "datasource": {
875
+ "default": true,
876
+ "type": "grafana-postgresql-datasource",
877
+ "uid": "be28nkzirtb0gd"
878
+ },
879
+ "fieldConfig": {
880
+ "defaults": {
881
+ "color": {
882
+ "mode": "continuous-YlBl"
883
+ },
884
+ "custom": {
885
+ "axisBorderShow": false,
886
+ "axisCenteredZero": false,
887
+ "axisColorMode": "text",
888
+ "axisLabel": "",
889
+ "axisPlacement": "auto",
890
+ "fillOpacity": 80,
891
+ "gradientMode": "scheme",
892
+ "hideFrom": {
893
+ "legend": false,
894
+ "tooltip": false,
895
+ "viz": false
896
+ },
897
+ "lineWidth": 0,
898
+ "scaleDistribution": {
899
+ "type": "linear"
900
+ },
901
+ "thresholdsStyle": {
902
+ "mode": "off"
903
+ }
904
+ },
905
+ "mappings": [],
906
+ "thresholds": {
907
+ "mode": "absolute",
908
+ "steps": [
909
+ {
910
+ "color": "green",
911
+ "value": null
912
+ }
913
+ ]
914
+ },
915
+ "unit": "s"
916
+ },
917
+ "overrides": []
918
+ },
919
+ "gridPos": {
920
+ "h": 11,
921
+ "w": 12,
922
+ "x": 12,
923
+ "y": 30
924
+ },
925
+ "id": 20,
926
+ "options": {
927
+ "barRadius": 0.05,
928
+ "barWidth": 0.8,
929
+ "fullHighlight": false,
930
+ "groupWidth": 0.7,
931
+ "legend": {
932
+ "calcs": [],
933
+ "displayMode": "list",
934
+ "placement": "bottom",
935
+ "showLegend": false
936
+ },
937
+ "orientation": "auto",
938
+ "showValue": "always",
939
+ "stacking": "none",
940
+ "tooltip": {
941
+ "mode": "single",
942
+ "sort": "none"
943
+ },
944
+ "xTickLabelRotation": 0,
945
+ "xTickLabelSpacing": 0
946
+ },
947
+ "targets": [
948
+ {
949
+ "datasource": {
950
+ "default": true,
951
+ "type": "grafana-postgresql-datasource",
952
+ "uid": "be28nkzirtb0gd"
953
+ },
954
+ "editorMode": "code",
955
+ "format": "table",
956
+ "rawQuery": true,
957
+ "rawSql": "SELECT CAST(m.measurements->'time_to_next_token_mean_secs' AS double precision) AS time_to_next_token_mean_secs, left(b.commit_id, 7), m.time FROM benchmarks as b JOIN model_measurements AS m ON b.benchmark_id = m.benchmark_id WHERE b.branch = '${branch}' AND b.metadata->>'gpu_name' = '${gpu_name}' ORDER BY b.benchmark_id DESC LIMIT ${last_n_commits};",
958
+ "refId": "A",
959
+ "sql": {
960
+ "columns": [
961
+ {
962
+ "parameters": [],
963
+ "type": "function"
964
+ }
965
+ ],
966
+ "groupBy": [
967
+ {
968
+ "property": {
969
+ "type": "string"
970
+ },
971
+ "type": "groupBy"
972
+ }
973
+ ],
974
+ "limit": 50
975
+ }
976
+ }
977
+ ],
978
+ "title": "Time to subsequent next tokens mean",
979
+ "transformations": [
980
+ {
981
+ "id": "sortBy",
982
+ "options": {
983
+ "fields": {},
984
+ "sort": [
985
+ {
986
+ "field": "time"
987
+ }
988
+ ]
989
+ }
990
+ }
991
+ ],
992
+ "transparent": true,
993
+ "type": "barchart"
994
+ },
995
+ {
996
+ "collapsed": false,
997
+ "gridPos": {
998
+ "h": 1,
999
+ "w": 24,
1000
+ "x": 0,
1001
+ "y": 41
1002
+ },
1003
+ "id": 14,
1004
+ "panels": [],
1005
+ "title": "Compiled Generate",
1006
+ "type": "row"
1007
+ },
1008
+ {
1009
+ "datasource": {
1010
+ "default": true,
1011
+ "type": "grafana-postgresql-datasource",
1012
+ "uid": "be28nkzirtb0gd"
1013
+ },
1014
+ "fieldConfig": {
1015
+ "defaults": {
1016
+ "color": {
1017
+ "mode": "continuous-YlBl"
1018
+ },
1019
+ "custom": {
1020
+ "axisBorderShow": false,
1021
+ "axisCenteredZero": false,
1022
+ "axisColorMode": "text",
1023
+ "axisLabel": "",
1024
+ "axisPlacement": "auto",
1025
+ "fillOpacity": 80,
1026
+ "gradientMode": "scheme",
1027
+ "hideFrom": {
1028
+ "legend": false,
1029
+ "tooltip": false,
1030
+ "viz": false
1031
+ },
1032
+ "lineWidth": 0,
1033
+ "scaleDistribution": {
1034
+ "type": "linear"
1035
+ },
1036
+ "thresholdsStyle": {
1037
+ "mode": "off"
1038
+ }
1039
+ },
1040
+ "mappings": [],
1041
+ "thresholds": {
1042
+ "mode": "absolute",
1043
+ "steps": [
1044
+ {
1045
+ "color": "green",
1046
+ "value": null
1047
+ }
1048
+ ]
1049
+ },
1050
+ "unit": "s"
1051
+ },
1052
+ "overrides": []
1053
+ },
1054
+ "gridPos": {
1055
+ "h": 11,
1056
+ "w": 12,
1057
+ "x": 0,
1058
+ "y": 42
1059
+ },
1060
+ "id": 8,
1061
+ "options": {
1062
+ "barRadius": 0.05,
1063
+ "barWidth": 0.8,
1064
+ "fullHighlight": false,
1065
+ "groupWidth": 0.7,
1066
+ "legend": {
1067
+ "calcs": [],
1068
+ "displayMode": "list",
1069
+ "placement": "bottom",
1070
+ "showLegend": false
1071
+ },
1072
+ "orientation": "auto",
1073
+ "showValue": "always",
1074
+ "stacking": "none",
1075
+ "tooltip": {
1076
+ "mode": "single",
1077
+ "sort": "none"
1078
+ },
1079
+ "xTickLabelRotation": 0,
1080
+ "xTickLabelSpacing": 0
1081
+ },
1082
+ "targets": [
1083
+ {
1084
+ "datasource": {
1085
+ "default": true,
1086
+ "type": "grafana-postgresql-datasource",
1087
+ "uid": "be28nkzirtb0gd"
1088
+ },
1089
+ "editorMode": "code",
1090
+ "format": "table",
1091
+ "rawQuery": true,
1092
+ "rawSql": "SELECT CAST(m.measurements->'first_compile_generate_time_secs' AS double precision) AS first_compile_generate_time_secs, left(b.commit_id, 7), m.time FROM benchmarks as b JOIN model_measurements AS m ON b.benchmark_id = m.benchmark_id WHERE b.branch = '${branch}' AND b.metadata->>'gpu_name' = '${gpu_name}' ORDER BY b.benchmark_id DESC LIMIT ${last_n_commits};",
1093
+ "refId": "A",
1094
+ "sql": {
1095
+ "columns": [
1096
+ {
1097
+ "parameters": [],
1098
+ "type": "function"
1099
+ }
1100
+ ],
1101
+ "groupBy": [
1102
+ {
1103
+ "property": {
1104
+ "type": "string"
1105
+ },
1106
+ "type": "groupBy"
1107
+ }
1108
+ ],
1109
+ "limit": 50
1110
+ }
1111
+ }
1112
+ ],
1113
+ "title": "First compile generate",
1114
+ "transformations": [
1115
+ {
1116
+ "id": "sortBy",
1117
+ "options": {
1118
+ "fields": {},
1119
+ "sort": [
1120
+ {
1121
+ "field": "time"
1122
+ }
1123
+ ]
1124
+ }
1125
+ }
1126
+ ],
1127
+ "transparent": true,
1128
+ "type": "barchart"
1129
+ },
1130
+ {
1131
+ "datasource": {
1132
+ "default": true,
1133
+ "type": "grafana-postgresql-datasource",
1134
+ "uid": "be28nkzirtb0gd"
1135
+ },
1136
+ "fieldConfig": {
1137
+ "defaults": {
1138
+ "color": {
1139
+ "mode": "continuous-YlBl"
1140
+ },
1141
+ "custom": {
1142
+ "axisBorderShow": false,
1143
+ "axisCenteredZero": false,
1144
+ "axisColorMode": "text",
1145
+ "axisLabel": "",
1146
+ "axisPlacement": "auto",
1147
+ "fillOpacity": 80,
1148
+ "gradientMode": "scheme",
1149
+ "hideFrom": {
1150
+ "legend": false,
1151
+ "tooltip": false,
1152
+ "viz": false
1153
+ },
1154
+ "lineWidth": 0,
1155
+ "scaleDistribution": {
1156
+ "type": "linear"
1157
+ },
1158
+ "thresholdsStyle": {
1159
+ "mode": "off"
1160
+ }
1161
+ },
1162
+ "mappings": [],
1163
+ "thresholds": {
1164
+ "mode": "absolute",
1165
+ "steps": [
1166
+ {
1167
+ "color": "green",
1168
+ "value": null
1169
+ }
1170
+ ]
1171
+ },
1172
+ "unit": "s"
1173
+ },
1174
+ "overrides": []
1175
+ },
1176
+ "gridPos": {
1177
+ "h": 11,
1178
+ "w": 12,
1179
+ "x": 12,
1180
+ "y": 42
1181
+ },
1182
+ "id": 10,
1183
+ "options": {
1184
+ "barRadius": 0.05,
1185
+ "barWidth": 0.8,
1186
+ "fullHighlight": false,
1187
+ "groupWidth": 0.7,
1188
+ "legend": {
1189
+ "calcs": [],
1190
+ "displayMode": "list",
1191
+ "placement": "bottom",
1192
+ "showLegend": false
1193
+ },
1194
+ "orientation": "auto",
1195
+ "showValue": "auto",
1196
+ "stacking": "none",
1197
+ "tooltip": {
1198
+ "mode": "single",
1199
+ "sort": "none"
1200
+ },
1201
+ "xTickLabelRotation": 0,
1202
+ "xTickLabelSpacing": 0
1203
+ },
1204
+ "targets": [
1205
+ {
1206
+ "datasource": {
1207
+ "default": true,
1208
+ "type": "grafana-postgresql-datasource",
1209
+ "uid": "be28nkzirtb0gd"
1210
+ },
1211
+ "editorMode": "code",
1212
+ "format": "table",
1213
+ "rawQuery": true,
1214
+ "rawSql": "SELECT CAST(m.measurements->'second_compile_generate_time_secs' AS double precision) AS second_compile_generate_time_secs, left(b.commit_id, 7), m.time FROM benchmarks as b JOIN model_measurements AS m ON b.benchmark_id = m.benchmark_id WHERE b.branch = '${branch}' AND b.metadata->>'gpu_name' = '${gpu_name}' ORDER BY b.benchmark_id DESC LIMIT ${last_n_commits};",
1215
+ "refId": "A",
1216
+ "sql": {
1217
+ "columns": [
1218
+ {
1219
+ "parameters": [],
1220
+ "type": "function"
1221
+ }
1222
+ ],
1223
+ "groupBy": [
1224
+ {
1225
+ "property": {
1226
+ "type": "string"
1227
+ },
1228
+ "type": "groupBy"
1229
+ }
1230
+ ],
1231
+ "limit": 50
1232
+ }
1233
+ }
1234
+ ],
1235
+ "title": "Second compile generate",
1236
+ "transformations": [
1237
+ {
1238
+ "id": "sortBy",
1239
+ "options": {
1240
+ "fields": {},
1241
+ "sort": [
1242
+ {
1243
+ "field": "time"
1244
+ }
1245
+ ]
1246
+ }
1247
+ }
1248
+ ],
1249
+ "transparent": true,
1250
+ "type": "barchart"
1251
+ },
1252
+ {
1253
+ "datasource": {
1254
+ "default": true,
1255
+ "type": "grafana-postgresql-datasource",
1256
+ "uid": "be28nkzirtb0gd"
1257
+ },
1258
+ "fieldConfig": {
1259
+ "defaults": {
1260
+ "color": {
1261
+ "mode": "continuous-YlBl"
1262
+ },
1263
+ "custom": {
1264
+ "axisBorderShow": false,
1265
+ "axisCenteredZero": false,
1266
+ "axisColorMode": "text",
1267
+ "axisLabel": "",
1268
+ "axisPlacement": "auto",
1269
+ "fillOpacity": 80,
1270
+ "gradientMode": "scheme",
1271
+ "hideFrom": {
1272
+ "legend": false,
1273
+ "tooltip": false,
1274
+ "viz": false
1275
+ },
1276
+ "lineWidth": 0,
1277
+ "scaleDistribution": {
1278
+ "type": "linear"
1279
+ },
1280
+ "thresholdsStyle": {
1281
+ "mode": "off"
1282
+ }
1283
+ },
1284
+ "mappings": [],
1285
+ "thresholds": {
1286
+ "mode": "absolute",
1287
+ "steps": [
1288
+ {
1289
+ "color": "green",
1290
+ "value": null
1291
+ }
1292
+ ]
1293
+ },
1294
+ "unit": "s"
1295
+ },
1296
+ "overrides": []
1297
+ },
1298
+ "gridPos": {
1299
+ "h": 11,
1300
+ "w": 12,
1301
+ "x": 0,
1302
+ "y": 53
1303
+ },
1304
+ "id": 11,
1305
+ "options": {
1306
+ "barRadius": 0.05,
1307
+ "barWidth": 0.8,
1308
+ "fullHighlight": false,
1309
+ "groupWidth": 0.7,
1310
+ "legend": {
1311
+ "calcs": [],
1312
+ "displayMode": "list",
1313
+ "placement": "bottom",
1314
+ "showLegend": false
1315
+ },
1316
+ "orientation": "auto",
1317
+ "showValue": "auto",
1318
+ "stacking": "none",
1319
+ "tooltip": {
1320
+ "mode": "single",
1321
+ "sort": "none"
1322
+ },
1323
+ "xTickLabelRotation": 0,
1324
+ "xTickLabelSpacing": 0
1325
+ },
1326
+ "targets": [
1327
+ {
1328
+ "datasource": {
1329
+ "default": true,
1330
+ "type": "grafana-postgresql-datasource",
1331
+ "uid": "be28nkzirtb0gd"
1332
+ },
1333
+ "editorMode": "code",
1334
+ "format": "table",
1335
+ "rawQuery": true,
1336
+ "rawSql": "SELECT CAST(m.measurements->'third_compile_generate_time_secs' AS double precision) AS third_compile_generate_time_secs, left(b.commit_id, 7), m.time FROM benchmarks as b JOIN model_measurements AS m ON b.benchmark_id = m.benchmark_id WHERE b.branch = '${branch}' AND b.metadata->>'gpu_name' = '${gpu_name}' ORDER BY b.benchmark_id DESC LIMIT ${last_n_commits};",
1337
+ "refId": "A",
1338
+ "sql": {
1339
+ "columns": [
1340
+ {
1341
+ "parameters": [],
1342
+ "type": "function"
1343
+ }
1344
+ ],
1345
+ "groupBy": [
1346
+ {
1347
+ "property": {
1348
+ "type": "string"
1349
+ },
1350
+ "type": "groupBy"
1351
+ }
1352
+ ],
1353
+ "limit": 50
1354
+ }
1355
+ }
1356
+ ],
1357
+ "title": "Third compile generate",
1358
+ "transformations": [
1359
+ {
1360
+ "id": "sortBy",
1361
+ "options": {
1362
+ "fields": {},
1363
+ "sort": [
1364
+ {
1365
+ "field": "time"
1366
+ }
1367
+ ]
1368
+ }
1369
+ }
1370
+ ],
1371
+ "transparent": true,
1372
+ "type": "barchart"
1373
+ },
1374
+ {
1375
+ "datasource": {
1376
+ "default": true,
1377
+ "type": "grafana-postgresql-datasource",
1378
+ "uid": "be28nkzirtb0gd"
1379
+ },
1380
+ "fieldConfig": {
1381
+ "defaults": {
1382
+ "color": {
1383
+ "mode": "continuous-YlBl"
1384
+ },
1385
+ "custom": {
1386
+ "axisBorderShow": false,
1387
+ "axisCenteredZero": false,
1388
+ "axisColorMode": "text",
1389
+ "axisLabel": "",
1390
+ "axisPlacement": "auto",
1391
+ "fillOpacity": 80,
1392
+ "gradientMode": "none",
1393
+ "hideFrom": {
1394
+ "legend": false,
1395
+ "tooltip": false,
1396
+ "viz": false
1397
+ },
1398
+ "lineWidth": 0,
1399
+ "scaleDistribution": {
1400
+ "type": "linear"
1401
+ },
1402
+ "thresholdsStyle": {
1403
+ "mode": "off"
1404
+ }
1405
+ },
1406
+ "mappings": [],
1407
+ "thresholds": {
1408
+ "mode": "absolute",
1409
+ "steps": [
1410
+ {
1411
+ "color": "green",
1412
+ "value": null
1413
+ }
1414
+ ]
1415
+ },
1416
+ "unit": "s"
1417
+ },
1418
+ "overrides": []
1419
+ },
1420
+ "gridPos": {
1421
+ "h": 11,
1422
+ "w": 12,
1423
+ "x": 12,
1424
+ "y": 53
1425
+ },
1426
+ "id": 12,
1427
+ "options": {
1428
+ "barRadius": 0.05,
1429
+ "barWidth": 0.8,
1430
+ "fullHighlight": false,
1431
+ "groupWidth": 0.7,
1432
+ "legend": {
1433
+ "calcs": [],
1434
+ "displayMode": "list",
1435
+ "placement": "bottom",
1436
+ "showLegend": false
1437
+ },
1438
+ "orientation": "auto",
1439
+ "showValue": "auto",
1440
+ "stacking": "none",
1441
+ "tooltip": {
1442
+ "mode": "single",
1443
+ "sort": "none"
1444
+ },
1445
+ "xTickLabelRotation": 0,
1446
+ "xTickLabelSpacing": 0
1447
+ },
1448
+ "targets": [
1449
+ {
1450
+ "datasource": {
1451
+ "default": true,
1452
+ "type": "grafana-postgresql-datasource",
1453
+ "uid": "be28nkzirtb0gd"
1454
+ },
1455
+ "editorMode": "code",
1456
+ "format": "table",
1457
+ "rawQuery": true,
1458
+ "rawSql": "SELECT CAST(m.measurements->'fourth_compile_generate_time_secs' AS double precision) AS fourth_compile_generate_time_secs, left(b.commit_id, 7), m.time FROM benchmarks as b JOIN model_measurements AS m ON b.benchmark_id = m.benchmark_id WHERE b.branch = '${branch}' AND b.metadata->>'gpu_name' = '${gpu_name}' ORDER BY b.benchmark_id DESC LIMIT ${last_n_commits};",
1459
+ "refId": "A",
1460
+ "sql": {
1461
+ "columns": [
1462
+ {
1463
+ "parameters": [],
1464
+ "type": "function"
1465
+ }
1466
+ ],
1467
+ "groupBy": [
1468
+ {
1469
+ "property": {
1470
+ "type": "string"
1471
+ },
1472
+ "type": "groupBy"
1473
+ }
1474
+ ],
1475
+ "limit": 50
1476
+ }
1477
+ }
1478
+ ],
1479
+ "title": "Fourth compile generate",
1480
+ "transformations": [
1481
+ {
1482
+ "id": "sortBy",
1483
+ "options": {
1484
+ "fields": {},
1485
+ "sort": [
1486
+ {
1487
+ "field": "time"
1488
+ }
1489
+ ]
1490
+ }
1491
+ }
1492
+ ],
1493
+ "transparent": true,
1494
+ "type": "barchart"
1495
+ },
1496
+ {
1497
+ "collapsed": true,
1498
+ "gridPos": {
1499
+ "h": 1,
1500
+ "w": 24,
1501
+ "x": 0,
1502
+ "y": 64
1503
+ },
1504
+ "id": 15,
1505
+ "panels": [
1506
+ {
1507
+ "datasource": {},
1508
+ "fieldConfig": {
1509
+ "defaults": {
1510
+ "color": {
1511
+ "mode": "palette-classic"
1512
+ },
1513
+ "custom": {
1514
+ "axisBorderShow": false,
1515
+ "axisCenteredZero": false,
1516
+ "axisColorMode": "text",
1517
+ "axisLabel": "",
1518
+ "axisPlacement": "auto",
1519
+ "barAlignment": 0,
1520
+ "barWidthFactor": 0.6,
1521
+ "drawStyle": "line",
1522
+ "fillOpacity": 0,
1523
+ "gradientMode": "none",
1524
+ "hideFrom": {
1525
+ "legend": false,
1526
+ "tooltip": false,
1527
+ "viz": false
1528
+ },
1529
+ "insertNulls": 60000,
1530
+ "lineInterpolation": "linear",
1531
+ "lineWidth": 1,
1532
+ "pointSize": 5,
1533
+ "scaleDistribution": {
1534
+ "type": "linear"
1535
+ },
1536
+ "showPoints": "auto",
1537
+ "spanNulls": false,
1538
+ "stacking": {
1539
+ "group": "A",
1540
+ "mode": "none"
1541
+ },
1542
+ "thresholdsStyle": {
1543
+ "mode": "off"
1544
+ }
1545
+ },
1546
+ "mappings": [],
1547
+ "thresholds": {
1548
+ "mode": "absolute",
1549
+ "steps": [
1550
+ {
1551
+ "color": "green"
1552
+ },
1553
+ {
1554
+ "color": "red",
1555
+ "value": 80
1556
+ }
1557
+ ]
1558
+ },
1559
+ "unit": "percent"
1560
+ },
1561
+ "overrides": []
1562
+ },
1563
+ "gridPos": {
1564
+ "h": 9,
1565
+ "w": 12,
1566
+ "x": 0,
1567
+ "y": 65
1568
+ },
1569
+ "id": 1,
1570
+ "options": {
1571
+ "legend": {
1572
+ "calcs": [],
1573
+ "displayMode": "list",
1574
+ "placement": "bottom",
1575
+ "showLegend": true
1576
+ },
1577
+ "tooltip": {
1578
+ "mode": "single",
1579
+ "sort": "none"
1580
+ }
1581
+ },
1582
+ "targets": [
1583
+ {
1584
+ "datasource": {
1585
+ "default": true,
1586
+ "type": "grafana-postgresql-datasource",
1587
+ "uid": "be28nkzirtb0gd"
1588
+ },
1589
+ "editorMode": "code",
1590
+ "format": "table",
1591
+ "rawQuery": true,
1592
+ "rawSql": "SELECT\n d.cpu_util,\n d.time\nFROM\n benchmarks AS b\n JOIN device_measurements AS d ON b.benchmark_id = d.benchmark_id\nWHERE\n branch = '${branch}';",
1593
+ "refId": "A",
1594
+ "sql": {
1595
+ "columns": [
1596
+ {
1597
+ "parameters": [
1598
+ {
1599
+ "name": "cpu_util",
1600
+ "type": "functionParameter"
1601
+ }
1602
+ ],
1603
+ "type": "function"
1604
+ },
1605
+ {
1606
+ "parameters": [
1607
+ {
1608
+ "name": "mem_megabytes",
1609
+ "type": "functionParameter"
1610
+ }
1611
+ ],
1612
+ "type": "function"
1613
+ },
1614
+ {
1615
+ "parameters": [
1616
+ {
1617
+ "name": "gpu_util",
1618
+ "type": "functionParameter"
1619
+ }
1620
+ ],
1621
+ "type": "function"
1622
+ },
1623
+ {
1624
+ "parameters": [
1625
+ {
1626
+ "name": "gpu_mem_megabytes",
1627
+ "type": "functionParameter"
1628
+ }
1629
+ ],
1630
+ "type": "function"
1631
+ },
1632
+ {
1633
+ "parameters": [
1634
+ {
1635
+ "name": "\"time\"",
1636
+ "type": "functionParameter"
1637
+ }
1638
+ ],
1639
+ "type": "function"
1640
+ }
1641
+ ],
1642
+ "groupBy": [
1643
+ {
1644
+ "property": {
1645
+ "type": "string"
1646
+ },
1647
+ "type": "groupBy"
1648
+ }
1649
+ ],
1650
+ "limit": 50,
1651
+ "whereJsonTree": {
1652
+ "children1": [
1653
+ {
1654
+ "id": "baa888b8-89ab-4cde-b012-31922f8671e9",
1655
+ "properties": {
1656
+ "field": "commit_id",
1657
+ "fieldSrc": "field",
1658
+ "operator": "equal",
1659
+ "value": [
1660
+ "${commit}"
1661
+ ],
1662
+ "valueError": [
1663
+ null
1664
+ ],
1665
+ "valueSrc": [
1666
+ "value"
1667
+ ],
1668
+ "valueType": [
1669
+ "text"
1670
+ ]
1671
+ },
1672
+ "type": "rule"
1673
+ }
1674
+ ],
1675
+ "id": "bab88a98-0123-4456-b89a-b1922f7d4f11",
1676
+ "type": "group"
1677
+ },
1678
+ "whereString": "commit_id = '${commit}'"
1679
+ },
1680
+ "table": "measurements"
1681
+ }
1682
+ ],
1683
+ "title": "CPU Utilization",
1684
+ "transparent": true,
1685
+ "type": "timeseries"
1686
+ },
1687
+ {
1688
+ "datasource": {},
1689
+ "fieldConfig": {
1690
+ "defaults": {
1691
+ "color": {
1692
+ "mode": "palette-classic"
1693
+ },
1694
+ "custom": {
1695
+ "axisBorderShow": false,
1696
+ "axisCenteredZero": false,
1697
+ "axisColorMode": "text",
1698
+ "axisLabel": "",
1699
+ "axisPlacement": "auto",
1700
+ "barAlignment": 0,
1701
+ "barWidthFactor": 0.6,
1702
+ "drawStyle": "line",
1703
+ "fillOpacity": 0,
1704
+ "gradientMode": "none",
1705
+ "hideFrom": {
1706
+ "legend": false,
1707
+ "tooltip": false,
1708
+ "viz": false
1709
+ },
1710
+ "insertNulls": 60000,
1711
+ "lineInterpolation": "linear",
1712
+ "lineWidth": 1,
1713
+ "pointSize": 5,
1714
+ "scaleDistribution": {
1715
+ "type": "linear"
1716
+ },
1717
+ "showPoints": "auto",
1718
+ "spanNulls": false,
1719
+ "stacking": {
1720
+ "group": "A",
1721
+ "mode": "none"
1722
+ },
1723
+ "thresholdsStyle": {
1724
+ "mode": "off"
1725
+ }
1726
+ },
1727
+ "mappings": [],
1728
+ "thresholds": {
1729
+ "mode": "absolute",
1730
+ "steps": [
1731
+ {
1732
+ "color": "green"
1733
+ },
1734
+ {
1735
+ "color": "red",
1736
+ "value": 80
1737
+ }
1738
+ ]
1739
+ },
1740
+ "unit": "percent"
1741
+ },
1742
+ "overrides": []
1743
+ },
1744
+ "gridPos": {
1745
+ "h": 9,
1746
+ "w": 12,
1747
+ "x": 12,
1748
+ "y": 65
1749
+ },
1750
+ "id": 4,
1751
+ "options": {
1752
+ "legend": {
1753
+ "calcs": [],
1754
+ "displayMode": "list",
1755
+ "placement": "bottom",
1756
+ "showLegend": true
1757
+ },
1758
+ "tooltip": {
1759
+ "mode": "single",
1760
+ "sort": "none"
1761
+ }
1762
+ },
1763
+ "targets": [
1764
+ {
1765
+ "datasource": {
1766
+ "default": true,
1767
+ "type": "grafana-postgresql-datasource",
1768
+ "uid": "be28nkzirtb0gd"
1769
+ },
1770
+ "editorMode": "code",
1771
+ "format": "table",
1772
+ "rawQuery": true,
1773
+ "rawSql": "SELECT\n b.commit_id,\n d.gpu_util,\n d.time\nFROM\n benchmarks AS b\n JOIN device_measurements AS d ON b.benchmark_id = d.benchmark_id\nWHERE\n branch = '${branch}';",
1774
+ "refId": "A",
1775
+ "sql": {
1776
+ "columns": [
1777
+ {
1778
+ "parameters": [
1779
+ {
1780
+ "name": "cpu_util",
1781
+ "type": "functionParameter"
1782
+ }
1783
+ ],
1784
+ "type": "function"
1785
+ },
1786
+ {
1787
+ "parameters": [
1788
+ {
1789
+ "name": "mem_megabytes",
1790
+ "type": "functionParameter"
1791
+ }
1792
+ ],
1793
+ "type": "function"
1794
+ },
1795
+ {
1796
+ "parameters": [
1797
+ {
1798
+ "name": "gpu_util",
1799
+ "type": "functionParameter"
1800
+ }
1801
+ ],
1802
+ "type": "function"
1803
+ },
1804
+ {
1805
+ "parameters": [
1806
+ {
1807
+ "name": "gpu_mem_megabytes",
1808
+ "type": "functionParameter"
1809
+ }
1810
+ ],
1811
+ "type": "function"
1812
+ },
1813
+ {
1814
+ "parameters": [
1815
+ {
1816
+ "name": "\"time\"",
1817
+ "type": "functionParameter"
1818
+ }
1819
+ ],
1820
+ "type": "function"
1821
+ }
1822
+ ],
1823
+ "groupBy": [
1824
+ {
1825
+ "property": {
1826
+ "type": "string"
1827
+ },
1828
+ "type": "groupBy"
1829
+ }
1830
+ ],
1831
+ "limit": 50,
1832
+ "whereJsonTree": {
1833
+ "children1": [
1834
+ {
1835
+ "id": "baa888b8-89ab-4cde-b012-31922f8671e9",
1836
+ "properties": {
1837
+ "field": "commit_id",
1838
+ "fieldSrc": "field",
1839
+ "operator": "equal",
1840
+ "value": [
1841
+ "${commit}"
1842
+ ],
1843
+ "valueError": [
1844
+ null
1845
+ ],
1846
+ "valueSrc": [
1847
+ "value"
1848
+ ],
1849
+ "valueType": [
1850
+ "text"
1851
+ ]
1852
+ },
1853
+ "type": "rule"
1854
+ }
1855
+ ],
1856
+ "id": "bab88a98-0123-4456-b89a-b1922f7d4f11",
1857
+ "type": "group"
1858
+ },
1859
+ "whereString": "commit_id = '${commit}'"
1860
+ },
1861
+ "table": "measurements"
1862
+ }
1863
+ ],
1864
+ "title": "GPU Utilization",
1865
+ "transparent": true,
1866
+ "type": "timeseries"
1867
+ },
1868
+ {
1869
+ "datasource": {},
1870
+ "fieldConfig": {
1871
+ "defaults": {
1872
+ "color": {
1873
+ "mode": "palette-classic"
1874
+ },
1875
+ "custom": {
1876
+ "axisBorderShow": false,
1877
+ "axisCenteredZero": false,
1878
+ "axisColorMode": "text",
1879
+ "axisLabel": "",
1880
+ "axisPlacement": "auto",
1881
+ "barAlignment": 0,
1882
+ "barWidthFactor": 0.6,
1883
+ "drawStyle": "line",
1884
+ "fillOpacity": 0,
1885
+ "gradientMode": "none",
1886
+ "hideFrom": {
1887
+ "legend": false,
1888
+ "tooltip": false,
1889
+ "viz": false
1890
+ },
1891
+ "insertNulls": 60000,
1892
+ "lineInterpolation": "linear",
1893
+ "lineWidth": 1,
1894
+ "pointSize": 5,
1895
+ "scaleDistribution": {
1896
+ "type": "linear"
1897
+ },
1898
+ "showPoints": "auto",
1899
+ "spanNulls": false,
1900
+ "stacking": {
1901
+ "group": "A",
1902
+ "mode": "none"
1903
+ },
1904
+ "thresholdsStyle": {
1905
+ "mode": "off"
1906
+ }
1907
+ },
1908
+ "mappings": [],
1909
+ "thresholds": {
1910
+ "mode": "absolute",
1911
+ "steps": [
1912
+ {
1913
+ "color": "green"
1914
+ },
1915
+ {
1916
+ "color": "red",
1917
+ "value": 80
1918
+ }
1919
+ ]
1920
+ },
1921
+ "unit": "decmbytes"
1922
+ },
1923
+ "overrides": []
1924
+ },
1925
+ "gridPos": {
1926
+ "h": 9,
1927
+ "w": 12,
1928
+ "x": 0,
1929
+ "y": 74
1930
+ },
1931
+ "id": 2,
1932
+ "options": {
1933
+ "legend": {
1934
+ "calcs": [],
1935
+ "displayMode": "list",
1936
+ "placement": "bottom",
1937
+ "showLegend": true
1938
+ },
1939
+ "tooltip": {
1940
+ "mode": "single",
1941
+ "sort": "none"
1942
+ }
1943
+ },
1944
+ "targets": [
1945
+ {
1946
+ "datasource": {
1947
+ "default": true,
1948
+ "type": "grafana-postgresql-datasource",
1949
+ "uid": "be28nkzirtb0gd"
1950
+ },
1951
+ "editorMode": "code",
1952
+ "format": "table",
1953
+ "rawQuery": true,
1954
+ "rawSql": "SELECT d.mem_megabytes, d.time FROM benchmarks AS b JOIN device_measurements AS d ON b.benchmark_id = d.benchmark_id WHERE branch = '${branch}';",
1955
+ "refId": "A",
1956
+ "sql": {
1957
+ "columns": [
1958
+ {
1959
+ "parameters": [
1960
+ {
1961
+ "name": "cpu_util",
1962
+ "type": "functionParameter"
1963
+ }
1964
+ ],
1965
+ "type": "function"
1966
+ },
1967
+ {
1968
+ "parameters": [
1969
+ {
1970
+ "name": "mem_megabytes",
1971
+ "type": "functionParameter"
1972
+ }
1973
+ ],
1974
+ "type": "function"
1975
+ },
1976
+ {
1977
+ "parameters": [
1978
+ {
1979
+ "name": "gpu_util",
1980
+ "type": "functionParameter"
1981
+ }
1982
+ ],
1983
+ "type": "function"
1984
+ },
1985
+ {
1986
+ "parameters": [
1987
+ {
1988
+ "name": "gpu_mem_megabytes",
1989
+ "type": "functionParameter"
1990
+ }
1991
+ ],
1992
+ "type": "function"
1993
+ },
1994
+ {
1995
+ "parameters": [
1996
+ {
1997
+ "name": "\"time\"",
1998
+ "type": "functionParameter"
1999
+ }
2000
+ ],
2001
+ "type": "function"
2002
+ }
2003
+ ],
2004
+ "groupBy": [
2005
+ {
2006
+ "property": {
2007
+ "type": "string"
2008
+ },
2009
+ "type": "groupBy"
2010
+ }
2011
+ ],
2012
+ "limit": 50,
2013
+ "whereJsonTree": {
2014
+ "children1": [
2015
+ {
2016
+ "id": "baa888b8-89ab-4cde-b012-31922f8671e9",
2017
+ "properties": {
2018
+ "field": "commit_id",
2019
+ "fieldSrc": "field",
2020
+ "operator": "equal",
2021
+ "value": [
2022
+ "${commit}"
2023
+ ],
2024
+ "valueError": [
2025
+ null
2026
+ ],
2027
+ "valueSrc": [
2028
+ "value"
2029
+ ],
2030
+ "valueType": [
2031
+ "text"
2032
+ ]
2033
+ },
2034
+ "type": "rule"
2035
+ }
2036
+ ],
2037
+ "id": "bab88a98-0123-4456-b89a-b1922f7d4f11",
2038
+ "type": "group"
2039
+ },
2040
+ "whereString": "commit_id = '${commit}'"
2041
+ },
2042
+ "table": "measurements"
2043
+ }
2044
+ ],
2045
+ "title": "Memory usage",
2046
+ "transparent": true,
2047
+ "type": "timeseries"
2048
+ },
2049
+ {
2050
+ "datasource": {},
2051
+ "fieldConfig": {
2052
+ "defaults": {
2053
+ "color": {
2054
+ "mode": "palette-classic"
2055
+ },
2056
+ "custom": {
2057
+ "axisBorderShow": false,
2058
+ "axisCenteredZero": false,
2059
+ "axisColorMode": "text",
2060
+ "axisLabel": "",
2061
+ "axisPlacement": "auto",
2062
+ "barAlignment": 0,
2063
+ "barWidthFactor": 0.6,
2064
+ "drawStyle": "line",
2065
+ "fillOpacity": 0,
2066
+ "gradientMode": "none",
2067
+ "hideFrom": {
2068
+ "legend": false,
2069
+ "tooltip": false,
2070
+ "viz": false
2071
+ },
2072
+ "insertNulls": 60000,
2073
+ "lineInterpolation": "linear",
2074
+ "lineWidth": 1,
2075
+ "pointSize": 5,
2076
+ "scaleDistribution": {
2077
+ "type": "linear"
2078
+ },
2079
+ "showPoints": "auto",
2080
+ "spanNulls": false,
2081
+ "stacking": {
2082
+ "group": "A",
2083
+ "mode": "none"
2084
+ },
2085
+ "thresholdsStyle": {
2086
+ "mode": "off"
2087
+ }
2088
+ },
2089
+ "mappings": [],
2090
+ "thresholds": {
2091
+ "mode": "absolute",
2092
+ "steps": [
2093
+ {
2094
+ "color": "green"
2095
+ },
2096
+ {
2097
+ "color": "red",
2098
+ "value": 80
2099
+ }
2100
+ ]
2101
+ },
2102
+ "unit": "decmbytes"
2103
+ },
2104
+ "overrides": []
2105
+ },
2106
+ "gridPos": {
2107
+ "h": 9,
2108
+ "w": 12,
2109
+ "x": 12,
2110
+ "y": 74
2111
+ },
2112
+ "id": 3,
2113
+ "options": {
2114
+ "legend": {
2115
+ "calcs": [],
2116
+ "displayMode": "list",
2117
+ "placement": "bottom",
2118
+ "showLegend": true
2119
+ },
2120
+ "tooltip": {
2121
+ "mode": "single",
2122
+ "sort": "none"
2123
+ }
2124
+ },
2125
+ "targets": [
2126
+ {
2127
+ "datasource": {
2128
+ "default": true,
2129
+ "type": "grafana-postgresql-datasource",
2130
+ "uid": "be28nkzirtb0gd"
2131
+ },
2132
+ "editorMode": "code",
2133
+ "format": "table",
2134
+ "rawQuery": true,
2135
+ "rawSql": "SELECT\n d.gpu_mem_megabytes,\n d.time\nFROM\n benchmarks AS b\n JOIN device_measurements AS d ON b.benchmark_id = d.benchmark_id\nWHERE\n branch = '${branch}';",
2136
+ "refId": "A",
2137
+ "sql": {
2138
+ "columns": [
2139
+ {
2140
+ "parameters": [
2141
+ {
2142
+ "name": "cpu_util",
2143
+ "type": "functionParameter"
2144
+ }
2145
+ ],
2146
+ "type": "function"
2147
+ },
2148
+ {
2149
+ "parameters": [
2150
+ {
2151
+ "name": "mem_megabytes",
2152
+ "type": "functionParameter"
2153
+ }
2154
+ ],
2155
+ "type": "function"
2156
+ },
2157
+ {
2158
+ "parameters": [
2159
+ {
2160
+ "name": "gpu_util",
2161
+ "type": "functionParameter"
2162
+ }
2163
+ ],
2164
+ "type": "function"
2165
+ },
2166
+ {
2167
+ "parameters": [
2168
+ {
2169
+ "name": "gpu_mem_megabytes",
2170
+ "type": "functionParameter"
2171
+ }
2172
+ ],
2173
+ "type": "function"
2174
+ },
2175
+ {
2176
+ "parameters": [
2177
+ {
2178
+ "name": "\"time\"",
2179
+ "type": "functionParameter"
2180
+ }
2181
+ ],
2182
+ "type": "function"
2183
+ }
2184
+ ],
2185
+ "groupBy": [
2186
+ {
2187
+ "property": {
2188
+ "type": "string"
2189
+ },
2190
+ "type": "groupBy"
2191
+ }
2192
+ ],
2193
+ "limit": 50,
2194
+ "whereJsonTree": {
2195
+ "children1": [
2196
+ {
2197
+ "id": "baa888b8-89ab-4cde-b012-31922f8671e9",
2198
+ "properties": {
2199
+ "field": "commit_id",
2200
+ "fieldSrc": "field",
2201
+ "operator": "equal",
2202
+ "value": [
2203
+ "${commit}"
2204
+ ],
2205
+ "valueError": [
2206
+ null
2207
+ ],
2208
+ "valueSrc": [
2209
+ "value"
2210
+ ],
2211
+ "valueType": [
2212
+ "text"
2213
+ ]
2214
+ },
2215
+ "type": "rule"
2216
+ }
2217
+ ],
2218
+ "id": "bab88a98-0123-4456-b89a-b1922f7d4f11",
2219
+ "type": "group"
2220
+ },
2221
+ "whereString": "commit_id = '${commit}'"
2222
+ },
2223
+ "table": "measurements"
2224
+ }
2225
+ ],
2226
+ "title": "GPU memory usage",
2227
+ "transparent": true,
2228
+ "type": "timeseries"
2229
+ }
2230
+ ],
2231
+ "title": "Usage metrics",
2232
+ "type": "row"
2233
+ }
2234
+ ],
2235
+ "schemaVersion": 39,
2236
+ "tags": [],
2237
+ "templating": {
2238
+ "list": [
2239
+ {
2240
+ "current": {
2241
+ "selected": false,
2242
+ "text": "main",
2243
+ "value": "main"
2244
+ },
2245
+ "datasource": {
2246
+ "default": true,
2247
+ "type": "grafana-postgresql-datasource",
2248
+ "uid": "be28nkzirtb0gd"
2249
+ },
2250
+ "definition": "SELECT DISTINCT branch FROM benchmarks;",
2251
+ "description": "",
2252
+ "hide": 0,
2253
+ "includeAll": false,
2254
+ "label": "branch",
2255
+ "multi": false,
2256
+ "name": "branch",
2257
+ "options": [],
2258
+ "query": "SELECT DISTINCT branch FROM benchmarks;",
2259
+ "refresh": 1,
2260
+ "regex": "",
2261
+ "skipUrlSync": false,
2262
+ "sort": 0,
2263
+ "type": "query"
2264
+ },
2265
+ {
2266
+ "current": {
2267
+ "selected": false,
2268
+ "text": "1729701492845",
2269
+ "value": "1729701492845"
2270
+ },
2271
+ "datasource": {
2272
+ "default": true,
2273
+ "type": "grafana-postgresql-datasource",
2274
+ "uid": "be28nkzirtb0gd"
2275
+ },
2276
+ "definition": "SELECT created_at - INTERVAL '5 secs' FROM benchmarks WHERE branch = '${branch}' ORDER BY benchmark_id ASC LIMIT 1;",
2277
+ "description": "",
2278
+ "hide": 2,
2279
+ "includeAll": false,
2280
+ "multi": false,
2281
+ "name": "StartTime",
2282
+ "options": [],
2283
+ "query": "SELECT created_at - INTERVAL '5 secs' FROM benchmarks WHERE branch = '${branch}' ORDER BY benchmark_id ASC LIMIT 1;",
2284
+ "refresh": 2,
2285
+ "regex": "",
2286
+ "skipUrlSync": false,
2287
+ "sort": 0,
2288
+ "type": "query"
2289
+ },
2290
+ {
2291
+ "current": {
2292
+ "selected": false,
2293
+ "text": "1730393397577",
2294
+ "value": "1730393397577"
2295
+ },
2296
+ "datasource": {
2297
+ "default": true,
2298
+ "type": "grafana-postgresql-datasource",
2299
+ "uid": "be28nkzirtb0gd"
2300
+ },
2301
+ "definition": "SELECT time + INTERVAL '5 secs' FROM benchmarks AS b JOIN device_measurements AS d ON b.benchmark_id = d.benchmark_id WHERE branch = '${branch}' ORDER BY b.benchmark_id DESC, d.measurement_id DESC LIMIT 1;",
2302
+ "description": "",
2303
+ "hide": 2,
2304
+ "includeAll": false,
2305
+ "multi": false,
2306
+ "name": "EndTime",
2307
+ "options": [],
2308
+ "query": "SELECT time + INTERVAL '5 secs' FROM benchmarks AS b JOIN device_measurements AS d ON b.benchmark_id = d.benchmark_id WHERE branch = '${branch}' ORDER BY b.benchmark_id DESC, d.measurement_id DESC LIMIT 1;",
2309
+ "refresh": 1,
2310
+ "regex": "",
2311
+ "skipUrlSync": false,
2312
+ "sort": 0,
2313
+ "type": "query"
2314
+ },
2315
+ {
2316
+ "current": {
2317
+ "selected": false,
2318
+ "text": "NVIDIA A10G",
2319
+ "value": "NVIDIA A10G"
2320
+ },
2321
+ "datasource": {
2322
+ "type": "grafana-postgresql-datasource",
2323
+ "uid": "be28nkzirtb0gd"
2324
+ },
2325
+ "definition": "SELECT DISTINCT metadata->>'gpu_name' FROM benchmarks;",
2326
+ "description": "",
2327
+ "hide": 0,
2328
+ "includeAll": false,
2329
+ "label": "GPU",
2330
+ "multi": false,
2331
+ "name": "gpu_name",
2332
+ "options": [],
2333
+ "query": "SELECT DISTINCT metadata->>'gpu_name' FROM benchmarks;",
2334
+ "refresh": 1,
2335
+ "regex": "",
2336
+ "skipUrlSync": false,
2337
+ "sort": 0,
2338
+ "type": "query"
2339
+ },
2340
+ {
2341
+ "current": {
2342
+ "selected": true,
2343
+ "text": "10",
2344
+ "value": "10"
2345
+ },
2346
+ "description": "The number of commits to display, going from most recent to the nth commit.",
2347
+ "hide": 0,
2348
+ "label": "Last # of commits",
2349
+ "name": "last_n_commits",
2350
+ "options": [
2351
+ {
2352
+ "selected": true,
2353
+ "text": "10",
2354
+ "value": "10"
2355
+ }
2356
+ ],
2357
+ "query": "10",
2358
+ "skipUrlSync": false,
2359
+ "type": "textbox"
2360
+ }
2361
+ ]
2362
+ },
2363
+ "time": {
2364
+ "from": "now-1h",
2365
+ "to": "now"
2366
+ },
2367
+ "timepicker": {
2368
+ "hidden": false
2369
+ },
2370
+ "timezone": "browser",
2371
+ "title": "Transformers benchmarks",
2372
+ "uid": "fdz33iyzln9c0a",
2373
+ "version": 10,
2374
+ "weekStart": ""
2375
+ }
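Note: a minimal sketch of how this dashboard JSON could be loaded into a Grafana instance over Grafana's HTTP API; the server URL and API token below are placeholders, not values from this repository.

import json
import requests

# Assumption: a Grafana server is reachable at the placeholder URL and the token has dashboard write rights.
with open("grafana_dashboard.json") as f:
    dashboard = json.load(f)

resp = requests.post(
    "http://localhost:3000/api/dashboards/db",               # placeholder Grafana URL
    headers={"Authorization": "Bearer <grafana-api-token>"},  # placeholder token
    json={"dashboard": dashboard, "overwrite": True},
)
resp.raise_for_status()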
docs/transformers/benchmark/grafana_datasource.yaml ADDED
@@ -0,0 +1,17 @@
1
+ apiVersion: 1
2
+ datasources:
3
+ - name: grafana-postgresql-datasource
4
+ uid: be28nkzirtb0gd
5
+ type: postgres
6
+ url: $GRAFANA_POSTGRES_DATASOURCE_URL
7
+ user: $GRAFANA_POSTGRES_DATASOURCE_USER
8
+ secureJsonData:
9
+ password: $GRAFANA_POSTGRES_DATASOURCE_PWD
10
+ jsonData:
11
+ database: metrics
12
+ maxOpenConns: 100
13
+ maxIdleConns: 100
14
+ maxIdleConnsAuto: true
15
+ connMaxLifetime: 14400
16
+ postgresVersion: 1000
17
+ timescaledb: false
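Note: the datasource file above relies on $GRAFANA_POSTGRES_DATASOURCE_URL, $GRAFANA_POSTGRES_DATASOURCE_USER and $GRAFANA_POSTGRES_DATASOURCE_PWD being expanded before Grafana reads it. A minimal Python sketch of that substitution, assuming those variables are set in the environment; the output path is illustrative only.

import os
from string import Template

# Render the provisioning file by expanding the $GRAFANA_POSTGRES_DATASOURCE_* placeholders.
with open("grafana_datasource.yaml") as f:
    rendered = Template(f.read()).substitute(os.environ)

# Illustrative destination; Grafana reads provisioned datasources from its own provisioning directory.
with open("/etc/grafana/provisioning/datasources/grafana_datasource.yaml", "w") as f:
    f.write(rendered)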
docs/transformers/benchmark/init_db.sql ADDED
@@ -0,0 +1,33 @@
1
+ CREATE TABLE IF NOT EXISTS benchmarks (
2
+ benchmark_id SERIAL PRIMARY KEY,
3
+ branch VARCHAR(255),
4
+ commit_id VARCHAR(72),
5
+ commit_message VARCHAR(70),
6
+ metadata jsonb,
7
+ created_at timestamp without time zone NOT NULL DEFAULT (current_timestamp AT TIME ZONE 'UTC')
8
+ );
9
+
10
+ CREATE INDEX IF NOT EXISTS benchmarks_benchmark_id_idx ON benchmarks (benchmark_id);
11
+
12
+ CREATE INDEX IF NOT EXISTS benchmarks_branch_idx ON benchmarks (branch);
13
+
14
+ CREATE TABLE IF NOT EXISTS device_measurements (
15
+ measurement_id SERIAL PRIMARY KEY,
16
+ benchmark_id int REFERENCES benchmarks (benchmark_id),
17
+ cpu_util double precision,
18
+ mem_megabytes double precision,
19
+ gpu_util double precision,
20
+ gpu_mem_megabytes double precision,
21
+ time timestamp without time zone NOT NULL DEFAULT (current_timestamp AT TIME ZONE 'UTC')
22
+ );
23
+
24
+ CREATE INDEX IF NOT EXISTS device_measurements_branch_idx ON device_measurements (benchmark_id);
25
+
26
+ CREATE TABLE IF NOT EXISTS model_measurements (
27
+ measurement_id SERIAL PRIMARY KEY,
28
+ benchmark_id int REFERENCES benchmarks (benchmark_id),
29
+ measurements jsonb,
30
+ time timestamp without time zone NOT NULL DEFAULT (current_timestamp AT TIME ZONE 'UTC')
31
+ );
32
+
33
+ CREATE INDEX IF NOT EXISTS model_measurements_branch_idx ON model_measurements (benchmark_id);
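Note: a minimal sketch of applying this schema to the metrics database with psycopg2 (the driver already used by llama.py below); the connection string mirrors the one used there, and the script path is illustrative.

import psycopg2

# Assumption: a PostgreSQL database named "metrics" is reachable with default local credentials.
with psycopg2.connect("dbname=metrics") as conn:
    with conn.cursor() as cur:
        with open("init_db.sql") as f:
            cur.execute(f.read())  # all statements are idempotent (CREATE ... IF NOT EXISTS)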
docs/transformers/benchmark/llama.py ADDED
@@ -0,0 +1,342 @@
1
+ from logging import Logger
2
+ import os
3
+ from threading import Event, Thread
4
+ from time import perf_counter, sleep
5
+ from typing import Optional
6
+ from benchmarks_entrypoint import MetricsRecorder
7
+ import gpustat
8
+ import psutil
9
+ import psycopg2
10
+ import torch
11
+
12
+ from transformers import AutoModelForCausalLM, AutoTokenizer, GenerationConfig, StaticCache
13
+
14
+
15
+ os.environ["HF_HUB_ENABLE_HF_TRANSFER"] = "1"
16
+
17
+ os.environ["TOKENIZERS_PARALLELISM"] = "1"
18
+ torch.set_float32_matmul_precision("high")
19
+
20
+
21
+ def collect_metrics(benchmark_id, continue_metric_collection, metrics_recorder):
22
+ p = psutil.Process(os.getpid())
23
+ while not continue_metric_collection.is_set():
24
+ with p.oneshot():
25
+ cpu_util = p.cpu_percent()
26
+ mem_megabytes = p.memory_info().rss / (1024 * 1024)
27
+ gpu_stats = gpustat.GPUStatCollection.new_query()
28
+ gpu_util = gpu_stats[0]["utilization.gpu"]
29
+ gpu_mem_megabytes = gpu_stats[0]["memory.used"]
30
+ metrics_recorder.collect_device_measurements(
31
+ benchmark_id, cpu_util, mem_megabytes, gpu_util, gpu_mem_megabytes
32
+ )
33
+ sleep(0.01)
34
+
35
+
36
+ def run_benchmark(logger: Logger, branch: str, commit_id: str, commit_msg: str, num_tokens_to_generate=100):
37
+ continue_metric_collection = Event()
38
+ metrics_thread = None
39
+ model_id = "meta-llama/Llama-2-7b-hf"
40
+ metrics_recorder = MetricsRecorder(psycopg2.connect("dbname=metrics"), logger, branch, commit_id, commit_msg)
41
+ try:
42
+ gpu_stats = gpustat.GPUStatCollection.new_query()
43
+ gpu_name = gpu_stats[0]["name"]
44
+ benchmark_id = metrics_recorder.initialise_benchmark({"gpu_name": gpu_name, "model_id": model_id})
45
+ logger.info(f"running benchmark #{benchmark_id} on {gpu_name} for {model_id}")
46
+ metrics_thread = Thread(
47
+ target=collect_metrics,
48
+ args=[benchmark_id, continue_metric_collection, metrics_recorder],
49
+ )
50
+ metrics_thread.start()
51
+ logger.info("started background thread to fetch device metrics")
52
+
53
+ os.environ["TOKENIZERS_PARALLELISM"] = "false" # silence warnings when compiling
54
+
55
+ device = "cuda"
56
+
57
+ logger.info("downloading weights")
58
+ # This is to avoid counting the download in the model load time measurement
59
+ model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.float16)
60
+ gen_config = GenerationConfig(do_sample=False, top_p=1, temperature=1)
61
+ logger.info("loading model")
62
+ start = perf_counter()
63
+ model = AutoModelForCausalLM.from_pretrained(
64
+ model_id, torch_dtype=torch.float16, generation_config=gen_config
65
+ ).eval()
66
+ model.to(device)
67
+ torch.cuda.synchronize()
68
+ end = perf_counter()
69
+ model_load_time = end - start
70
+ logger.info(f"loaded model in: {model_load_time}s")
71
+
72
+ tokenizer = AutoTokenizer.from_pretrained(model_id)
73
+
74
+ prompt = "Why dogs are so cute?"
75
+ inputs = tokenizer(prompt, return_tensors="pt").to(device)
76
+
77
+ # Specify the max length (including both the prompt and the response)
78
+ # When calling `generate` with `cache_implementation="static"` later, this is also used to create a `StaticCache` object
79
+ # with sequence length = `max_length`. The longer it is, the more the cache can be re-used.
80
+ seq_length = inputs["input_ids"].shape[1]
81
+ model.generation_config.max_length = seq_length + num_tokens_to_generate
82
+ batch_size = inputs["input_ids"].shape[0]
83
+
84
+ # Copied from the gpt-fast repo
85
+ def multinomial_sample_one_no_sync(probs_sort): # Does multinomial sampling without a cuda synchronization
86
+ q = torch.empty_like(probs_sort).exponential_(1)
87
+ return torch.argmax(probs_sort / q, dim=-1, keepdim=True).to(dtype=torch.int)
88
+
89
+ def logits_to_probs(logits, temperature: float = 1.0, top_k: Optional[int] = None):
90
+ logits = logits / max(temperature, 1e-5)
91
+
92
+ if top_k is not None:
93
+ v, _ = torch.topk(logits, min(top_k, logits.size(-1)))
94
+ pivot = v.select(-1, -1).unsqueeze(-1)
95
+ logits = torch.where(logits < pivot, -float("Inf"), logits)
96
+ probs = torch.nn.functional.softmax(logits, dim=-1)
97
+ return probs
98
+
99
+ def sample(logits, temperature: float = 1.0, top_k: Optional[int] = None):
100
+ probs = logits_to_probs(logits[:, -1], temperature, top_k)
101
+ idx_next = multinomial_sample_one_no_sync(probs)
102
+ return idx_next, probs
103
+
104
+ def decode_one_token(model, cur_token, cache_position, past_key_values):
105
+ logits = model(
106
+ cur_token,
107
+ cache_position=cache_position,
108
+ past_key_values=past_key_values,
109
+ return_dict=False,
110
+ use_cache=True,
111
+ )[0]
112
+ new_token = sample(logits, temperature=0.6, top_k=5)[0]
113
+ return new_token
114
+
115
+ #########
116
+ # Eager #
117
+ #########
118
+ with torch.no_grad():
119
+ past_key_values = StaticCache(
120
+ model.config,
121
+ max_batch_size=batch_size,
122
+ device=device,
123
+ dtype=torch.float16,
124
+ max_cache_len=seq_length + num_tokens_to_generate,
125
+ )
126
+ cache_position = torch.arange(seq_length, device=device)
127
+ start = perf_counter()
128
+ model(
129
+ **inputs,
130
+ cache_position=cache_position,
131
+ past_key_values=past_key_values,
132
+ return_dict=False,
133
+ use_cache=True,
134
+ )
135
+ end = perf_counter()
136
+ first_eager_fwd_pass_time = end - start
137
+ logger.info(f"completed first eager fwd pass in: {first_eager_fwd_pass_time}s")
138
+ start = perf_counter()
139
+ output = model.generate(**inputs, do_sample=False)
140
+ end = perf_counter()
141
+ first_eager_generate_time = end - start
142
+ logger.info(f"completed first eager generation in: {first_eager_generate_time}s")
143
+ logger.info(f"generated: {tokenizer.batch_decode(output.cpu().tolist())}")
144
+
145
+ past_key_values = StaticCache(
146
+ model.config,
147
+ max_batch_size=batch_size,
148
+ device=device,
149
+ dtype=torch.float16,
150
+ max_cache_len=seq_length + num_tokens_to_generate,
151
+ )
152
+ cache_position = torch.arange(seq_length, device=device)
153
+ start = perf_counter()
154
+ model(
155
+ **inputs,
156
+ cache_position=cache_position,
157
+ past_key_values=past_key_values,
158
+ return_dict=False,
159
+ use_cache=True,
160
+ )
161
+ end = perf_counter()
162
+ second_eager_fwd_pass_time = end - start
163
+ logger.info(f"completed second eager fwd pass in: {second_eager_fwd_pass_time}s")
164
+ start = perf_counter()
165
+ model.generate(**inputs, do_sample=False)
166
+ end = perf_counter()
167
+ second_eager_generate_time = end - start
168
+ logger.info(f"completed second eager generation in: {second_eager_generate_time}s")
169
+ logger.info(f"generated: {tokenizer.batch_decode(output.cpu().tolist())}")
170
+
171
+ torch.compiler.reset()
172
+
173
+ ################
174
+ # Forward pass #
175
+ ################
176
+
177
+ # `torch.compile(model, ...)` is not recommended, as it would also compile callbacks
178
+ # and the full generate loop. We recommend compiling only the forward pass for now.
179
+ # "reduce-overhead" will use cudagraphs.
180
+ generated_ids = torch.zeros(
181
+ (batch_size, num_tokens_to_generate + seq_length), dtype=torch.int, device=device
182
+ )
183
+
184
+ generated_ids[:, :seq_length] = inputs["input_ids"]
185
+ decode_one_token = torch.compile(decode_one_token, mode="reduce-overhead", fullgraph=True)
186
+ # model.forward = torch.compile(model.forward, mode="reduce-overhead", fullgraph=True)
187
+ # TODO use decode_one_token(model, input_id.clone(), cache_position) for verification
188
+ past_key_values = StaticCache(
189
+ model.config,
190
+ max_batch_size=batch_size,
191
+ device=device,
192
+ dtype=torch.float16,
193
+ max_cache_len=seq_length + num_tokens_to_generate + 10,
194
+ )
195
+ cache_position = torch.arange(seq_length, device=device)
196
+ all_generated_tokens = []
197
+ ### First compile, prefill
198
+ start = perf_counter()
199
+ next_token = decode_one_token(
200
+ model, inputs["input_ids"], cache_position=cache_position, past_key_values=past_key_values
201
+ )
202
+ torch.cuda.synchronize()
203
+ end = perf_counter()
204
+ time_to_first_token = end - start
205
+ logger.info(f"completed first compile generation in: {time_to_first_token}s")
206
+ cache_position += 1
207
+ all_generated_tokens += next_token.tolist()
208
+
209
+ cache_position = torch.tensor([seq_length], device=device)
210
+ ### First compile, decoding
211
+ start = perf_counter()
212
+ next_token = decode_one_token(
213
+ model, next_token.clone(), cache_position=cache_position, past_key_values=past_key_values
214
+ )
215
+ torch.cuda.synchronize()
216
+ end = perf_counter()
217
+ time_to_second_token = end - start
218
+ logger.info(f"completed second compile generation in: {time_to_second_token}s")
219
+ cache_position += 1
220
+ all_generated_tokens += next_token.tolist()
221
+
222
+ ### Second compile, decoding
223
+ start = perf_counter()
224
+ next_token = decode_one_token(
225
+ model, next_token.clone(), cache_position=cache_position, past_key_values=past_key_values
226
+ )
227
+ torch.cuda.synchronize()
228
+ end = perf_counter()
229
+ time_to_third_token = end - start
230
+ logger.info(f"completed third compile forward in: {time_to_third_token}s")
231
+ cache_position += 1
232
+ all_generated_tokens += next_token.tolist()
233
+
234
+ ### Using cuda graphs decoding
235
+
236
+ start = perf_counter()
237
+ for _ in range(1, num_tokens_to_generate):
238
+ all_generated_tokens += next_token.tolist()
239
+ next_token = decode_one_token(
240
+ model, next_token.clone(), cache_position=cache_position, past_key_values=past_key_values
241
+ )
242
+ cache_position += 1
243
+ torch.cuda.synchronize()
244
+ end = perf_counter()
245
+ mean_time_to_next_token = (end - start) / num_tokens_to_generate
246
+ logger.info(f"completed next compile generation in: {mean_time_to_next_token}s")
247
+ logger.info(f"generated: {tokenizer.batch_decode(all_generated_tokens)}")
248
+
249
+ ####################
250
+ # Generate compile #
251
+ ####################
252
+ torch.compiler.reset()
253
+ # We will not compile the full generate call, as it is too intensive, though we do measure the full forward pass.
254
+
255
+ past_key_values = StaticCache(
256
+ model.config,
257
+ max_batch_size=batch_size,
258
+ device=device,
259
+ dtype=torch.float16,
260
+ max_cache_len=seq_length + 128,
261
+ )
262
+
263
+ # 1st call
264
+ start = perf_counter()
265
+ output = model.generate(**inputs, past_key_values=past_key_values)
266
+ torch.cuda.synchronize()
267
+ end = perf_counter()
268
+ first_compile_generate_time = end - start
269
+ logger.info(f"completed first compile generation in: {first_compile_generate_time}s")
270
+ logger.info(f"generated: {tokenizer.batch_decode(output.cpu().tolist())}")
271
+
272
+ past_key_values = StaticCache(
273
+ model.config,
274
+ max_batch_size=batch_size,
275
+ device=device,
276
+ dtype=torch.float16,
277
+ max_cache_len=seq_length + 128,
278
+ )
279
+ # 2nd call
280
+ start = perf_counter()
281
+ output = model.generate(**inputs, past_key_values=past_key_values)
282
+ torch.cuda.synchronize()
283
+ end = perf_counter()
284
+ second_compile_generate_time = end - start
285
+ logger.info(f"completed second compile generation in: {second_compile_generate_time}s")
286
+ logger.info(f"generated: {tokenizer.batch_decode(output.cpu().tolist())}")
287
+
288
+ past_key_values = StaticCache(
289
+ model.config,
290
+ max_batch_size=batch_size,
291
+ device=device,
292
+ dtype=torch.float16,
293
+ max_cache_len=seq_length + 128,
294
+ )
295
+
296
+ # 3rd call
297
+ start = perf_counter()
298
+ output = model.generate(**inputs, past_key_values=past_key_values)
299
+ end = perf_counter()
300
+ third_compile_generate_time = end - start
301
+ logger.info(f"completed third compile generation in: {third_compile_generate_time}s")
302
+ logger.info(f"generated: {tokenizer.batch_decode(output.cpu().tolist())}")
303
+
304
+ past_key_values = StaticCache(
305
+ model.config,
306
+ max_batch_size=batch_size,
307
+ device=device,
308
+ dtype=torch.float16,
309
+ max_cache_len=seq_length + 128,
310
+ )
311
+ # 4th call
312
+ start = perf_counter()
313
+ output = model.generate(**inputs, past_key_values=past_key_values)
314
+ end = perf_counter()
315
+ fourth_compile_generate_time = end - start
316
+ logger.info(f"completed fourth compile generation in: {fourth_compile_generate_time}s")
317
+ logger.info(f"generated: {tokenizer.batch_decode(output.cpu().tolist())}")
318
+
319
+ metrics_recorder.collect_model_measurements(
320
+ benchmark_id,
321
+ {
322
+ "model_load_time": model_load_time,
323
+ "first_eager_forward_pass_time_secs": first_eager_fwd_pass_time,
324
+ "second_eager_forward_pass_time_secs": second_eager_fwd_pass_time,
325
+ "first_eager_generate_time_secs": first_eager_generate_time,
326
+ "second_eager_generate_time_secs": second_eager_generate_time,
327
+ "time_to_first_token_secs": time_to_first_token,
328
+ "time_to_second_token_secs": time_to_second_token,
329
+ "time_to_third_token_secs": time_to_third_token,
330
+ "time_to_next_token_mean_secs": mean_time_to_next_token,
331
+ "first_compile_generate_time_secs": first_compile_generate_time,
332
+ "second_compile_generate_time_secs": second_compile_generate_time,
333
+ "third_compile_generate_time_secs": third_compile_generate_time,
334
+ "fourth_compile_generate_time_secs": fourth_compile_generate_time,
335
+ },
336
+ )
337
+ except Exception as e:
338
+ logger.error(f"Caught exception: {e}")
339
+ continue_metric_collection.set()
340
+ if metrics_thread is not None:
341
+ metrics_thread.join()
342
+ metrics_recorder.close()
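Note: a minimal, hypothetical driver for run_benchmark, shown only as a sketch; the logger name, branch, commit id and commit message below are placeholders, not values from this repository.

import logging

logging.basicConfig(level=logging.INFO)
run_benchmark(
    logging.getLogger("llama_benchmark"),  # placeholder logger name
    branch="main",
    commit_id="<commit sha>",
    commit_msg="<commit message>",
    num_tokens_to_generate=100,
)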