gpt #97, opened by xaibie
- README.md +2 -16
- generation_config.json +1 -2
README.md CHANGED
@@ -13,7 +13,7 @@ tags:
 <p align="center">
 <a href="https://gpt-oss.com"><strong>Try gpt-oss</strong></a> ·
 <a href="https://cookbook.openai.com/topic/gpt-oss"><strong>Guides</strong></a> ·
-<a href="https://
+<a href="https://openai.com/index/gpt-oss-model-card"><strong>Model card</strong></a> ·
 <a href="https://openai.com/index/introducing-gpt-oss/"><strong>OpenAI blog</strong></a>
 </p>
 
@@ -38,7 +38,7 @@ Both models were trained on our [harmony response format](https://github.com/ope
 * **Full chain-of-thought:** Gain complete access to the model’s reasoning process, facilitating easier debugging and increased trust in outputs. It’s not intended to be shown to end users.
 * **Fine-tunable:** Fully customize models to your specific use case through parameter fine-tuning.
 * **Agentic capabilities:** Use the models’ native capabilities for function calling, [web browsing](https://github.com/openai/gpt-oss/tree/main?tab=readme-ov-file#browser), [Python code execution](https://github.com/openai/gpt-oss/tree/main?tab=readme-ov-file#python), and Structured Outputs.
-* **MXFP4 quantization:** The models
+* **Native MXFP4 quantization:** The models are trained with native MXFP4 precision for the MoE layer, making `gpt-oss-120b` run on a single 80GB GPU (like NVIDIA H100 or AMD MI300X) and the `gpt-oss-20b` model run within 16GB of memory.
 
 ---
 
@@ -166,17 +166,3 @@ The gpt-oss models are excellent for:
 Both gpt-oss models can be fine-tuned for a variety of specialized use cases.
 
 This smaller model `gpt-oss-20b` can be fine-tuned on consumer hardware, whereas the larger [`gpt-oss-120b`](https://huggingface.co/openai/gpt-oss-120b) can be fine-tuned on a single H100 node.
-
-# Citation
-
-```bibtex
-@misc{openai2025gptoss120bgptoss20bmodel,
-title={gpt-oss-120b & gpt-oss-20b Model Card},
-author={OpenAI},
-year={2025},
-eprint={2508.10925},
-archivePrefix={arXiv},
-primaryClass={cs.CL},
-url={https://arxiv.org/abs/2508.10925},
-}
-```
generation_config.json CHANGED
@@ -3,8 +3,7 @@
 "do_sample": true,
 "eos_token_id": [
 200002,
-199999
-200012
+199999
 ],
 "pad_token_id": 199999,
 "transformers_version": "4.55.0.dev0"
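The `generation_config.json` hunk shortens the `eos_token_id` list. As a rough way to confirm which stop-token IDs the change drops, the sketch below compares the two versions of the list. It assumes only the keys visible in the hunk (the rest of the file is not shown here), and the helper names `eos_ids` and `eos_diff` are hypothetical, not part of any library.

```python
import json

# Config fragments reconstructed from the hunk above.
# Assumption: only the keys visible in the diff are included here.
OLD_CONFIG = """
{
  "do_sample": true,
  "eos_token_id": [200002, 199999, 200012],
  "pad_token_id": 199999
}
"""

NEW_CONFIG = """
{
  "do_sample": true,
  "eos_token_id": [200002, 199999],
  "pad_token_id": 199999
}
"""


def eos_ids(config_text):
    """Return the set of EOS token IDs from a generation config (hypothetical helper)."""
    value = json.loads(config_text)["eos_token_id"]
    # The field may hold a single integer or a list of integers.
    return {value} if isinstance(value, int) else set(value)


def eos_diff(old_text, new_text):
    """Return (removed, added) EOS token IDs as sorted lists (hypothetical helper)."""
    old_ids, new_ids = eos_ids(old_text), eos_ids(new_text)
    return sorted(old_ids - new_ids), sorted(new_ids - old_ids)


removed, added = eos_diff(OLD_CONFIG, NEW_CONFIG)
print("removed:", removed)
print("added:", added)
```

If the file is loaded as a Transformers generation config, generation should stop when any ID in `eos_token_id` is produced, so after this change token `200012` would no longer end generation by default.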