Improve model card with paper and project page links
This PR improves the model card by adding links to the paper and project page for better context and accessibility.
README.md
CHANGED
@@ -1,16 +1,17 @@
 ---
+base_model: speakleash/Bielik-4.5B-v3.0-Instruct
 language:
 - pl
-license: apache-2.0
 library_name: transformers
+license: apache-2.0
+pipeline_tag: text-generation
 tags:
 - finetuned
 - gguf
 - 8bit
 inference: false
-pipeline_tag: text-generation
-base_model: speakleash/Bielik-4.5B-v3.0-Instruct
 ---
+
 <p align="center">
   <img src="https://huggingface.co/speakleash/Bielik-7B-Instruct-v0.1-GGUF/raw/main/speakleash_cyfronet.png">
 </p>
@@ -21,7 +22,9 @@ This model was obtained by quantizing the weights and activations of [Bielik-4.5
 AutoFP8 is used for quantization. This optimization reduces the number of bits per parameter from 16 to 8, reducing the disk size and GPU memory requirements by approximately 50%.
 Only the weights and activations of the linear operators within transformer blocks are quantized. Symmetric per-tensor quantization is applied, in which a single linear scaling maps the FP8 representations of the quantized weights and activations.
 
-📚 Technical report: https://
+📚 Technical report: [Bielik v3 Small: Technical Report](https://huggingface.co/papers/2505.02550)
+
+Project page: https://bielik.ai/
 
 FP8 computation is supported on Nvidia GPUs with compute capability >= 8.9 (Ada Lovelace, Hopper).
 
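For context, the first hunk only reorders the YAML front matter, which tooling reads programmatically. A minimal sketch of inspecting the updated metadata with `huggingface_hub` (the repo id below is a placeholder, since the diff does not name the target repository):

```python
# Sketch: read the model card's YAML metadata after this change.
# "speakleash/Bielik-4.5B-v3.0-Instruct-FP8" is a hypothetical repo id;
# substitute the actual repository this PR targets.
from huggingface_hub import ModelCard

card = ModelCard.load("speakleash/Bielik-4.5B-v3.0-Instruct-FP8")
print(card.data.base_model)    # speakleash/Bielik-4.5B-v3.0-Instruct
print(card.data.license)       # apache-2.0
print(card.data.pipeline_tag)  # text-generation
```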
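The quantization scheme the card describes, symmetric per-tensor FP8, amounts to computing a single scale per tensor and casting. Below is a minimal PyTorch sketch of that idea; it illustrates the scheme, not the AutoFP8 implementation, and assumes the e4m3 format (whose finite maximum is 448):

```python
# Sketch of symmetric per-tensor FP8 (e4m3) quantization, as described
# in the card. Illustrative only; not the AutoFP8 implementation.
import torch

FP8_MAX = torch.finfo(torch.float8_e4m3fn).max  # 448.0 for e4m3

def quantize_per_tensor_fp8(x: torch.Tensor):
    # One symmetric scale for the whole tensor: map the largest
    # absolute value onto the FP8 dynamic range.
    scale = x.abs().max().clamp(min=1e-12) / FP8_MAX
    x_fp8 = (x / scale).clamp(-FP8_MAX, FP8_MAX).to(torch.float8_e4m3fn)
    return x_fp8, scale

def dequantize_fp8(x_fp8: torch.Tensor, scale: torch.Tensor):
    # Undo the linear scaling to recover an approximation of the input.
    return x_fp8.to(torch.float32) * scale

w = torch.randn(1024, 1024)
w_fp8, w_scale = quantize_per_tensor_fp8(w)
w_hat = dequantize_fp8(w_fp8, w_scale)
print(w_fp8.dtype, w_scale.item())     # torch.float8_e4m3fn, scale
print((w - w_hat).abs().max().item())  # worst-case quantization error
```

Storing `w_fp8` plus one scalar scale is what halves the disk and GPU memory footprint relative to 16-bit weights.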
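The hardware note at the end of the card (compute capability >= 8.9) can be verified at runtime. A small sketch, assuming PyTorch with a CUDA build:

```python
# Sketch: check whether the visible GPU supports native FP8 compute
# (SM >= 8.9, i.e. Ada Lovelace or Hopper), per the note in the card.
import torch

if torch.cuda.is_available():
    major, minor = torch.cuda.get_device_capability()
    fp8_ok = (major, minor) >= (8, 9)
    print(f"SM {major}.{minor}: FP8 compute "
          f"{'supported' if fp8_ok else 'not supported'}")
else:
    print("No CUDA device visible; FP8 kernels unavailable.")
```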