nielsr (HF Staff) committed
Commit 8aecf67 · verified · 1 parent: c2af16d

Improve model card with paper and project page links


This PR improves the model card by adding links to the paper and project page for better context and accessibility.

Files changed (1)
  1. README.md +7 -4
README.md CHANGED
@@ -1,16 +1,17 @@
 ---
+base_model: speakleash/Bielik-4.5B-v3.0-Instruct
 language:
 - pl
-license: apache-2.0
 library_name: transformers
+license: apache-2.0
+pipeline_tag: text-generation
 tags:
 - finetuned
 - gguf
 - 8bit
 inference: false
-pipeline_tag: text-generation
-base_model: speakleash/Bielik-4.5B-v3.0-Instruct
 ---
+
 <p align="center">
 <img src="https://huggingface.co/speakleash/Bielik-7B-Instruct-v0.1-GGUF/raw/main/speakleash_cyfronet.png">
 </p>
@@ -21,7 +22,9 @@ This model was obtained by quantizing the weights and activations of [Bielik-4.5
 AutoFP8 is used for quantization. This optimization reduces the number of bits per parameter from 16 to 8, reducing the disk size and GPU memory requirements by approximately 50%.
 Only the weights and activations of the linear operators within transformers blocks are quantized. Symmetric per-tensor quantization is applied, in which a single linear scaling maps the FP8 representations of the quantized weights and activations.

-📚 Technical report: https://arxiv.org/abs/2505.02550
+📚 Technical report: [Bielik v3 Small: Technical Report](https://huggingface.co/papers/2505.02550)
+
+Project page: https://bielik.ai/

 FP8 compuation is supported on Nvidia GPUs with compute capability > 8.9 (Ada Lovelace, Hopper).
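
The model card text above describes symmetric per-tensor FP8 quantization: one linear scale per weight or activation tensor maps values into the FP8 range. Below is a minimal PyTorch sketch of that mapping, assuming the `float8_e4m3fn` format (max finite value 448); the tensor and the resulting scale are illustrative, not taken from the released checkpoint.

```python
import torch

# Symmetric per-tensor FP8 (E4M3) quantization sketch: a single linear scale per
# tensor maps its values into the FP8 range, as the model card describes.
# The weight tensor below is a stand-in, not an actual Bielik weight.

FP8_MAX = 448.0  # largest finite value representable in torch.float8_e4m3fn

def quantize_per_tensor(x: torch.Tensor):
    """Return an FP8 copy of `x` plus the one scale shared by the whole tensor."""
    scale = x.abs().max().clamp(min=1e-12) / FP8_MAX
    x_fp8 = (x / scale).clamp(-FP8_MAX, FP8_MAX).to(torch.float8_e4m3fn)
    return x_fp8, scale

def dequantize_per_tensor(x_fp8: torch.Tensor, scale: torch.Tensor) -> torch.Tensor:
    """Map the FP8 values back to an approximation of the original tensor."""
    return x_fp8.to(torch.float32) * scale

weight = torch.randn(4096, 4096)                 # stand-in for one linear layer's weight
w_fp8, w_scale = quantize_per_tensor(weight)
w_approx = dequantize_per_tensor(w_fp8, w_scale)
print(w_fp8.dtype, w_scale.item(), (weight - w_approx).abs().max().item())
```

Storing each parameter in 8 bits instead of 16 is what halves the disk and GPU memory footprint; the round trip above shows the rounding error the 8-bit format introduces.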
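The card also names AutoFP8 as the quantization tool. The sketch below shows how such a run is typically driven with the vllm-project AutoFP8 library; the model id is the base model from the diff, while the calibration prompt, config values, and output directory are assumptions for illustration, not the authors' actual recipe.

```python
from transformers import AutoTokenizer
from auto_fp8 import AutoFP8ForCausalLM, BaseQuantizeConfig  # vllm-project/AutoFP8

model_id = "speakleash/Bielik-4.5B-v3.0-Instruct"
out_dir = "Bielik-4.5B-v3.0-Instruct-FP8"  # hypothetical output path

tokenizer = AutoTokenizer.from_pretrained(model_id)
# Tiny calibration batch used to fix static activation scales (illustrative only).
examples = tokenizer(["Dzień dobry, jak się masz?"], return_tensors="pt").to("cuda")

# Static per-tensor scales for both weights and activations (assumed settings).
quantize_config = BaseQuantizeConfig(quant_method="fp8", activation_scheme="static")

model = AutoFP8ForCausalLM.from_pretrained(model_id, quantize_config)
model.quantize(examples)
model.save_quantized(out_dir)
```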
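Because the card notes that FP8 computation requires Ada Lovelace or Hopper GPUs (compute capability 8.9 or higher), a quick runtime check, assuming PyTorch, can confirm the current device qualifies before loading the FP8 weights.

```python
import torch

# FP8 matmuls need an Ada Lovelace (SM 8.9) or Hopper (SM 9.0) class GPU,
# i.e. CUDA compute capability 8.9 or higher.
major, minor = torch.cuda.get_device_capability()
if (major, minor) >= (8, 9):
    print("FP8 kernels are supported on this GPU.")
else:
    print("No native FP8 support; use the FP16/BF16 checkpoint instead.")
```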