Update README.md
README.md CHANGED
@@ -9,8 +9,6 @@ license: apache-2.0
The **Molmo-7B-GPTQ-4bit** model is a transformer-based model fine-tuned for NLP tasks. It has been quantized to 4-bit precision for efficient deployment. The model was prepared with **bitsandbytes** for 4-bit quantization rather than **AutoGPTQ**, which does not currently provide native support for this model format. The quantization uses `BitsAndBytesConfig` from the `transformers` library, enabling highly optimized GPU inference with reduced memory usage.

-## Model Card
-
<div align="center">
  <img src="https://molmo.allenai.org/opengraph-image.png" alt="Model Architecture" width="80%" />
</div>
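The README above describes loading this checkpoint through `BitsAndBytesConfig` from `transformers`. Below is a minimal loading sketch under stated assumptions: the repository id (`allenai/Molmo-7B-D-0924`) is a placeholder, the NF4 quant type and bfloat16 compute dtype are common choices rather than confirmed settings for this checkpoint, and `trust_remote_code=True` is assumed because Molmo ships custom modeling code.

```python
import torch
from transformers import AutoModelForCausalLM, AutoProcessor, BitsAndBytesConfig

# bitsandbytes 4-bit quantization config; NF4 + bfloat16 compute are assumptions,
# not confirmed settings for this particular checkpoint.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model_id = "allenai/Molmo-7B-D-0924"  # placeholder repo id; substitute the actual repository

# Load the model with on-the-fly 4-bit quantization and automatic device placement.
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",
    trust_remote_code=True,  # assumed: Molmo uses custom modeling code on the Hub
)

# The processor handles tokenization (and image preprocessing for Molmo inputs).
processor = AutoProcessor.from_pretrained(model_id, trust_remote_code=True)
```

Loading this way is the contrast the README draws with AutoGPTQ: bitsandbytes quantizes the weights at load time, so no pre-packed GPTQ checkpoint is required.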