xmadai/gemma-2-9b-it-xMADai-INT4

onebitquantized committed 25 days ago

Commit 593d3ff • 1 Parent(s): dbca506

Update README.md

Files changed (1)

README.md +3 -2

@@ -68,10 +68,11 @@ print(tokenizer.batch_decode(outputs, skip_special_tokens=True))
 If you found this model useful, please cite our research paper.
 ```
 @article{zhang2024leanquant,
-  title={Leanquant: Accurate large language model quantization with loss-error-aware grid},
+  title={LeanQuant: Accurate and Scalable Large Language Model Quantization with Loss-error-aware Grid},
   author={Zhang, Tianyi and Shrivastava, Anshumali},
   journal={arXiv preprint arXiv:2407.10032},
-  year={2024}
+  year={2024},
+  url={https://arxiv.org/abs/2407.10032},
 }
 ```