alokabhishek committed
Commit d59c585
1 Parent(s): b2efe14

Updated Readme

Files changed (1):
  1. README.md +13 -1
README.md CHANGED
@@ -20,6 +20,9 @@ pipeline_tag: text-generation
 
 This repo contains 4-bit quantized (using AutoAWQ) model of Meta's meta-llama/Llama-2-7b-chat-hf
 
+AWQ (Activation-aware Weight Quantization for LLM Compression and Acceleration) was developed by MIT-HAN-Lab
+
+
 ## Model Details
 
 - Model creator: [Meta](https://huggingface.co/meta-llama)
@@ -28,7 +31,16 @@ This repo contains 4-bit quantized (using AutoAWQ) model of Meta's meta-llama/Llama-2-7b-chat-hf
 
 ### About 4 bit quantization using AutoAWQ
 
-AutoAWS github repo: [bitsandbytes github repo](https://github.com/casper-hansen/AutoAWQ/tree/main)
+AutoAWQ github repo: [AutoAWQ github repo](https://github.com/casper-hansen/AutoAWQ/tree/main)
+MIT-han-lab llm-awq github repo: [MIT-han-lab llm-awq github repo](https://github.com/mit-han-lab/llm-awq/tree/main)
+
+@inproceedings{lin2023awq,
+  title={AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration},
+  author={Lin, Ji and Tang, Jiaming and Tang, Haotian and Yang, Shang and Chen, Wei-Ming and Wang, Wei-Chen and Xiao, Guangxuan and Dang, Xingyu and Gan, Chuang and Han, Song},
+  booktitle={MLSys},
+  year={2024}
+}
+
 
 # How to Get Started with the Model
 
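The "About 4 bit quantization" section the commit extends can be illustrated with a minimal sketch of group-wise 4-bit weight quantization, the storage scheme that AWQ-style quantizers build on (AWQ additionally rescales salient channels using activation statistics, which this sketch omits). The function names and group size below are illustrative assumptions, not part of AutoAWQ's API:

```python
import numpy as np

def quantize_4bit_groupwise(w, group_size=128):
    """Asymmetric 4-bit quantization: each group of `group_size`
    consecutive weights shares one float scale and zero-point."""
    w = w.reshape(-1, group_size)
    w_min = w.min(axis=1, keepdims=True)
    w_max = w.max(axis=1, keepdims=True)
    scale = (w_max - w_min) / 15.0            # 4 bits -> 16 levels, codes 0..15
    scale = np.where(scale == 0, 1.0, scale)  # guard against constant groups
    q = np.clip(np.round((w - w_min) / scale), 0, 15).astype(np.uint8)
    return q, scale, w_min

def dequantize(q, scale, w_min):
    # Reconstruct approximate float weights from codes + per-group metadata.
    return q * scale + w_min

rng = np.random.default_rng(0)
w = rng.normal(size=(4, 128)).astype(np.float32)

q, scale, zero = quantize_4bit_groupwise(w.ravel(), group_size=128)
w_hat = dequantize(q, scale, zero).reshape(w.shape)

# Two 4-bit codes are packed into each stored byte, halving memory
# versus uint8 storage (and ~4x smaller than fp16 weights, ignoring
# the per-group scale/zero-point overhead).
flat = q.ravel()
packed = (flat[0::2] << 4) | flat[1::2]
```

The round-trip error is bounded by half a quantization step per group (`scale / 2`), which is why larger groups trade memory overhead for accuracy.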