Qwen/Qwen1.5-32B-Chat-AWQ
Inference is much slower than with 14B-AWQ; is this normal?
#1 by william0014 - opened Apr 11
william0014
Apr 11
For a reply with the same content, 14B-AWQ takes 6 seconds, while 32B-AWQ takes 20 seconds.