Alex Yang
swulling
AI & ML interests
None yet
Organizations
None yet
swulling's activity
Works great, much faster inference. Quantization possible?
2
#1 opened 7 months ago by jharianto
HF TGI deployment
1
#8 opened 8 months ago by austinmw
load_dataset failed
3
#1 opened 7 months ago by swulling
In actual testing, it is less than 10% faster than fp16
3
#1 opened 8 months ago by swulling
How to run on Colab's CPU?
7
#4 opened about 1 year ago by deepakkaura26
Complete prompt template
6
#4 opened about 1 year ago by Nightcall