arxiv:2405.03594
Michael Goin
mgoin
AI & ML interests
LLM inference optimization, compression, quantization, pruning, distillation
Papers
3
Spaces
3
Models
51
mgoin/Llama-2-7b-chat-hf-pruned95 • Text Generation • 3
mgoin/Llama-2-7b-chat-hf-pruned90 • Text Generation • 1
mgoin/Llama-2-7b-chat-hf-pruned85 • Text Generation • 14
mgoin/Llama-2-7b-chat-hf-pruned80 • Text Generation • 20
mgoin/Llama-2-7b-chat-hf-pruned75 • Text Generation • 8
mgoin/Hermes-2-Pro-Llama-3-8B-Marlin • Text Generation • 9 • 1
mgoin/Meta-Llama-3-70B-Instruct-Marlin • Text Generation • 363 • 5
mgoin/Meta-Llama-3-8B-Instruct-Marlin • Text Generation • 36
mgoin/Meta-Llama-3-70B-Instruct-GPTQ • Text Generation • 258 • 1
mgoin/TinyLlama-1.1B-Chat-v1.0-pruned50-quant-ds • Text Generation • 5