Michael Goin

mgoin

AI & ML interests

LLM inference optimization, compression, quantization, pruning, distillation

mgoin's activity

New activity in neuralmagic/Meta-Llama-3-8B-Instruct-FP8-KV about 12 hours ago

How to load this model?

1
#1 opened about 18 hours ago by Frz614

Update README.md

#1 opened 8 days ago by alexmarques

Update README.md

#1 opened 8 days ago by alexmarques
New activity in neuralmagic/Qwen2-72B-Instruct-FP8 21 days ago

Update README.md

#1 opened 21 days ago by abhinavnmagic
New activity in neuralmagic/Mixtral-8x7B-Instruct-v0.1-FP8 22 days ago

Update README.md

#1 opened 22 days ago by abhinavnmagic
New activity in neuralmagic/Meta-Llama-3-8B-Instruct-FP8 22 days ago

Update README.md

#2 opened 22 days ago by abhinavnmagic
New activity in neuralmagic/Meta-Llama-3-70B-Instruct-FP8 22 days ago

Create README.md

#1 opened 22 days ago by abhinavnmagic
New activity in neuralmagic/Meta-Llama-3-8B-Instruct-FP8 25 days ago

Fails to run with nm-vllm

1
#1 opened about 2 months ago by clintonruairi
New activity in mgoin/ultrachat_2k about 1 month ago
New activity in neuralmagic/Mistral-7B-Instruct-v0.3-GPTQ-4bit about 1 month ago

Inference GPU Ram requirement >60GB

1
#1 opened about 1 month ago by Ksgk-fy
New activity in mgoin/Meta-Llama-3-70B-Instruct-Marlin 2 months ago

What is Marlin?

2
#1 opened 2 months ago by Samvanity
New activity in mgoin/Meta-Llama-3-8B-Instruct-Marlin 2 months ago

Inference Issues

7
#1 opened 3 months ago by qeternity
New activity in neuralmagic/Llama-2-7b-evolcodealpaca 4 months ago

Update README.md

#1 opened 4 months ago by abhinavnmagic

Update README.md

#1 opened 4 months ago by abhinavnmagic

Update README.md

#1 opened 4 months ago by abhinavnmagic

Update README.md

#1 opened 4 months ago by alexmarques

Update README.md

#1 opened 4 months ago by alexmarques
New activity in neuralmagic/Llama-2-7b-pruned70-retrained 4 months ago

Update README.md

#1 opened 4 months ago by alexmarques
New activity in neuralmagic/Llama-2-7b-pruned50-retrained 4 months ago

Update README.md

#1 opened 4 months ago by alexmarques
New activity in reciprocate/llama2-7b-gsm8k 9 months ago

Create README.md

#2 opened 9 months ago by mgoin