Daniel Han-Chen (danielhanchen)
AI & ML interests: None yet
danielhanchen's activity
Is this model native 128K context length, or YaRN extended?
4 replies · #28 opened 5 days ago by danielhanchen

LM Studio vs llama.cpp: different results?
6 replies · #5 opened 3 days ago by urtuuuu

Recommended generation parameters
6 replies · #5 opened 6 days ago by erichartford

QwQ-32B-Q5_K_M thinking cyclically
6 replies · #2 opened 5 days ago by yorktown

Is `rms_norm_eps` 1e-5 or 1e-6?
#9 opened 6 days ago by danielhanchen

EOS token should be <|end|>
3 replies · #1 opened 9 days ago by Mungert

Are the Q4 and Q5 models R1 or R1-Zero?
18 replies · #2 opened about 2 months ago by gng2info

Fix position embeddings
3 replies · #1 opened about 2 months ago by PatentPilotAI

I loaded DeepSeek-V3-Q5_K_M on my 10-year-old Tesla M40 (Dell C4130)
3 replies · #8 opened 2 months ago by gng2info

Suggested tokenizer changes by Unsloth.ai
7 replies · #21 opened about 2 months ago by gugarosa

Getting an error with Q3-K-M
7 replies · #2 opened 2 months ago by alain401

Advice on running llama-server with Q2_K_L quant
3 replies · #6 opened 2 months ago by vmajor

llama.cpp cannot load Q6_K model
5 replies · #3 opened 2 months ago by vmajor

Big thanks for these "without original" uploads!
1 reply · #1 opened 3 months ago by jukofyork

Aphrodite/vLLM/SGLang all refuse to load this model
2 replies · #5 opened 6 months ago by fullstack

No module named 'triton'
1 reply · #3 opened 6 months ago by NeelM0906

Update base_model
#1 opened 6 months ago by davanstrien

Can't use the tokenizer with Unsloth FastModel
2 replies · #2 opened 7 months ago by aryarishit