Daniel Han-Chen (danielhanchen)
AI & ML interests: None yet
danielhanchen's activity
Is this model native 128K context length, or YaRN extended?
4 replies · #28 opened 5 days ago by danielhanchen

LM Studio vs llama.cpp: different results?
6 replies · #5 opened 3 days ago by urtuuuu

Recommended generation parameters
6 replies · #5 opened 6 days ago by erichartford

QwQ-32B-Q5_K_M thinking cyclically
6 replies · #2 opened 5 days ago by yorktown

Is `rms_norm_eps` 1e-5 or 1e-6?
#9 opened 6 days ago by danielhanchen

EOS token should be <|end|>
3 replies · #1 opened 9 days ago by Mungert

Are the Q4 and Q5 models R1 or R1-Zero?
18 replies · #2 opened about 2 months ago by gng2info

Fix position embeddings
3 replies · #1 opened about 2 months ago by PatentPilotAI

I loaded DeepSeek-V3-Q5_K_M on my 10-year-old Tesla M40 (Dell C4130)
3 replies · #8 opened 2 months ago by gng2info

Suggested tokenizer changes by Unsloth.ai
7 replies · #21 opened about 2 months ago by gugarosa

Getting an error with Q3-K-M
7 replies · #2 opened 2 months ago by alain401

Advice on running llama-server with Q2_K_L quant
3 replies · #6 opened 2 months ago by vmajor

llama.cpp cannot load Q6_K model
5 replies · #3 opened 2 months ago by vmajor

Big thanks for these "without original" uploads!
1 reply · #1 opened 3 months ago by jukofyork

Aphrodite/vLLM/SGLang all refuse to load this model
2 replies · #5 opened 6 months ago by fullstack

No module named 'triton'
1 reply · #3 opened 6 months ago by NeelM0906

Update base_model
#1 opened 6 months ago by davanstrien

Can't use the tokenizer with Unsloth FastModel
2 replies · #2 opened 7 months ago by aryarishit