deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B

Fine Tuning

#28 opened about 21 hours ago by

Error: Sliding Window Attention is enabled but not implemented for `sdpa`; unexpected results may be encountered

#27 opened 8 days ago by

Distillation dataset released?

#26 opened 13 days ago by

bos_token_id mismatch between model config and tokenizer

#25 opened 13 days ago by

JAX Implementation!

#24 opened 18 days ago by

Step by step guide for Distillation

#23 opened 20 days ago by

max_position_embeddings and tokenizer max discrepancies

#22 opened 22 days ago by

R1 not putting out the full model response with transformers pipeline

#21 opened 24 days ago by

Using it on mobile

#20 opened 25 days ago by

Update Qwen/Qwen2.5-1.5B-Instruct to deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B, I got an error

#19 opened 26 days ago by

model.safetensors file's tensors are all odd aligned causing performance issues when mmap'ing the file in place

#18 opened 29 days ago by

TypeError: forward() missing 1 required positional argument: 'attention_masks'

#16 opened about 1 month ago by

Using the Model

#14 opened about 1 month ago by

Failed to load the model

#13 opened about 1 month ago by

Upload IMG_1608.jpeg

#12 opened about 1 month ago by

how to fine tune?

#10 opened about 1 month ago by

How to turn off the R1 mode when running it with the Hugging Face API?

#9 opened about 1 month ago by

Add pipeline tag, link to paper

#7 opened about 1 month ago by

comfyui-deepseek-r1

#6 opened about 1 month ago by

is `config.json` correct?

#4 opened about 1 month ago by

System Prompt

#3 opened about 1 month ago by

YAML Metadata Warning: empty or missing yaml metadata in repo card

#2 opened about 1 month ago by