deepseek-ai/DeepSeek-R1

Knowledge / Training Cutoff of DeepSeek R1

#212 opened 6 days ago by

MengboZhou

Update README.md

#211 opened 9 days ago by

Rainbowbeast

Make config params float to avoid warning in Transformers

#210 opened 11 days ago by

Rocketknight1

Update README.md

#209 opened 12 days ago by

Brokersponsor

Update README.md

#207 opened 21 days ago by

mehdi131

Update README.md

#206 opened 23 days ago by

YUIHG

Where does the o1-1217 data in DeepSeek come from? I couldn't seem to find it through OpenAI's official channels, thanks 🙏

3

#205 opened 25 days ago by

747860199qq

Any R1 reasoning researchers looking for samples?

#204 opened 25 days ago by

natcolley

Update README.md

#203 opened 26 days ago by

umar759

Request: DOI

#202 opened 26 days ago by

Yenugu12

Create 9889555

#201 opened 27 days ago by

keyi8

Upload 657f0f06e7ea1b09462a7a16_Feedback and evaluation-p-500.png

#200 opened 29 days ago by

likhonsheikh

Best practice for R1 models evaluation: Reasoning efficiency and Performance by MATH-Level

#198 opened about 1 month ago by

wangxingjun778

DeepSeek R1 full-power version occasionally ends without returning </think>.

#196 opened about 1 month ago by

yizhiezi

DeepSeek full-power version occasionally ends without returning </think>

1

#195 opened about 1 month ago by

yizhiezi

Standing at a flag in Netherlands

#194 opened about 1 month ago by

Sweetstacg

Delete Config.json

#193 opened about 1 month ago by

jana0010

Update README.md

#192 opened about 1 month ago by

caraanchoa

Add <think>\n> tag to assistant responses to ensure consistency

#191 opened about 1 month ago by

REN0430

fix for transformers 4.49 compatibility

#189 opened about 1 month ago by

katuni4ka

MLLM discussion group

#188 opened about 1 month ago by

YLHX

Question about experts select

#186 opened about 1 month ago by

waynebian

Hardware Requirements to run the original model - 671B params

4

#185 opened about 1 month ago by

EdilCamil

Holding paper in hand

1

#184 opened about 1 month ago by

Loveyl

Update config.json

#182 opened about 1 month ago by

Empolean2640

Regression in Reasoning Tag Output - Missing <think> in Model Responses

2

#181 opened about 1 month ago by

divinerapier

Delete model.safetensors.index.json

#180 opened about 1 month ago by

Huggingfaceliaj

Unknown quantization type, got fp8

#179 opened about 1 month ago by

DenisFavaCerchiaro

How to disable/skip the <think></think> process.

3

#178 opened about 1 month ago by

yech520

Request: DOI

#177 opened about 2 months ago by

Tamwyn

Request: DOI

#176 opened about 2 months ago by

saathwik

Request: DOI

#175 opened about 2 months ago by

Paulabad

Draft model as accelerator for DeepSeek-R1?

4

#174 opened about 2 months ago by

inputout

Deploying production ready service with Unsloth GGUF quants on your AWS account. (4 x L40S)

8

#171 opened about 2 months ago by

samagra-tensorfuse

Could you take a look at the "r1-1776" model released by Perplexity?

4

#170 opened about 2 months ago by

yanyihan

Just crossed 10,000 likes!

1

#169 opened about 2 months ago by

clem

Unable to download flash_attn on Mac

#168 opened about 2 months ago by

earlyIsLate

Can this model be used for commercial use?

2

#167 opened about 2 months ago by

henrycwf

90+ tokens per second for MI300x8 using batch_size = 1

1

#166 opened about 2 months ago by

ghostplant

RytryR1

#165 opened about 2 months ago by

Rocka01

"aha moment" comment deleted by Perplexity (recovered)

3

#164 opened about 2 months ago by

FalconNet

Garbled output

1

#163 opened about 2 months ago by

cell22

'num_hidden_layers': 61, but layer 62 has weights.

#162 opened about 2 months ago by

xinhe

Upload GTG Breaking every Limit

#161 opened about 2 months ago by

GTGenesis

support prefix complete

3

#158 opened about 2 months ago by

HuggineAllen

Create app.py

#157 opened about 2 months ago by

SpaceAgeRobotics

Create 1

#156 opened about 2 months ago by

madevii

Brokersponsor

#155 opened about 2 months ago by

Brokersponsor

Update README.md

#154 opened about 2 months ago by

egegvner

Upload IMG_4530.png

#152 opened about 2 months ago by

Noemie202586
