Qiyu Zhong

allenzhong1214 · ZhongQiyu

AI & ML interests

Natural Language Processing, Computer Vision, Graph

allenzhong1214's activity

upvoted an article about 1 month ago

Article

Introduction to State Space Models (SSM)

Jul 19, 2024 • 111

reacted to nicolay-r's post with 👍 about 1 month ago

Post

1456

📢 For those who wish to launch distilled DeepSeek R1 for reasoning with schema, sharing the Google Colab notebook:
📙 https://github.com/nicolay-r/nlp-thirdgate/blob/master/tutorials/llm_deep_seek_7b_distill_colab.ipynb
This is a wrapper of the Qwen2 transformers 🤗 provider via bulk-chain framework.
Model: deepseek-ai/DeepSeek-R1-Distill-Qwen-7B
GPU: T4 (15GB) is nearly enough in float32 mode.
🚀 To boost the performance you may set bf16 mode (use_bf16=True)
🌟 Powered by bulk-chain: https://github.com/nicolay-r/bulk-chain

liked a model 3 months ago

TsinghuaAI/CPM-Generate

Text Generation • Updated Jul 29, 2021 • 455 • 42

liked a dataset 3 months ago

fka/awesome-chatgpt-prompts

Viewer • Updated Jan 6 • 203 • 12k • 7.6k

New activity in internlm/internlm-chat-20b-4bit 6 months ago

although tokenizer.model exists, deploy is complaining that it does not

#8 opened over 1 year ago

liked a model 6 months ago

internlm/internlm2-chat-1_8b

Text Generation • Updated Aug 20, 2024 • 6.26k • 31

liked a model 10 months ago

tae898/emoberta-large

Text Classification • Updated Mar 16, 2022 • 410 • 7

liked a model 11 months ago

moka-ai/m3e-base

Updated Jul 14, 2023 • 164k • 924

liked a dataset 11 months ago

m-a-p/COIG-CQIA

Viewer • Updated Apr 18, 2024 • 44.7k • 5.49k • 611

liked a model 11 months ago

google/gemma-2b-it

Text Generation • Updated Sep 27, 2024 • 121k • 708

liked 2 models 12 months ago

sonoisa/t5-base-japanese

Text2Text Generation • Updated Dec 12, 2024 • 7.16k • 49

Qwen/Qwen-72B

Text Generation • Updated Oct 9, 2024 • 4.7k • 353