emredeveloper/DeepSeek-R1-Distill-Qwen-1.5B-4bit

4-bit precision

emredeveloper committed 11 days ago

Commit 2390d7d · verified · 1 Parent(s): 0e5a729

Update README.md

Files changed (1)

README.md +6 -13

@@ -6,6 +6,9 @@ base_model:
 - deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B
 tags:
 - cot
+- r1
+- deepseek
+- text
 ---
 # Model Card for DeepSeek-R1-Distill-Qwen-1.5B-4bit
@@ -17,21 +20,11 @@ This is a 4-bit quantized version of the `deepseek-ai/DeepSeek-R1-Distill-Qwen-1
 ### Model Description
-- **Developed by:** [Your Name or Organization]
-- **Funded by [optional]:** [Your Funding Source, if applicable]
-- **Shared by:** [Your Name or Organization]
 - **Model type:** Transformer-based Language Model
 - **Language(s) (NLP):** English
 - **License:** MIT
 - **Finetuned from model:** `deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B`
-### Model Sources [optional]
-- **Repository:** [Link to your GitHub repository, if applicable]
-- **Paper [optional]:** [Link to the paper, if applicable]
-- **Demo [optional]:** [Link to a live demo, if applicable]
-## Uses
 ### Direct Use
@@ -41,7 +34,7 @@ This model is intended for research and practical applications where memory effi
 - Language understanding tasks
 - Chatbots and conversational AI
-### Downstream Use [optional]
+### Downstream Use
 This model can be fine-tuned for specific tasks such as:
@@ -81,9 +74,9 @@ quantization_config = BitsAndBytesConfig(
 )
 # Load the model and tokenizer
-tokenizer = AutoTokenizer.from_pretrained("your-username/DeepSeek-R1-Distill-Qwen-1.5B-4bit", trust_remote_code=True)
+tokenizer = AutoTokenizer.from_pretrained("emredeveloper/DeepSeek-R1-Distill-Qwen-1.5B-4bit", trust_remote_code=True)
 model = AutoModelForCausalLM.from_pretrained(
-    "your-username/DeepSeek-R1-Distill-Qwen-1.5B-4bit",
+    "emredeveloper/DeepSeek-R1-Distill-Qwen-1.5B-4bit",
     quantization_config=quantization_config,
     device_map="auto",
     trust_remote_code=True
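
The README describes this model as a 4-bit quantized version aimed at memory-constrained use. As a rough illustration of what that could mean for weight storage, here is a back-of-the-envelope sketch. It assumes a nominal 1.5 billion parameters (read off the "1.5B" in the model name, not a measured count) and ignores quantization constants, any layers kept in higher precision, activations, and the KV cache. The helper name `weight_memory_gib` is hypothetical and not part of any library.

```python
def weight_memory_gib(num_params: float, bits_per_param: float) -> float:
    """Approximate weight storage in GiB for a parameter count and bit width."""
    total_bytes = num_params * bits_per_param / 8  # 8 bits per byte
    return total_bytes / (1024 ** 3)               # bytes -> GiB


# Assumption: nominal parameter count taken from the "1.5B" in the model name.
NOMINAL_PARAMS = 1.5e9

fp16_gib = weight_memory_gib(NOMINAL_PARAMS, 16)  # half-precision baseline
int4_gib = weight_memory_gib(NOMINAL_PARAMS, 4)   # 4-bit quantized weights

print(f"fp16 weights:  ~{fp16_gib:.2f} GiB")
print(f"4-bit weights: ~{int4_gib:.2f} GiB")
print(f"reduction:     ~{fp16_gib / int4_gib:.0f}x")
```

Under these assumptions the weights shrink by the ratio of the bit widths. Real memory use after loading will typically be higher, since the estimate leaves out everything except the raw weights.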