---
license: mit
datasets:
- CreitinGameplays/Magpie-Reasoning-V2-250K-CoT-Deepseek-R1-Llama-70Bmistral
language:
- en
base_model:
- mistralai/Mistral-Nemo-Instruct-2407
pipeline_tag: text-generation
library_name: transformers
---

## Mistral Nemo 12B R1

![mistralthink](https://autumn.revolt.chat/attachments/zIqa-Q6gKlwm7BbOvKvFFRLHDdy5OOy30KcU5iFle1/image.png)

Fine-tuning took **96 hours** on **2x NVIDIA RTX A6000** GPUs with the following settings:

- Batch size: 3
- Gradient accumulation steps: 1
- Epochs: 1
- Learning rate: 1e-4
- Warmup ratio: 0.1

Run the model:

```python
import torch
from transformers import pipeline

model_id = "CreitinGameplays/Mistral-Nemo-12B-R1-v0.1"

# Load the text-generation pipeline in bfloat16, sharding across available GPUs.
pipe = pipeline(
    "text-generation",
    model=model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto"
)

messages = [
    {"role": "system", "content": "You are a helpful AI assistant."},
    {"role": "user", "content": "How many r's are in strawberry?"}
]

outputs = pipe(
    messages,
    temperature=0.2,
    repetition_penalty=1.1,
    max_new_tokens=2048
)

# The last entry of generated_text is the assistant's reply.
print(outputs[0]["generated_text"][-1])
```

System prompt:

```
You are an AI focused on providing systematic, well-reasoned responses.

Response Structure:
- Format: {reasoning}{answer}
- Process: Think first, then answer. Always use your reasoning capabilities.
```
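
The pipeline example above uses a generic system prompt. If you want to send the reasoning-oriented system prompt instead, or need finer control over generation, the model can also be loaded directly with `AutoModelForCausalLM` and the tokenizer's chat template. This is a minimal sketch, not part of the original card; the generation settings simply mirror the pipeline example.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "CreitinGameplays/Mistral-Nemo-12B-R1-v0.1"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# The reasoning-oriented system prompt quoted above.
system_prompt = (
    "You are an AI focused on providing systematic, well-reasoned responses.\n\n"
    "Response Structure:\n"
    "- Format: {reasoning}{answer}\n"
    "- Process: Think first, then answer. Always use your reasoning capabilities."
)

messages = [
    {"role": "system", "content": system_prompt},
    {"role": "user", "content": "How many r's are in strawberry?"},
]

# Build the prompt with the model's chat template and generate.
input_ids = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt",
).to(model.device)

output_ids = model.generate(
    input_ids,
    do_sample=True,
    temperature=0.2,
    repetition_penalty=1.1,
    max_new_tokens=2048,
)

# Decode only the newly generated tokens (everything after the prompt).
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```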
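
For reference, the fine-tuning settings listed at the top of this card map onto Hugging Face `TrainingArguments` roughly as shown below. This is an illustrative sketch only: the actual training script and framework are not published here, and details such as sequence length, optimizer, and whether "batch size 3" is per device or global are assumptions.

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="mistral-nemo-12b-r1",   # hypothetical output path
    per_device_train_batch_size=3,      # "Batch size: 3" (assumed per device)
    gradient_accumulation_steps=1,
    num_train_epochs=1,
    learning_rate=1e-4,
    warmup_ratio=0.1,
    bf16=True,                          # matches the bfloat16 inference dtype
)
```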