---
license: mit
datasets:
- CreitinGameplays/Raiden-DeepSeek-R1-llama3.1
language:
- en
base_model:
- meta-llama/Llama-3.1-8B-Instruct
pipeline_tag: text-generation
library_name: transformers
---
## Llama 3.1 8B R1 v0.1
![Llama](https://autumn.revolt.chat/attachments/Dpj0Up0lYE2-BVOQRTDXeLk5xa7EE0WxBntXqgJGAo/DALL%C2%B7E%202025-02-19%2010.03.42%20-%20A%20futuristic%20robotic%20white%20llama%20with%20sleek%20metallic%20plating%20and%20glowing%20blue%20eyes.%20The%20llama%20has%20intricate%20mechanical%20joints%20and%20a%20high-tech%20design.%20.png)

Fine-tuning took **28 hours** on **2x Nvidia RTX A6000** with the following settings:
- Batch size: 8
- Gradient accumulation steps: 1
- Epochs: 2
- Learning rate: 1e-4
- Warmup ratio: 0.1
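
To make the settings above concrete, here is a rough sketch of how they translate into optimizer steps. It assumes the batch size of 8 is per device (the card does not say), and `num_examples` is a hypothetical dataset size chosen only for illustration; `schedule_summary` is not part of any library.

```python
import math


def schedule_summary(
    num_examples,             # hypothetical: the real dataset size is not stated here
    per_device_batch_size=8,  # assumption: "Batch size: 8" is per GPU
    grad_accum_steps=1,
    num_devices=2,            # 2x RTX A6000
    epochs=2,
    warmup_ratio=0.1,
):
    """Estimate effective batch size, total steps, and warmup steps."""
    effective_batch = per_device_batch_size * grad_accum_steps * num_devices
    steps_per_epoch = math.ceil(num_examples / effective_batch)
    total_steps = steps_per_epoch * epochs
    # Warmup covers the first `warmup_ratio` fraction of all optimizer steps.
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    return {
        "effective_batch": effective_batch,
        "steps_per_epoch": steps_per_epoch,
        "total_steps": total_steps,
        "warmup_steps": warmup_steps,
    }


print(schedule_summary(num_examples=1000))
```

If the batch size of 8 is instead the global batch size, pass `num_devices=1` to get the corresponding numbers.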

Run the model:

```python
import torch
from transformers import pipeline

model_id = "CreitinGameplays/Llama-3.1-8B-R1-v0.1"

# Load the model in bfloat16 and let device_map="auto" place it on the available devices.
pipe = pipeline(
    "text-generation",
    model=model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

messages = [
    {"role": "system", "content": "You are an AI assistant named Llama, made by Meta AI."},
    {"role": "user", "content": "How many r's are in strawberry?"},
]

outputs = pipe(
    messages,
    temperature=0.5,
    repetition_penalty=1.1,
    max_new_tokens=2048,
)

# With chat-style input, "generated_text" holds the whole conversation;
# the last entry is the model's reply.
print(outputs[0]["generated_text"][-1])
```
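
The final `print` shows the whole reply message as a dict with `role` and `content` keys. If only the text is wanted, a small helper along these lines may be handy. This is a sketch: `last_assistant_text` is a hypothetical name, and `example_outputs` is a hand-written stand-in that mimics the shape of a chat-style pipeline result rather than real model output.

```python
def last_assistant_text(outputs):
    """Return the text of the final message in a chat-style pipeline result."""
    last_message = outputs[0]["generated_text"][-1]
    return last_message["content"]


# Hypothetical stand-in with the same structure as `outputs` from the pipeline call.
example_outputs = [
    {
        "generated_text": [
            {"role": "user", "content": "How many r's are in strawberry?"},
            {"role": "assistant", "content": "There are 3 r's in strawberry."},
        ]
    }
]

print(last_assistant_text(example_outputs))
```

With the real pipeline, `last_assistant_text(outputs)` would be called on the value returned by `pipe(...)`.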