---
base_model: Haleshot/Mathmate-7B-DELLA-ORPO
tags:
- finetuned
- orpo
- everyday-conversations
datasets:
- HuggingFaceTB/everyday-conversations-llama3.1-2k
license: apache-2.0
language:
- en
library_name: transformers
pipeline_tag: text-generation
---

# Mathmate-7B-DELLA-ORPO-D

Mathmate-7B-DELLA-ORPO-D is a finetuned version of [Haleshot/Mathmate-7B-DELLA-ORPO](https://huggingface.co/Haleshot/Mathmate-7B-DELLA-ORPO) using the ORPO method, combined with a LoRA adapter trained on everyday conversations.
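
For reference, combining a LoRA adapter with a base model can be done with PEFT. The sketch below is illustrative only: the adapter repo id is a hypothetical placeholder, not a published artifact, and the actual merge procedure used for this model may differ.

```python
# Illustrative sketch: merging a LoRA adapter into the base model with PEFT.
# The adapter id below is a hypothetical placeholder, not a published repo.
import torch
from transformers import AutoModelForCausalLM
from peft import PeftModel

base = AutoModelForCausalLM.from_pretrained(
    "Haleshot/Mathmate-7B-DELLA-ORPO",
    torch_dtype=torch.float16,
    device_map="auto",
)
model = PeftModel.from_pretrained(base, "your-username/everyday-conversations-lora")
model = model.merge_and_unload()  # fold the adapter weights into the base model
```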

## Model Details

- **Base Model:** [Haleshot/Mathmate-7B-DELLA-ORPO](https://huggingface.co/Haleshot/Mathmate-7B-DELLA-ORPO)
- **Training Dataset:** [HuggingFaceTB/everyday-conversations-llama3.1-2k](https://huggingface.co/datasets/HuggingFaceTB/everyday-conversations-llama3.1-2k)

## Dataset

The model was further trained on the [HuggingFaceTB/everyday-conversations-llama3.1-2k](https://huggingface.co/datasets/HuggingFaceTB/everyday-conversations-llama3.1-2k) dataset, which focuses on everyday conversations and small talk.
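
To inspect the data yourself, it can be loaded with the `datasets` library; a minimal sketch that avoids assuming particular split names:

```python
# Quick look at the dataset used for this finetune.
from datasets import load_dataset

ds = load_dataset("HuggingFaceTB/everyday-conversations-llama3.1-2k")
print(ds)  # available splits and row counts
first_split = next(iter(ds.values()))  # don't assume a particular split name
print(first_split[0])  # one chat-formatted example
```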

## Usage

Here's an example of how to use the model:

```python
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

model_name = "Haleshot/Mathmate-7B-DELLA-ORPO-D"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype=torch.float16, device_map="auto")

def generate_response(prompt, max_new_tokens=512):
    # Tokenize the prompt and move it to the same device as the model.
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    # Sample one continuation; max_new_tokens bounds the reply length
    # independently of the prompt length.
    outputs = model.generate(**inputs, max_new_tokens=max_new_tokens, num_return_sequences=1, do_sample=True, temperature=0.7)
    return tokenizer.decode(outputs[0], skip_special_tokens=True)

prompt = "Let's have a casual conversation about weekend plans."
response = generate_response(prompt)
print(response)
```
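
Since the everyday-conversations data is chat-formatted, results may improve if the prompt is wrapped with the tokenizer's chat template when one is present. A hedged sketch, reusing `tokenizer` and `generate_response` from above:

```python
# Optional: if the tokenizer ships a chat template, format the prompt as a
# chat turn before generating; fall back to the raw prompt otherwise.
messages = [{"role": "user", "content": "Let's have a casual conversation about weekend plans."}]
if tokenizer.chat_template is not None:
    text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
else:
    text = messages[0]["content"]
print(generate_response(text))
```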

## Acknowledgements

Thanks to the HuggingFaceTB team for providing the everyday conversations dataset used in this finetuning process.