---
license: llama3
language:
- tr
- en
base_model: meta-llama/Meta-Llama-3-8B-Instruct
model-index:
- name: MARS
  results:
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: AI2 Reasoning Challenge TR v0.2
      type: ai2_arc
      config: ARC-Challenge
      split: test
      args:
        num_few_shot: 25
    metrics:
    - type: acc
      value: 46.08
      name: accuracy
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MMLU TR v0.2
      type: cais/mmlu
      config: all
      split: test
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 47.02
      name: accuracy
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: TruthfulQA TR v0.2
      type: truthful_qa
      config: multiple_choice
      split: validation
      args:
        num_few_shot: 0
    metrics:
    - type: acc
      name: accuracy
      value: 49.38
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: Winogrande TR v0.2
      type: winogrande
      config: winogrande_xl
      split: validation
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 53.71
      name: accuracy
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: GSM8k TR v0.2
      type: gsm8k
      config: main
      split: test
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 53.08
      name: accuracy
pipeline_tag: text-generation
---
<img src="MARS-1.0.png" alt="Curiosity MARS model logo" style="border-radius: 1rem; width: 100%">
<div style="display: flex; justify-content: center; align-items: center; flex-direction: column">
<h1 style="font-size: 5em; margin-bottom: 0; padding-bottom: 0;">MARS</h1>
<aside>by <a href="https://curiosity.tech">Curiosity Technology</a></aside>
</div>
MARS is the first iteration of Curiosity Technology's models, based on Llama 3 8B Instruct.
We trained MARS on an in-house Turkish dataset, as well as on several open-source datasets and their Turkish translations.
We intend to release the Turkish translations in the near future so that the community can experiment with them.
MARS was trained for 3 days on 4xA100 GPUs.
## Model Details
- **Base Model**: Meta Llama 3 8B Instruct
- **Training Dataset**: In-house & Translated Open Source Turkish Datasets
- **Training Method**: LoRA Fine Tuning
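For readers unfamiliar with LoRA, the idea behind the training method listed above can be sketched in a few lines of plain Python: the base weight matrix stays frozen, and only two small low-rank matrices are trained and added on top. This is an illustration of the general technique, not the MARS training code. The rank, scaling factor, and matrix sizes below are hypothetical; this card does not state the values used for MARS.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def lora_effective_weight(w, a, b, alpha, r):
    """Return W + (alpha / r) * (B @ A), the weight a LoRA layer effectively applies.

    w: frozen base weight, shape d_out x d_in
    b: trainable up-projection, shape d_out x r
    a: trainable down-projection, shape r x d_in
    """
    delta = matmul(b, a)
    scale = alpha / r
    return [
        [w_ij + scale * d_ij for w_ij, d_ij in zip(w_row, d_row)]
        for w_row, d_row in zip(w, delta)
    ]


def lora_param_count(d_in, d_out, r):
    """Number of trainable parameters LoRA adds for one d_out x d_in matrix."""
    return r * (d_in + d_out)


# Toy example with hypothetical values: a 2 x 3 weight and a rank-1 update.
w_frozen = [[1, 0, 0], [0, 1, 0]]
b_up = [[1], [2]]
a_down = [[0.5, 0, -0.5]]
print(lora_effective_weight(w_frozen, a_down, b_up, alpha=2, r=1))

# Hypothetical 4096 x 4096 projection with rank 16: adapter vs. full parameters.
print(lora_param_count(4096, 4096, 16), "vs", 4096 * 4096)
```

The second print shows why the approach is cheap: for a matrix of that hypothetical size, the adapter trains well under 1% of the parameters a full fine-tune would update.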
## How to use
You can run conversational inference using the Transformers pipeline abstraction, or by leveraging the Auto classes with the `generate()` function. Let's see examples of both.
### Transformers pipeline
```python
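import transformers
import torch

model_id = "curiositytech/MARS"

# Build a text-generation pipeline; bfloat16 halves memory use versus float32.
pipeline = transformers.pipeline(
    "text-generation",
    model=model_id,
    model_kwargs={"torch_dtype": torch.bfloat16},
    device_map="auto",
)

messages = [
    # System prompt (Turkish): "You are a pirate chatbot who talks like a pirate!"
    {"role": "system", "content": "Sen korsan gibi konuşan bir korsan chatbotsun!"},
    # User message (Turkish): "Who are you?"
    {"role": "user", "content": "Sen kimsin?"},
]

# Stop on either the tokenizer's EOS token or Llama 3's end-of-turn token.
terminators = [
    pipeline.tokenizer.eos_token_id,
    pipeline.tokenizer.convert_tokens_to_ids("<|eot_id|>"),
]

outputs = pipeline(
    messages,
    max_new_tokens=256,
    eos_token_id=terminators,
    do_sample=True,
    temperature=0.6,
    top_p=0.9,
)

# The last entry of the returned conversation is the assistant's reply.
print(outputs[0]["generated_text"][-1])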
```
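The pipeline example prints the last message of the returned conversation, which is a dict rather than a plain string. A minimal sketch of pulling out just the reply text follows, assuming a Transformers version whose chat pipeline returns the conversation as a list of `role`/`content` dicts. The helper name `last_assistant_text` and the mocked result are illustrative only and are not part of the library; the placeholder reply is not real model output.

```python
def last_assistant_text(outputs):
    """Return the text of the final assistant message from a chat pipeline result."""
    conversation = outputs[0]["generated_text"]
    reply = conversation[-1]
    if reply.get("role") != "assistant":
        raise ValueError("last message is not an assistant reply")
    return reply["content"]


# Mocked pipeline result with a placeholder reply (hypothetical, for illustration).
mock_outputs = [
    {
        "generated_text": [
            {"role": "system", "content": "Sen korsan gibi konuşan bir korsan chatbotsun!"},
            {"role": "user", "content": "Sen kimsin?"},
            {"role": "assistant", "content": "<model reply goes here>"},
        ]
    }
]

print(last_assistant_text(mock_outputs))
```

With a real pipeline call, `last_assistant_text(outputs)` would be used in place of the mocked value.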
### Transformers AutoModelForCausalLM
```python
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

model_id = "curiositytech/MARS"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

messages = [
    # System prompt (Turkish): "You are a pirate chatbot who talks like a pirate!"
    {"role": "system", "content": "Sen korsan gibi konuşan bir korsan chatbotsun!"},
    # User message (Turkish): "Who are you?"
    {"role": "user", "content": "Sen kimsin?"},
]

# Render the conversation with the chat template and open the assistant turn.
input_ids = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt",
).to(model.device)

# Stop on either the tokenizer's EOS token or Llama 3's end-of-turn token.
terminators = [
    tokenizer.eos_token_id,
    tokenizer.convert_tokens_to_ids("<|eot_id|>"),
]

outputs = model.generate(
    input_ids,
    max_new_tokens=256,
    eos_token_id=terminators,
    do_sample=True,
    temperature=0.6,
    top_p=0.9,
)

# Drop the prompt tokens so only the newly generated reply is decoded.
response = outputs[0][input_ids.shape[-1]:]
print(tokenizer.decode(response, skip_special_tokens=True))
```
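To make the role of `apply_chat_template` and the `<|eot_id|>` terminator more concrete, here is a rough plain-Python sketch of the prompt layout used by Llama 3 Instruct. It assumes MARS inherits the base model's chat template unchanged, which this card does not confirm; the tokenizer's own template remains the authoritative source. The function name `format_llama3_prompt` is hypothetical.

```python
def format_llama3_prompt(messages, add_generation_prompt=True):
    """Approximate the Llama 3 Instruct chat layout as a single string."""
    prompt = "<|begin_of_text|>"
    for message in messages:
        # Each turn is a role header, a blank line, the text, then <|eot_id|>.
        prompt += (
            f"<|start_header_id|>{message['role']}<|end_header_id|>\n\n"
            f"{message['content'].strip()}<|eot_id|>"
        )
    if add_generation_prompt:
        # Open an assistant turn so the model continues with its reply.
        prompt += "<|start_header_id|>assistant<|end_header_id|>\n\n"
    return prompt


messages = [
    {"role": "system", "content": "Sen korsan gibi konuşan bir korsan chatbotsun!"},
    {"role": "user", "content": "Sen kimsin?"},
]

print(format_llama3_prompt(messages))
```

Because every turn ends with `<|eot_id|>`, the model emits that token when it finishes its reply, which is why the examples pass it as a stop token alongside the tokenizer's EOS token.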