---
license: llama3
language:
- tr
- en
base_model: meta-llama/Meta-Llama-3-8B-Instruct
model-index:
- name: MARS
  results:
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: AI2 Reasoning Challenge TR v0.2
      type: ai2_arc
      config: ARC-Challenge
      split: test
      args:
        num_few_shot: 25
    metrics:
    - type: acc
      value: 46.08
      name: accuracy
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MMLU TR v0.2
      type: cais/mmlu
      config: all
      split: test
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 47.02
      name: accuracy
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: TruthfulQA TR v0.2
      type: truthful_qa
      config: multiple_choice
      split: validation
      args:
        num_few_shot: 0
    metrics:
    - type: acc
      name: accuracy
      value: 49.38
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: Winogrande TR v0.2
      type: winogrande
      config: winogrande_xl
      split: validation
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 53.71
      name: accuracy
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: GSM8k TR v0.2
      type: gsm8k
      config: main
      split: test
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 53.08
      name: accuracy
pipeline_tag: text-generation
---


<img src="MARS-1.0.png" alt="Curiosity MARS model logo" style="border-radius: 1rem; width: 100%">


<div style="display: flex; justify-content: center; align-items: center; flex-direction: column">
    <h1 style="font-size: 5em; margin-bottom: 0; padding-bottom: 0;">MARS</h1>
    <aside>by <a href="https://curiosity.tech">Curiosity Technology</a></aside>
</div>

MARS is the first iteration of Curiosity Technology models, based on Llama 3 8B.

We trained MARS on an in-house Turkish dataset, as well as several open-source datasets and their Turkish
translations.
We intend to release the Turkish translations in the near future so the community can experiment with them.

MARS was trained for 3 days on 4xA100 GPUs.

## Model Details

- **Base Model**: Meta Llama 3 8B Instruct
- **Training Dataset**: In-house & Translated Open Source Turkish Datasets
- **Training Method**: LoRA fine-tuning
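
For readers unfamiliar with LoRA, the idea can be sketched in a few lines of plain Python: the base weight matrix `W` stays frozen and only a low-rank update `B @ A`, scaled by `alpha / r`, is trained. This is a toy illustration only. The dimensions, rank, and `alpha` below are hypothetical values chosen for readability, not the settings used to train MARS, and `lora_effective_weight` is an illustrative helper, not part of any library.

```python
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def lora_effective_weight(w, a, b, alpha):
    """Return W + (alpha / r) * (B @ A), the weight a LoRA-adapted layer applies."""
    rank = len(a)  # A has shape (r, d_in), so its row count is the rank
    scale = alpha / rank
    delta = matmul(b, a)  # (d_out, r) @ (r, d_in) -> (d_out, d_in)
    return [
        [w_ij + scale * d_ij for w_ij, d_ij in zip(w_row, d_row)]
        for w_row, d_row in zip(w, delta)
    ]


# Hypothetical toy sizes; real transformer layers are far larger.
d_out, d_in, rank, alpha = 8, 8, 2, 4
rng = random.Random(0)

w = [[rng.gauss(0, 1) for _ in range(d_in)] for _ in range(d_out)]    # frozen base weight
a = [[rng.gauss(0, 0.01) for _ in range(d_in)] for _ in range(rank)]  # trainable, small random init
b = [[0.0] * rank for _ in range(d_out)]                              # trainable, zero init

w_eff = lora_effective_weight(w, a, b, alpha)

trainable_params = rank * (d_in + d_out)  # parameters in A and B
full_params = d_in * d_out                # parameters in W
print(f"trainable: {trainable_params}, full matrix: {full_params}")
```

Because `B` starts at zero, the adapted layer initially behaves exactly like the base model, and training only has to learn the small `A` and `B` matrices.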


## How to use

You can run conversational inference using the Transformers pipeline abstraction, or by leveraging the Auto classes with the `generate()` function. Let's see examples of both.

### Transformers pipeline

```python
import transformers
import torch

model_id = "curiositytech/MARS"

pipeline = transformers.pipeline(
    "text-generation",
    model=model_id,
    model_kwargs={"torch_dtype": torch.bfloat16},
    device_map="auto",
)

messages = [
    {"role": "system", "content": "Sen korsan gibi konuşan bir korsan chatbotsun!"},
    {"role": "user", "content": "Sen kimsin?"},
]

# Stop generation at either the tokenizer's EOS token or Llama 3's end-of-turn token.
terminators = [
    pipeline.tokenizer.eos_token_id,
    pipeline.tokenizer.convert_tokens_to_ids("<|eot_id|>")
]

outputs = pipeline(
    messages,
    max_new_tokens=256,
    eos_token_id=terminators,
    do_sample=True,
    temperature=0.6,
    top_p=0.9,
)
# The last entry of the returned conversation is the assistant's reply (a role/content dict).
print(outputs[0]["generated_text"][-1])
```

### Transformers AutoModelForCausalLM

```python
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

model_id = "curiositytech/MARS"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

messages = [
    {"role": "system", "content": "Sen korsan gibi konuşan bir korsan chatbotsun!"},
    {"role": "user", "content": "Sen kimsin?"},
]

input_ids = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt"
).to(model.device)

# Stop generation at either the tokenizer's EOS token or Llama 3's end-of-turn token.
terminators = [
    tokenizer.eos_token_id,
    tokenizer.convert_tokens_to_ids("<|eot_id|>")
]

outputs = model.generate(
    input_ids,
    max_new_tokens=256,
    eos_token_id=terminators,
    do_sample=True,
    temperature=0.6,
    top_p=0.9,
)
# Slice off the prompt tokens so only the newly generated reply is decoded.
response = outputs[0][input_ids.shape[-1]:]
print(tokenizer.decode(response, skip_special_tokens=True))
```
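
To see why `<|eot_id|>` appears in the terminators, it can help to look at the prompt string that a Llama 3 style chat template builds. The sketch below reproduces the Llama 3 Instruct format by hand, assuming MARS inherits that template unchanged from its base model. `format_llama3_prompt` is a hypothetical helper written for illustration; in practice `tokenizer.apply_chat_template` is authoritative and should be used instead.

```python
def format_llama3_prompt(messages, add_generation_prompt=True):
    """Build a Llama 3 Instruct style prompt string from a list of chat messages."""
    prompt = "<|begin_of_text|>"
    for message in messages:
        # Each turn is a role header, a blank line, the content, then an end-of-turn token.
        prompt += (
            f"<|start_header_id|>{message['role']}<|end_header_id|>\n\n"
            f"{message['content'].strip()}<|eot_id|>"
        )
    if add_generation_prompt:
        # An open assistant header cues the model to write the next turn.
        prompt += "<|start_header_id|>assistant<|end_header_id|>\n\n"
    return prompt


messages = [
    {"role": "system", "content": "Sen korsan gibi konuşan bir korsan chatbotsun!"},
    {"role": "user", "content": "Sen kimsin?"},
]

prompt = format_llama3_prompt(messages)
print(prompt)
```

Every turn ends with `<|eot_id|>`, so the model emits that token when it finishes its own reply. Listing it in `eos_token_id` alongside the tokenizer's EOS token stops generation at the end of the assistant turn.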