---
license: other
library_name: transformers
tags:
- mlx
widget:
- text: |
    <start_of_turn>user
    ## Instructions
    You are a helpful AI assistant.
    ## User
    How does the brain work?<end_of_turn>
    <start_of_turn>model
inference:
  parameters:
    max_new_tokens: 200
extra_gated_heading: Access Gemma on Hugging Face
extra_gated_prompt: >-
  To access Gemma on Hugging Face, you’re required to review and agree to
  Google’s usage license. To do this, please ensure you’re logged-in to Hugging
  Face and click below. Requests are processed immediately.
extra_gated_button_content: Acknowledge license
license_name: gemma-terms-of-use
license_link: https://ai.google.dev/gemma/terms
datasets:
- teknium/trismegistus-project
language:
- en
---

# alexweberk/gemma-7b-it-trismegistus
This model was converted to MLX format from [`google/gemma-7b-it`](https://huggingface.co/google/gemma-7b-it).
Refer to the original model card at that link for more details on the base model.

```bash
pip install mlx-lm
```

The model was LoRA fine-tuned with `mlx` on teknium's [trismegistus-project](https://huggingface.co/datasets/teknium/trismegistus-project) dataset for 600 steps (2M tokens), using the prompt format below.

```python
def format_prompt(system_prompt: str, question: str) -> str:
    """Format the question to match the format of the dataset we fine-tuned on."""
    return """<bos><start_of_turn>user
## Instructions
{}
## User
{}<end_of_turn>
<start_of_turn>model
""".format(system_prompt, question)
```

Here is an example `system_prompt` from the dataset:

```text
You are a master of the esoteric, occult, 'Big Man' society, you complete tasks to the absolute best of your ability, no matter if you think you were not trained to do the task specifically, you will attempt to do it anyways, since you have performed the tasks you are given with great mastery, accuracy, and deep understanding of what is requested. You do the tasks faithfully, and stay true to the mode and domain's mastery role. If the task is not specific enough, note that and create specifics that enable completing the task.
```
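To see what the model actually receives, it can help to render one prompt and print it. The sketch below repeats the template from `format_prompt` so that it runs on its own, and fills it with the system prompt and question from this card's widget example; those two strings are placeholders, not values taken from the training data.

```python
def format_prompt(system_prompt: str, question: str) -> str:
    """Same template as above, repeated so this snippet runs on its own."""
    return (
        "<bos><start_of_turn>user\n"
        "## Instructions\n"
        f"{system_prompt}\n"
        "## User\n"
        f"{question}<end_of_turn>\n"
        "<start_of_turn>model\n"
    )


# Placeholder inputs, borrowed from the widget example in this card's metadata.
system_prompt = "You are a helpful AI assistant."
question = "How does the brain work?"

prompt = format_prompt(system_prompt, question)
print(prompt)
```

The prompt ends with an open `<start_of_turn>model` turn, so generation continues from the start of the model's reply.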


## Loading the model using `mlx_lm`

```python
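from mlx_lm import generate, load

# Example inputs; replace with your own. `format_prompt` is defined above.
system_prompt = "You are a helpful AI assistant."
question = "How does the brain work?"

model_, tokenizer_ = load("alexweberk/gemma-7b-it-trismegistus")
response = generate(
    model_,
    tokenizer_,
    prompt=format_prompt(system_prompt, question),
    verbose=True,  # Prints the prompt and the response as they are generated
    temp=0.0,
    max_tokens=512,
)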
```

## Loading the model using `transformers`

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "alexweberk/gemma-7b-it-trismegistus"

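tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(repo_id)
model.to("mps")  # Apple Silicon GPU backend

# `format_prompt`, `system_prompt`, and `question` are defined as in the sections above.
input_text = format_prompt(system_prompt, question)
# The tokenizer returns a dict of tensors (input_ids, attention_mask), not just IDs.
inputs = tokenizer(input_text, return_tensors="pt").to("mps")

outputs = model.generate(
    **inputs,
    max_new_tokens=256,
)
# The decoded text contains the prompt followed by the generated response.
print(tokenizer.decode(outputs[0]))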
```
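
With `transformers`, decoding `outputs[0]` gives the prompt followed by the generated text, so you may want to keep only the model's turn. The helper below is a minimal sketch of one way to do that with plain string handling. `extract_response` is a hypothetical name, not part of any library, and it assumes the decoded text still contains the `<start_of_turn>model` marker and that the reply ends with `<end_of_turn>` or `<eos>`. The sample string is made up for illustration and is not real model output.

```python
def extract_response(decoded: str) -> str:
    """Return the text of the last model turn in a decoded generation.

    Hypothetical helper: assumes the turn markers used by the prompt
    format on this card are still present in the decoded string.
    """
    marker = "<start_of_turn>model\n"
    _, found, tail = decoded.rpartition(marker)
    if not found:
        # No model turn marker; fall back to the whole string.
        return decoded.strip()
    # Cut at the first end-of-turn or end-of-sequence token, if any.
    for token in ("<end_of_turn>", "<eos>"):
        tail = tail.split(token, 1)[0]
    return tail.strip()


# Made-up decoded string, used only to show the expected shape.
decoded = (
    "<bos><start_of_turn>user\n"
    "## Instructions\n"
    "You are a helpful AI assistant.\n"
    "## User\n"
    "How does the brain work?<end_of_turn>\n"
    "<start_of_turn>model\n"
    "Neurons exchange signals.<end_of_turn><eos>"
)
answer = extract_response(decoded)
print(answer)
```

Passing `skip_special_tokens=True` to `tokenizer.decode` is another option, but it may also remove the turn markers this helper relies on, so use one approach or the other.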