---
library_name: peft
license: mit
language:
- en
- fr
datasets:
- kaitchup/opus-French-to-English
tags:
- translation
---
# Model Card for Llama-2-7b-mt-French-to-English

This is a LoRA adapter for Meta's Llama 2 7B, fine-tuned to translate French text into English.

## Model Details

### Model Description

- **Developed by:** [The Kaitchup](https://kaitchup.substack.com/)
- **Model type:** LoRA Adapter for Llama 2 7B
- **Language(s) (NLP):** French, English
- **License:** MIT license



## Uses

This adapter must be loaded on top of Llama 2 7B. It was fine-tuned with QLoRA, so for best results the base model should be loaded with the same quantization configuration that was used during fine-tuning.
You can use the following code to load the model:
```
from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig
import torch
from peft import PeftModel

base_model = "meta-llama/Llama-2-7b-hf"
compute_dtype = torch.float16

# 4-bit NF4 quantization with double quantization, to match the fine-tuning setup
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=compute_dtype,
    bnb_4bit_use_double_quant=True,
)

# Load the quantized base model on GPU 0
model = AutoModelForCausalLM.from_pretrained(
    base_model, device_map={"": 0}, quantization_config=bnb_config
)
tokenizer = AutoTokenizer.from_pretrained(base_model, use_fast=True)

# Attach the LoRA adapter to the base model
model = PeftModel.from_pretrained(model, "kaitchup/Llama-2-7b-mt-French-to-English")
```

Then, run the model as follows:

```
my_text = ""  # put the French text to translate here

# Prompt format: the source text followed by the " ###>" separator
prompt = my_text + " ###>"

tokenized_input = tokenizer(prompt, return_tensors="pt")
input_ids = tokenized_input["input_ids"].cuda()

# Beam search with 10 beams, generating at most 130 new tokens
generation_output = model.generate(
    input_ids=input_ids,
    num_beams=10,
    return_dict_in_generate=True,
    output_scores=True,
    max_new_tokens=130,
)
for seq in generation_output.sequences:
    output = tokenizer.decode(seq, skip_special_tokens=True)
    # Keep only the text generated after the separator
    print(output.split("###>")[1].strip())
```
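
The prompt format used above (source text, a space, then `###>`) can be wrapped in two small helpers so that building the prompt and extracting the translation stay in sync. This is only a sketch: `SEPARATOR`, `build_prompt`, and `extract_translation` are hypothetical names that are not part of this adapter or of any library, and stripping surrounding whitespace from the input is an assumption rather than something the fine-tuning setup is known to require.

```python
SEPARATOR = "###>"  # separator taken from the prompt format shown above


def build_prompt(french_text: str) -> str:
    """Return the source text followed by a space and the separator."""
    # Assumption: surrounding whitespace in the input is not meaningful.
    return french_text.strip() + " " + SEPARATOR


def extract_translation(decoded: str) -> str:
    """Return the text after the first separator, or "" if it is absent."""
    _, found, tail = decoded.partition(SEPARATOR)
    return tail.strip() if found else ""
```

With these helpers, `prompt = build_prompt(my_text)` would replace the string concatenation, and `extract_translation(output)` would replace the `split("###>")[1]` indexing. Unlike the indexing, it does not raise an `IndexError` if the decoded text happens to lack the separator.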


## Model Card Contact

[The Kaitchup](https://kaitchup.substack.com/)