---
datasets:
- databricks/databricks-dolly-15k
- nomic-ai/gpt4all-j-prompt-generations
- open_assist_plus_safety
- oasst
- etc.
language:
- en
---

- The model is fine-tuned from pygmalion-6B with LoRA and 8-bit quantization on a single RTX 2080 Ti
- LoRA parameters: r=8, alpha=16, dropout=0.05, bias=None (a configuration sketch follows the example below)
- All prompts are formatted as conversation pairs between [User] and [Yaya]. For example:

[User]: Compose a long Gulf War story set in Istanbul, where a content chair encounters Elizabeth Barrett Browning, inspired by the works of Henrik Ibsen.

[Yaya]: As an old and content chair sat in the corner of a bustling coffee house in Istanbul, he witnessed the city's frenzied energy. It was the early 1990s, and the Gulf War had just begun.

There was great political strife in the air, and tensions were high. People rushed about, frantically discussing the war and its potential impact on their lives. [...]
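
For reference, the hyperparameters listed above correspond to a PEFT `LoraConfig` along the lines of the sketch below. This is a minimal reconstruction, not the released training script: the `target_modules` choice (GPT-J's attention projections) and the task type are assumptions, and the card's `bias=None` maps to the string setting `bias='none'` in PEFT.

```
from peft import LoraConfig, TaskType

# Minimal sketch of the fine-tuning configuration described above;
# target_modules is an assumption (typical GPT-J attention projections).
lora_config = LoraConfig(
    task_type=TaskType.CAUSAL_LM,
    r=8,               # LoRA rank
    lora_alpha=16,     # scaling factor
    lora_dropout=0.05,
    bias='none',       # the card's "bias=None"
    target_modules=['q_proj', 'k_proj', 'v_proj'],  # assumed
)
```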

- Load the LoRA weights with a PEFT model:
```
import torch
from transformers import GPTJForCausalLM, AutoTokenizer, GenerationConfig
from peft import PeftModel

pretrain_name = 'PygmalionAI/pygmalion-6b'  # the original pretrained base model
lora_weights = 'kietbs/pygmalion_6B_yaya'   # please download the weights and change this path accordingly
load_in_8bit = True

model = GPTJForCausalLM.from_pretrained(pretrain_name, load_in_8bit=load_in_8bit,
                                        device_map='auto', torch_dtype=torch.float16)
model = PeftModel.from_pretrained(model, lora_weights,
                                  torch_dtype=torch.float16, device_map={'': 0})
model = torch.compile(model)  # optional speed-up, requires PyTorch 2.x
tokenizer = AutoTokenizer.from_pretrained(pretrain_name)  # tokenizer from the original base model

gen_config = GenerationConfig(
    temperature=0.1,
    top_p=0.75,
    top_k=40,
    num_beams=4,
)

text = "[User]: What's the best food in Hanoi?"
input_ids = tokenizer(text, return_tensors='pt')['input_ids'].to('cuda')
with torch.no_grad():
    output = model.generate(input_ids=input_ids, generation_config=gen_config,
                            return_dict_in_generate=True, output_scores=True,
                            max_new_tokens=256)
    s = output.sequences[0]
    output = tokenizer.decode(s)
    print('Raw:', output)
```
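
Note that with `num_beams=4` and `do_sample` left at its default of `False`, `generate` runs beam search, so `temperature`, `top_p`, and `top_k` have no effect; add `do_sample=True` to the `GenerationConfig` if sampling behaviour is intended.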

Output:
```
[User]: What's the best food in Hanoi?
[Yaya]: The best food in Hanoi can vary depending on what you're looking for. Some of the most popular dishes include pho, banh mi, banh xeo, and bún chả.<|endoftext|>
```
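
Since the decoded sequence echoes the prompt and ends with GPT-J's `<|endoftext|>` marker, a small post-processing step can isolate the assistant's reply. This is a minimal sketch assuming the single-turn format above (exactly one `[Yaya]:` tag); the helper name is illustrative, not part of the released code.

```
def extract_reply(raw: str) -> str:
    """Strip the echoed prompt and end-of-text marker, keeping only [Yaya]'s reply."""
    reply = raw.split('[Yaya]:', 1)[-1]         # drop the echoed [User] prompt
    reply = reply.replace('<|endoftext|>', '')  # remove GPT-J's EOS marker
    return reply.strip()

# e.g. extract_reply(output) -> "The best food in Hanoi can vary depending on ..."
```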