---
datasets:
- Alpaca69B/semeval2016-full-absa-reviews-english-translated-resampled
language:
- en
pipeline_tag: text-generation
tags:
- absa
- qlora
---
# llama-2-7b-absa-semeval-2016

## Model Details

- **Model Name:** Alpaca69B/llama-2-7b-absa-semeval-2016
- **Base Model:** NousResearch/Llama-2-7b-chat-hf
- **Fine-Tuned On:** Alpaca69B/semeval2016-full-absa-reviews-english-translated-resampled
- **Fine-Tuning Techniques:** QLoRA (LoRA adapters on a 4-bit quantized base model) with gradient checkpointing
- **Training Resources:** low-resource setup; the 4-bit base model and LoRA adapters keep GPU memory requirements modest

## Model Description

This model performs aspect-based sentiment analysis (ABSA). It was fine-tuned from the Llama-2-7b-chat model on an adjusted SemEval-2016 dataset.

## Fine-Tuning Techniques

### LoRA Attention
- LoRA attention dimension: 64
- Alpha parameter for LoRA scaling: 16
- Dropout probability for LoRA layers: 0.1
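
The values above map onto a `peft` `LoraConfig` roughly as sketched below. This is a reconstruction for reference only; `bias` and `target_modules` are not documented in this card, so they are left at typical defaults and flagged as assumptions.

```python
from peft import LoraConfig

# Sketch of the LoRA setup described above (not the verbatim training script).
peft_config = LoraConfig(
    r=64,              # LoRA attention dimension
    lora_alpha=16,     # alpha parameter for LoRA scaling
    lora_dropout=0.1,  # dropout probability for LoRA layers
    bias="none",       # assumption: common default, not stated in this card
    task_type="CAUSAL_LM",
    # target_modules is not documented here; peft's Llama defaults
    # (attention projections) are assumed.
)
```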

### bitsandbytes (4-bit precision)
- Activated 4-bit precision base model loading
- Compute dtype for 4-bit base models: "float16"
- Quantization type: "nf4"
- Nested quantization for 4-bit base models: Disabled
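
The corresponding `transformers` `BitsAndBytesConfig` would look roughly like this sketch:

```python
import torch
from transformers import BitsAndBytesConfig

# Sketch of the 4-bit quantization settings listed above.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                     # 4-bit precision base model loading
    bnb_4bit_compute_dtype=torch.float16,  # compute dtype "float16"
    bnb_4bit_quant_type="nf4",             # quantization type "nf4"
    bnb_4bit_use_double_quant=False,       # nested quantization disabled
)
```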

### TrainingArguments
- Output directory: "./results"
- Number of training epochs: 2
- Enabled fp16/bf16 training: False
- Batch size per GPU for training: 4
- Batch size per GPU for evaluation: 4
- Gradient accumulation steps: 1
- Enabled gradient checkpointing: True
- Maximum gradient norm (gradient clipping): 0.3
- Initial learning rate: 2e-4
- Weight decay: 0.001
- Optimizer: paged_adamw_32bit
- Learning rate scheduler: cosine
- Maximum training steps: -1 (not set; num_train_epochs determines the training length)
- Ratio of steps for linear warmup: 0.03
- Group sequences into batches with the same length: True
- Save checkpoint every X update steps: 0 (disabled)
- Log every X update steps: 100
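
Collected into a `transformers` `TrainingArguments` object, the settings above correspond to roughly the following sketch:

```python
from transformers import TrainingArguments

# Sketch of the training arguments listed above.
training_args = TrainingArguments(
    output_dir="./results",
    num_train_epochs=2,
    fp16=False,
    bf16=False,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    gradient_accumulation_steps=1,
    gradient_checkpointing=True,
    max_grad_norm=0.3,
    learning_rate=2e-4,
    weight_decay=0.001,
    optim="paged_adamw_32bit",
    lr_scheduler_type="cosine",
    max_steps=-1,          # not set; num_train_epochs controls training length
    warmup_ratio=0.03,
    group_by_length=True,
    save_steps=0,          # checkpoint saving disabled
    logging_steps=100,
)
```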

### SFT (Supervised Fine-Tuning)
- Maximum sequence length: Not specified
- Packing multiple short examples in the same input sequence: False
- Load the entire model on GPU 0
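
Put together, the fine-tuning step would look roughly like the sketch below, reusing the `bnb_config`, `peft_config`, and `training_args` objects from the sketches above. It targets the older `trl` `SFTTrainer` constructor (recent `trl` releases have changed these arguments), and the dataset split and text field name are assumptions not documented in this card.

```python
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import SFTTrainer

base_model = "NousResearch/Llama-2-7b-chat-hf"

# Assumption: a "train" split with a "text" column holding the formatted prompts.
dataset = load_dataset(
    "Alpaca69B/semeval2016-full-absa-reviews-english-translated-resampled",
    split="train",
)

model = AutoModelForCausalLM.from_pretrained(
    base_model,
    quantization_config=bnb_config,  # 4-bit config from the sketch above
    device_map={"": 0},              # load the entire model on GPU 0
)
tokenizer = AutoTokenizer.from_pretrained(base_model)

trainer = SFTTrainer(
    model=model,
    train_dataset=dataset,
    peft_config=peft_config,      # LoRA config from the sketch above
    dataset_text_field="text",    # assumed column name
    max_seq_length=None,          # not specified
    tokenizer=tokenizer,
    args=training_args,           # TrainingArguments from the sketch above
    packing=False,                # no packing of short examples
)
trainer.train()
```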

## Evaluation

The model's performance and example usage are demonstrated in the accompanying [Google Colab notebook](https://colab.research.google.com/drive/1ArLpQFfXJiHcAT3VuYZndDqvskM6SkOM?usp=sharing).

## Model Usage

To use the model, run the following code snippet:

```python
from transformers import AutoTokenizer
import transformers
import torch

model = "Alpaca69B/llama-2-7b-absa-semeval-2016"
tokenizer = AutoTokenizer.from_pretrained(model)
pipeline = transformers.pipeline(
    "text-generation",
    model=model,
    torch_dtype=torch.float16,
    device_map="auto",
)

def process_user_prompt(input_sentence):
    sequences = pipeline(
        f'### Human: {input_sentence} ### Assistant: aspect: ',
        do_sample=True,
        top_k=10,
        num_return_sequences=1,
        eos_token_id=tokenizer.eos_token_id,
        max_length=200,
    )
    result_dict = process_output(sequences[0]['generated_text'])
    return result_dict

def process_output(output):
    result_dict = {}

    # Extract user_prompt
    user_prompt_start = output.find("### Human:")
    user_prompt_end = output.find("aspect: ") + len("aspect: ")
    result_dict['user_prompt'] = output[user_prompt_start:user_prompt_end].strip()

    # Extract cleared_generated_output
    cleared_output_end = output.find(")")
    result_dict['cleared_generated_output'] = output[:cleared_output_end+1].strip()

    # Extract review
    human_start = output.find("Human:") + len("Human:")
    assistant_start = output.find("### Assistant:")
    result_dict['review'] = output[human_start:assistant_start].strip()

    # Extract aspect and sentiment
    aspect_start = output.find("aspect: ") + len("aspect: ")
    sentiment_start = output.find("sentiment: ")
    aspect_text = output[aspect_start:sentiment_start].strip()
    result_dict['aspect'] = aspect_text

    sentiment_end = output[sentiment_start:].find(")") + sentiment_start
    sentiment_text = output[sentiment_start+len("sentiment:"):sentiment_end].strip()
    result_dict['sentiment'] = sentiment_text

    return result_dict


output = process_user_prompt('the first thing that attracts attention is the warm reception and the smiling receptionists.')
print(output)

```

## Fine-Tuning Details

Details of the fine-tuning process are available in the [fine-tuning Colab notebook](https://colab.research.google.com/drive/1PQfBsDyM8TSSBchL6PPA4o6rOyLFLUnu?usp=sharing).


**Note:** Ensure that you have the necessary dependencies and resources before running the model.