---
language:
- en
library_name: transformers
pipeline_tag: text-generation
tags:
- code
license: apache-2.0
---
`Orkhan/llama-2-7b-absa` is a fine-tuned version of the Llama-2-7b model, optimized for Aspect-Based Sentiment Analysis (ABSA) using a manually labeled dataset of 2,000 sentences.
This fine-tuning equips the model to identify aspects and analyze their sentiment accurately, making it a valuable asset for nuanced sentiment analysis in diverse applications.
Its advantage over traditional ABSA models is that you do not need to train it on domain-specific labeled data: the llama-2-7b-absa model generalizes well across domains. The trade-off is that it needs more computing power.


![image/png](https://cdn-uploads.huggingface.co/production/uploads/62b58935593a2c49da6b0f5a/G8wDb1I2cWDQf1uo5qfGE.png)

When running inference, please note that the model has been trained on sentences, not on paragraphs.
It fits in a free T4-GPU-enabled Google Colab notebook:
https://colab.research.google.com/drive/1OvfnrufTAwSv3OnVxR-j7o10OKCSM1X5?usp=sharing
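
Because the model was trained on single sentences, longer inputs should be split into sentences before inference. A minimal sketch of that preprocessing step (the regex-based splitter below is an illustrative assumption, not part of the original notebook):

```python
import re

def split_into_sentences(text):
    # Naive splitter: break after '.', '!' or '?' followed by whitespace.
    # For production use, a tool such as nltk.sent_tokenize is more robust.
    parts = re.split(r'(?<=[.!?])\s+', text.strip())
    return [p for p in parts if p]

paragraph = ("The food was great. The service, however, was painfully slow! "
             "Would I come back? Probably not.")
sentences = split_into_sentences(paragraph)
# Each sentence can then be passed to the model individually.
```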
---

# What does it do?

You prompt it with a sentence and get back the aspects, opinions, sentiments, and phrases (opinion + aspect) found in that sentence.
```python
prompt = "Such a nice weather, birds are flying, but there's a bad smell coming from somewhere."
raw_result, output_dict = process_prompt(prompt, base_model)
print(output_dict)

>>> {'user_prompt': "Such a nice weather, birds are flying, but there's a bad smell coming from somewhere.",
    'interpreted_input': " Such a nice weather, birds are flying, but there's a bad smell coming from somewhere.",
    'aspects': ['weather', 'birds', 'smell'],
    'opinions': ['nice', 'flying', 'bad'],
    'sentiments': ['Positive', 'Positive', 'Negative'],
    'phrases': ['nice weather', 'flying birds', 'bad smell']}
```

# Installing and usage:

install:

```bash
!pip install -q accelerate==0.21.0 peft==0.4.0 bitsandbytes==0.40.2 transformers==4.31.0 trl==0.4.7
```

import:
```python
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    BitsAndBytesConfig,
    HfArgumentParser,
    TrainingArguments,
    pipeline,
    logging,
)
from peft import LoraConfig, PeftModel
import torch
```

Load the model and merge it with the LoRA weights:
```python
model_name = "Orkhan/llama-2-7b-absa"
# load model in FP16 and merge it with LoRA weights
base_model = AutoModelForCausalLM.from_pretrained(
    model_name,
    low_cpu_mem_usage=True,
    return_dict=True,
    torch_dtype=torch.float16,
    device_map={"": 0},
)
base_model.config.use_cache = False
base_model.config.pretraining_tp = 1
```
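
If FP16 does not fit on your GPU, a 4-bit quantized load is a possible alternative, using the `BitsAndBytesConfig` imported above (a sketch; the parameter values here are common defaults, not settings from the original notebook):

```python
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                       # quantize weights to 4-bit on load
    bnb_4bit_quant_type="nf4",               # NormalFloat4 quantization
    bnb_4bit_compute_dtype=torch.float16,    # compute in FP16
)

base_model = AutoModelForCausalLM.from_pretrained(
    model_name,
    quantization_config=bnb_config,
    device_map={"": 0},
)
```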

tokenizer:
```python
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
tokenizer.pad_token = tokenizer.eos_token
tokenizer.padding_side = "right"
```

For processing the input and output, it is recommended to use these ABSA-related helper functions:
```python
def process_output(result, user_prompt):
    generated = result[0]['generated_text']
    interpreted_input = generated.split('### Assistant:')[0].split('### Human:')[1]
    new_output = generated.split('### Assistant:')[1].split(')')[0].strip()

    # The reply has the form:
    # "## Aspect detected: ... ## Opinion detected: ... ## Sentiment detected: ..."
    aspects = new_output.split('Aspect detected:')[1].split('##')[0]
    opinions = new_output.split('Opinion detected:')[1].split('## Sentiment detected:')[0]
    sentiments = new_output.split('## Sentiment detected:')[1]

    # Split on commas and drop empty entries, so single-item outputs are kept too
    aspect_list = [aspect.strip() for aspect in aspects.split(',') if aspect.strip()]
    opinion_list = [opinion.strip() for opinion in opinions.split(',') if opinion.strip()]
    sentiments_list = [sentiment.strip() for sentiment in sentiments.split(',') if sentiment.strip()]
    phrases = [opinion + ' ' + aspect for opinion, aspect in zip(opinion_list, aspect_list)]

    output_dict = {
        'user_prompt': user_prompt,
        'interpreted_input': interpreted_input,
        'aspects': aspect_list,
        'opinions': opinion_list,
        'sentiments': sentiments_list,
        'phrases': phrases,
    }

    return output_dict


def process_prompt(user_prompt, model):
    edited_prompt = "### Human: " + user_prompt + '.###'
    # Cap generation length relative to the input, otherwise the model keeps
    # generating unrelated follow-up examples
    pipe = pipeline(
        task="text-generation",
        model=model,
        tokenizer=tokenizer,
        max_length=len(tokenizer.encode(user_prompt)) * 4,
    )
    result = pipe(edited_prompt)

    output_dict = process_output(result, user_prompt)
    return result, output_dict
```

inference:
```python
prompt = "Such a nice weather, birds are flying, but there's a bad smell coming from somewhere."
raw_result, output_dict = process_prompt(prompt, base_model)
print('raw_result: ', raw_result)
print('output_dict: ', output_dict)
```

Output:
```
raw_result:
  [{'generated_text': "### Human: Such a nice weather, birds are flying, but there's a bad smell coming from somewhere.### Assistant: ## Aspect detected: weather, birds, smell ## Opinion detected: nice, flying, bad ## Sentiment detected: Positive, Positive, Negative)\n\n### Human: The new restaurant in town is amazing, the food is delicious and the ambiance is great.### Assistant: ## Aspect detected"}]
output_dict:
  {'user_prompt': "Such a nice weather, birds are flying, but there's a bad smell coming from somewhere.",
  'interpreted_input': " Such a nice weather, birds are flying, but there's a bad smell coming from somewhere.",
  'aspects': ['weather', 'birds', 'smell'],
  'opinions': ['nice', 'flying', 'bad'],
  'sentiments': ['Positive', 'Positive', 'Negative'],
  'phrases': ['nice weather', 'flying birds', 'bad smell']}
```
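
The parsing step can be checked without a GPU by applying the same string handling as `process_output` directly to a `generated_text` like the one above (a self-contained sketch; the string below is copied from the raw output shown):

```python
generated = ("### Human: Such a nice weather, birds are flying, but there's a bad smell "
             "coming from somewhere.### Assistant: ## Aspect detected: weather, birds, smell "
             "## Opinion detected: nice, flying, bad "
             "## Sentiment detected: Positive, Positive, Negative)")

# Keep only the assistant's reply, up to the closing parenthesis
reply = generated.split('### Assistant:')[1].split(')')[0].strip()

# Pull out the three comma-separated sections
aspects = [a.strip() for a in reply.split('Aspect detected:')[1].split('##')[0].split(',')]
opinions = [o.strip() for o in reply.split('Opinion detected:')[1].split('## Sentiment detected:')[0].split(',')]
sentiments = [s.strip() for s in reply.split('## Sentiment detected:')[1].split(',')]
phrases = [o + ' ' + a for o, a in zip(opinions, aspects)]
```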


# Use the full code in this Colab notebook:
- https://colab.research.google.com/drive/1OvfnrufTAwSv3OnVxR-j7o10OKCSM1X5?usp=sharing