---
language:
- en
library_name: transformers
pipeline_tag: text-generation
tags:
- code
license: apache-2.0
---
`Orkhan/llama-2-7b-absa` is a fine-tuned version of the Llama-2-7b model, optimized for Aspect-Based Sentiment Analysis (ABSA) using a manually labelled dataset of 2000 sentences.
The fine-tuning enables the model to identify aspects and analyze the sentiment expressed toward each of them, which makes it useful for fine-grained sentiment analysis across different applications.
Its advantage over traditional ABSA models is that you do not need to train a model on domain-specific labelled data, because llama-2-7b-absa generalizes very well. However, it may need more computing power.
![image/png](https://cdn-uploads.huggingface.co/production/uploads/62b58935593a2c49da6b0f5a/G8wDb1I2cWDQf1uo5qfGE.png)
When running inference, please note that the model was trained on sentences, not on paragraphs.
It fits in a free Google Colab notebook with a T4 GPU:
https://colab.research.google.com/drive/1OvfnrufTAwSv3OnVxR-j7o10OKCSM1X5?usp=sharing
---
What does it do?
You prompt the model with a sentence, and it returns the aspects, opinions, sentiments and phrases (opinion + aspect) found in that sentence.
```
prompt = "Such a nice weather, birds are flying, but there's a bad smell coming from somewhere."
raw_result, output_dict = process_prompt(prompt, base_model)
print(output_dict)
>>>{'user_prompt': "Such a nice weather, birds are flying, but there's a bad smell coming from somewhere.",
'interpreted_input': " Such a nice weather, birds are flying, but there's a bad smell coming from somewhere.",
'aspects': ['weather', 'birds', 'smell'],
'opinions': ['nice', 'flying', 'bad'],
'sentiments': ['Positive', 'Positive', 'Negative'],
'phrases': ['nice weather', 'flying birds', 'bad smell']}
```
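In the example above, the `aspects`, `opinions` and `sentiments` lists line up by position. A minimal sketch of turning them into per-aspect triples follows. The helper name `to_triples` is hypothetical, and the sketch assumes the three lists always have the same length (`zip` silently truncates to the shortest one otherwise).

```python
def to_triples(output_dict):
    """Pair each aspect with its opinion and sentiment by position."""
    return list(
        zip(
            output_dict["aspects"],
            output_dict["opinions"],
            output_dict["sentiments"],
        )
    )


# Values copied from the example output above.
example = {
    "aspects": ["weather", "birds", "smell"],
    "opinions": ["nice", "flying", "bad"],
    "sentiments": ["Positive", "Positive", "Negative"],
}

for aspect, opinion, sentiment in to_triples(example):
    print(f"{aspect}: {opinion} ({sentiment})")
```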
# Installation and usage:
install:
```
!pip install -q accelerate==0.21.0 peft==0.4.0 bitsandbytes==0.40.2 transformers==4.31.0 trl==0.4.7
```
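Because the install command pins exact versions, it can help to confirm what is actually installed before loading the model. The following is only a sketch using the standard-library `importlib.metadata`. The helper name `find_mismatches` and the `PINNED` dictionary are hypothetical, and the version numbers are copied from the install command above.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions copied from the pip command above.
PINNED = {
    "accelerate": "0.21.0",
    "peft": "0.4.0",
    "bitsandbytes": "0.40.2",
    "transformers": "4.31.0",
    "trl": "0.4.7",
}


def find_mismatches(pinned, get_version=version):
    """Return {package: installed_version_or_None} for every package that differs."""
    mismatches = {}
    for name, wanted in pinned.items():
        try:
            installed = get_version(name)
        except PackageNotFoundError:
            installed = None  # package is not installed at all
        if installed != wanted:
            mismatches[name] = installed
    return mismatches


# An empty dict means every pinned package is installed at the expected version.
print(find_mismatches(PINNED))
```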
import:
```
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    BitsAndBytesConfig,
    HfArgumentParser,
    TrainingArguments,
    pipeline,
    logging,
)
from peft import LoraConfig, PeftModel
import torch
```
Load the model and merge it with the LoRA weights:
```
model_name = "Orkhan/llama-2-7b-absa"
# load model in FP16 and merge it with LoRA weights
base_model = AutoModelForCausalLM.from_pretrained(
    model_name,
    low_cpu_mem_usage=True,
    return_dict=True,
    torch_dtype=torch.float16,
    device_map={"": 0},
)
base_model.config.use_cache = False
base_model.config.pretraining_tp = 1
```
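A rough back-of-the-envelope estimate shows why FP16 loading fits on a 16 GB T4. The sketch below counts only the weights and ignores activations and any generation cache. The parameter count of about 6.74 billion is an approximation for Llama-2-7b, and the helper name `weight_memory_gib` is hypothetical.

```python
def weight_memory_gib(n_params, bytes_per_param):
    """Memory needed for the weights alone, in GiB."""
    return n_params * bytes_per_param / 2**30


# Assumption: approximate parameter count of Llama-2-7b.
N_PARAMS = 6.74e9

# FP16 stores each parameter in 2 bytes.
fp16_gib = weight_memory_gib(N_PARAMS, 2)
print(f"FP16 weights: ~{fp16_gib:.1f} GiB")
```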
tokenizer:
```
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
tokenizer.pad_token = tokenizer.eos_token
tokenizer.padding_side = "right"
```
For processing input and output, it is recommended to use these ABSA related functions:
```
def process_output(result, user_prompt):
    """Parse the generated text into aspects, opinions, sentiments and phrases."""
    generated_text = result[0]['generated_text']
    # The model echoes the prompt between '### Human:' and '### Assistant:'.
    interpreted_input = generated_text.split('### Assistant:')[0].split('### Human:')[1]
    # Keep only the first answer, which ends at the first ')'.
    new_output = generated_text.split('### Assistant:')[1].split(')')[0].strip()
    aspects = new_output.split('Aspect detected:')[1].split('##')[0]
    opinions = new_output.split('Opinion detected:')[1].split('## Sentiment detected:')[0]
    sentiments = new_output.split('## Sentiment detected:')[1]
    # Split on commas; a single item without a comma is kept, empty items are dropped.
    aspect_list = [aspect.strip() for aspect in aspects.split(',') if aspect.strip()]
    opinion_list = [opinion.strip() for opinion in opinions.split(',') if opinion.strip()]
    sentiments_list = [sentiment.strip() for sentiment in sentiments.split(',') if sentiment.strip()]
    phrases = [opinion + ' ' + aspect for opinion, aspect in zip(opinion_list, aspect_list)]
    output_dict = {
        'user_prompt': user_prompt,
        'interpreted_input': interpreted_input,
        'aspects': aspect_list,
        'opinions': opinion_list,
        'sentiments': sentiments_list,
        'phrases': phrases
    }
    return output_dict


def process_prompt(user_prompt, model):
    """Wrap the sentence in the prompt template, generate, and parse the result."""
    edited_prompt = "### Human: " + user_prompt + '.###'
    # Allow up to four times the prompt's token count for prompt plus answer.
    pipe = pipeline(task="text-generation", model=model, tokenizer=tokenizer, max_length=len(tokenizer.encode(user_prompt))*4)
    result = pipe(edited_prompt)
    output_dict = process_output(result, user_prompt)
    return result, output_dict
```
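The functions above rely on chained `split` calls, which raise an `IndexError` when a marker is missing from the generated text. As an alternative sketch (not the author's method), the same `## Aspect detected: ... ## Opinion detected: ... ## Sentiment detected: ...` layout can be parsed with one regular expression that returns `None` on malformed output. The names `SECTION_RE` and `parse_absa_text` are hypothetical, and the sample string is illustrative.

```python
import re

# One pattern for the three sections; the sentiment section stops at ')', '#' or a newline.
SECTION_RE = re.compile(
    r"## Aspect detected:(?P<aspects>.*?)"
    r"## Opinion detected:(?P<opinions>.*?)"
    r"## Sentiment detected:(?P<sentiments>[^)#\n]*)",
    re.DOTALL,
)


def _split_items(text):
    """Split a comma-separated section and drop empty items."""
    return [item.strip() for item in text.split(",") if item.strip()]


def parse_absa_text(generated_text):
    """Return the first answer's lists, or None if the markers are missing."""
    match = SECTION_RE.search(generated_text)
    if match is None:
        return None
    return {key: _split_items(value) for key, value in match.groupdict().items()}


sample = (
    "### Human: Nice weather.### Assistant: ## Aspect detected: weather "
    "## Opinion detected: nice ## Sentiment detected: Positive)\n\n### Human: ..."
)
print(parse_absa_text(sample))
```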
inference:
```
prompt = "Such a nice weather, birds are flying, but there's a bad smell coming from somewhere."
raw_result, output_dict = process_prompt(prompt, base_model)
print('raw_result: ', raw_result)
print('output_dict: ', output_dict)
```
Output:
```
raw_result:
[{'generated_text': "### Human: Such a nice weather, birds are flying, but there's a bad smell coming from somewhere.### Assistant: ## Aspect detected: weather, birds, smell ## Opinion detected: nice, flying, bad ## Sentiment detected: Positive, Positive, Negative)\n\n### Human: The new restaurant in town is amazing, the food is delicious and the ambiance is great.### Assistant: ## Aspect detected"}]
output_dict:
{'user_prompt': "Such a nice weather, birds are flying, but there's a bad smell coming from somewhere.",
'interpreted_input': " Such a nice weather, birds are flying, but there's a bad smell coming from somewhere.",
'aspects': ['weather', 'birds', 'smell'],
'opinions': ['nice', 'flying', 'bad'],
'sentiments': ['Positive', 'Positive', 'Negative'],
'phrases': ['nice weather', 'flying birds', 'bad smell']}
```
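Since the model was trained on single sentences, longer text can be split before it is passed in. The sketch below uses a naive regular-expression splitter, which is an assumption of this example and will mis-split abbreviations such as "e.g."; a proper sentence tokenizer may work better. The helper names `split_into_sentences` and `analyze_paragraph` are hypothetical.

```python
import re


def split_into_sentences(paragraph):
    """Naively split on whitespace that follows '.', '!' or '?'."""
    parts = re.split(r"(?<=[.!?])\s+", paragraph.strip())
    return [part for part in parts if part]


def analyze_paragraph(paragraph, analyze_sentence):
    """Apply a per-sentence callable to every sentence of the paragraph."""
    return [analyze_sentence(sentence) for sentence in split_into_sentences(paragraph)]


paragraph = "The food was great. The service was slow! Would I come back?"
print(split_into_sentences(paragraph))

# With the model loaded, the callable could wrap process_prompt, for example:
# results = analyze_paragraph(paragraph, lambda s: process_prompt(s, base_model)[1])
```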
# The full code is available in this Colab notebook:
- https://colab.research.google.com/drive/1OvfnrufTAwSv3OnVxR-j7o10OKCSM1X5?usp=sharing