
Orkhan/llama-2-7b-absa is a fine-tuned version of the Llama-2-7b model, optimized for Aspect-Based Sentiment Analysis (ABSA) using a manually labelled dataset of 2000 sentences. The fine-tuning equips the model to identify aspects and analyze their sentiment, making it a useful tool for nuanced sentiment analysis in diverse applications. Its advantage over traditional ABSA models is that you do not need to train a model on domain-specific labeled data, because llama-2-7b-absa generalizes well across domains. The trade-off is that it requires more computing power.


When running inference, note that the model was trained on sentences, not paragraphs. It fits in a free T4-GPU-enabled Google Colab notebook: https://colab.research.google.com/drive/1OvfnrufTAwSv3OnVxR-j7o10OKCSM1X5?usp=sharing

What does it do? You prompt it with a sentence and get back the aspects, opinions, sentiments, and phrases (opinion + aspect) contained in that sentence.

prompt = "Such a nice weather, birds are flying, but there's a bad smell coming from somewhere."
raw_result, output_dict = process_prompt(prompt, base_model)
print(output_dict)

>>>{'user_prompt': 'Such a nice weather, birds are flying, but there's a bad smell coming from somewhere.',
    'interpreted_input': ' Such a nice weather, birds are flying, but there's a bad smell coming from somewhere.',
    'aspects': ['weather', 'birds', 'smell'],
    'opinions': ['nice', 'flying', 'bad'],
    'sentiments': ['Positive', 'Positive', 'Negative'],
    'phrases': ['nice weather', 'flying birds', 'bad smell']}

Installation and usage:

install:

!pip install -q accelerate==0.21.0 peft==0.4.0 bitsandbytes==0.40.2 transformers==4.31.0 trl==0.4.7

import:

from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    BitsAndBytesConfig,
    HfArgumentParser,
    TrainingArguments,
    pipeline,
    logging,
)
from peft import LoraConfig, PeftModel
import torch
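
As noted above, inference fits on a free T4 Colab GPU. Before loading the model, it can help to confirm a CUDA device is actually visible. This is an optional sanity check, not part of the original notebook:

# optional: verify that a GPU is available before loading the model in FP16
if torch.cuda.is_available():
    print("CUDA device:", torch.cuda.get_device_name(0))
else:
    print("No GPU detected - the FP16 load with device_map={'': 0} below will fail.")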

Load the model and merge it with the LoRA weights:

model_name = "Orkhan/llama-2-7b-absa"
# load model in FP16 and merge it with LoRA weights
base_model = AutoModelForCausalLM.from_pretrained(
    model_name,
    low_cpu_mem_usage=True,
    return_dict=True,
    torch_dtype=torch.float16,
    device_map={"": 0},
)
base_model.config.use_cache = False
base_model.config.pretraining_tp = 1
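
The snippet above loads the full model in FP16. If GPU memory is tight, the same checkpoint can instead be loaded in 4-bit with the BitsAndBytesConfig imported earlier. This is only a sketch of an alternative loading path (the quantization settings shown are common defaults, not taken from the original notebook):

# alternative: 4-bit quantized loading to reduce GPU memory
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)

base_model = AutoModelForCausalLM.from_pretrained(
    model_name,
    quantization_config=bnb_config,
    device_map={"": 0},
)
base_model.config.use_cache = False
base_model.config.pretraining_tp = 1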

tokenizer:

tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
tokenizer.pad_token = tokenizer.eos_token
tokenizer.padding_side = "right"

For processing the input and output, it is recommended to use the following ABSA-related helper functions:

def process_output(result, user_prompt):
    # The pipeline returns '### Human: <input>### Assistant: ## Aspect detected: ...
    # ## Opinion detected: ... ## Sentiment detected: ...)' plus possible extra generations.
    generated_text = result[0]['generated_text']

    # Text the model actually saw as input, and its first answer (everything up to the closing ')').
    interpreted_input = generated_text.split('### Assistant:')[0].split('### Human:')[1]
    new_output = generated_text.split('### Assistant:')[1].split(')')[0].strip()

    # Pull the three comma-separated fields out of the answer template.
    aspects = new_output.split('Aspect detected:')[1].split('##')[0]
    opinions = new_output.split('Opinion detected:')[1].split('## Sentiment detected:')[0]
    sentiments = new_output.split('## Sentiment detected:')[1]

    # Split each field on commas (also works when only a single item is detected).
    aspect_list = [aspect.strip() for aspect in aspects.split(',') if aspect.strip()]
    opinion_list = [opinion.strip() for opinion in opinions.split(',') if opinion.strip()]
    sentiments_list = [sentiment.strip() for sentiment in sentiments.split(',') if sentiment.strip()]
    phrases = [opinion + ' ' + aspect for opinion, aspect in zip(opinion_list, aspect_list)]

    output_dict = {
        'user_prompt': user_prompt,
        'interpreted_input': interpreted_input,
        'aspects': aspect_list,
        'opinions': opinion_list,
        'sentiments': sentiments_list,
        'phrases': phrases
    }

    return output_dict
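
To sanity-check the parsing without loading the model, process_output can be fed a result in the same format the pipeline returns (the generated text below is taken from the example output further down):

# hypothetical pipeline output, reused from the raw_result example in the inference section
sample_result = [{'generated_text': "### Human: Such a nice weather, birds are flying, but there's a bad smell coming from somewhere.### Assistant: ## Aspect detected: weather, birds, smell ## Opinion detected: nice, flying, bad ## Sentiment detected: Positive, Positive, Negative)"}]
print(process_output(sample_result, "Such a nice weather, birds are flying, but there's a bad smell coming from somewhere."))
# -> {'aspects': ['weather', 'birds', 'smell'], 'opinions': ['nice', 'flying', 'bad'], 'sentiments': ['Positive', 'Positive', 'Negative'], ...}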


def process_prompt(user_prompt, model):
    # Wrap the sentence in the prompt template used during fine-tuning.
    edited_prompt = "### Human: " + user_prompt + '.###'
    # Cap generation at roughly 4x the prompt's token count so the answer template fits.
    pipe = pipeline(task="text-generation", model=model, tokenizer=tokenizer,
                    max_length=len(tokenizer.encode(user_prompt)) * 4)
    result = pipe(edited_prompt)

    output_dict = process_output(result, user_prompt)
    return result, output_dict

inference:

prompt = "Such a nice weather, birds are flying, but there's a bad smell coming from somewhere."
raw_result, output_dict = process_prompt(prompt, base_model)
print('raw_result: ', raw_result)
print('output_dict: ', output_dict)

Output:

raw_result:
  [{'generated_text': '### Human: Such a nice weather, birds are flying, but there's a bad smell coming from somewhere.### Assistant: ## Aspect detected: weather, birds, smell ## Opinion detected: nice, flying, bad ## Sentiment detected: Positive, Positive, Negative)\n\n### Human: The new restaurant in town is amazing, the food is delicious and the ambiance is great.### Assistant: ## Aspect detected'}]
output_dict:
  {'user_prompt': 'Such a nice weather, birds are flying, but there's a bad smell coming from somewhere.',
  'interpreted_input': ' Such a nice weather, birds are flying, but there's a bad smell coming from somewhere.',
  'aspects': ['weather', 'birds', 'smell'],
  'opinions': ['nice', 'flying', 'bad'],
  'sentiments': ['Positive', 'Positive', 'Negative'],
  'phrases': ['nice weather', 'flying birds', 'bad smell']}
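
Because the model was trained on individual sentences rather than paragraphs, longer texts are best split into sentences and processed one at a time. A minimal sketch (the regex splitter here is only an illustration, not part of the original notebook; use nltk or spaCy for anything serious):

import re

paragraph = ("Such a nice weather, birds are flying, but there's a bad smell coming from somewhere. "
             "The new restaurant in town is amazing, the food is delicious and the ambiance is great.")

# naive sentence splitter: break after ., ! or ? followed by whitespace
sentences = [s.strip() for s in re.split(r'(?<=[.!?])\s+', paragraph) if s.strip()]

for sentence in sentences:
    _, output_dict = process_prompt(sentence, base_model)
    print(output_dict['phrases'])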

The complete code is available in this Colab notebook: https://colab.research.google.com/drive/1OvfnrufTAwSv3OnVxR-j7o10OKCSM1X5?usp=sharing
