
llama-2-7b-absa-semeval-2016

Model Details

  • Model Name: Alpaca69B/llama-2-7b-absa-semeval-2016
  • Base Model: NousResearch/Llama-2-7b-chat-hf
  • Fine-Tuned On: Alpaca69B/semeval2016-full-absa-reviews-english-translated-resampled
  • Fine-Tuning Techniques: LoRA adapters, 4-bit precision base model loading (bitsandbytes), gradient checkpointing
  • Training Resources: Low; the 4-bit quantized base model with LoRA adapters was trained on a single GPU

Model Description

This model is an aspect-based sentiment analysis (ABSA) model, fine-tuned from Llama-2-7b-chat on an adjusted SemEval-2016 dataset. Given a review sentence, it generates the aspect mentioned in the sentence together with the sentiment expressed toward it.

Fine-Tuning Techniques

LoRA Attention

  • LoRA attention dimension: 64
  • Alpha parameter for LoRA scaling: 16
  • Dropout probability for LoRA layers: 0.1
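
These values map onto a peft LoraConfig roughly as follows (a sketch; the bias setting and task type are assumptions not stated in the card):

from peft import LoraConfig

# Hedged sketch of the LoRA setup; r, lora_alpha and lora_dropout are taken
# from the card, the remaining arguments are assumptions.
peft_config = LoraConfig(
    r=64,              # LoRA attention dimension
    lora_alpha=16,     # alpha parameter for LoRA scaling
    lora_dropout=0.1,  # dropout probability for LoRA layers
    bias="none",       # assumption: no bias parameters trained
    task_type="CAUSAL_LM",
)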

bitsandbytes (4-bit precision)

  • Activated 4-bit precision base model loading
  • Compute dtype for 4-bit base models: "float16"
  • Quantization type: "nf4"
  • Nested quantization for 4-bit base models: Disabled
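
In terms of the standard transformers/bitsandbytes API, these settings correspond roughly to the following quantization config (a sketch):

import torch
from transformers import BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                     # activate 4-bit base model loading
    bnb_4bit_compute_dtype=torch.float16,  # compute dtype for 4-bit base models
    bnb_4bit_quant_type="nf4",             # quantization type
    bnb_4bit_use_double_quant=False,       # nested quantization disabled
)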

TrainingArguments

  • Output directory: "./results"
  • Number of training epochs: 2
  • Enabled fp16/bf16 training: False
  • Batch size per GPU for training: 4
  • Batch size per GPU for evaluation: 4
  • Gradient accumulation steps: 1
  • Enabled gradient checkpointing: True
  • Maximum gradient norm (gradient clipping): 0.3
  • Initial learning rate: 2e-4
  • Weight decay: 0.001
  • Optimizer: paged_adamw_32bit
  • Learning rate scheduler: cosine
  • Maximum training steps: -1 (disabled; the number of training epochs is used instead)
  • Ratio of steps for linear warmup: 0.03
  • Group sequences into batches with the same length: True
  • Save checkpoint every X update steps: 0 (disabled)
  • Log every X update steps: 100
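
Expressed as a transformers TrainingArguments object, the configuration looks roughly like this (a sketch; only the values listed above are taken from the card):

from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="./results",
    num_train_epochs=2,
    fp16=False,
    bf16=False,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    gradient_accumulation_steps=1,
    gradient_checkpointing=True,
    max_grad_norm=0.3,
    learning_rate=2e-4,
    weight_decay=0.001,
    optim="paged_adamw_32bit",
    lr_scheduler_type="cosine",
    max_steps=-1,          # disabled: train for num_train_epochs instead
    warmup_ratio=0.03,
    group_by_length=True,
    save_steps=0,          # checkpoint saving disabled
    logging_steps=100,
)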

SFT (Supervised Fine-Tuning)

  • Maximum sequence length: Not specified
  • Packing multiple short examples in the same input sequence: False
  • Load the entire model on GPU 0
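
Putting the pieces together, the training setup corresponds roughly to a TRL SFTTrainer call like the one below (a sketch using the classic trl API; the dataset split and text field name are assumptions, as they are not stated in the card):

from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import SFTTrainer

base_model = "NousResearch/Llama-2-7b-chat-hf"
dataset = load_dataset(
    "Alpaca69B/semeval2016-full-absa-reviews-english-translated-resampled",
    split="train",  # assumption: training split name
)

model = AutoModelForCausalLM.from_pretrained(
    base_model,
    quantization_config=bnb_config,  # 4-bit config from above
    device_map={"": 0},              # load the entire model on GPU 0
)
tokenizer = AutoTokenizer.from_pretrained(base_model)

trainer = SFTTrainer(
    model=model,
    train_dataset=dataset,
    peft_config=peft_config,    # LoRA config from above
    dataset_text_field="text",  # assumption: name of the text column
    max_seq_length=None,        # maximum sequence length not specified
    tokenizer=tokenizer,
    args=training_args,         # TrainingArguments from above
    packing=False,              # do not pack multiple short examples
)
trainer.train()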

Evaluation

The model's performance and usage can be observed in the provided Google Colab notebook.

Model Usage

To use the model, follow the provided code snippet:

from transformers import AutoTokenizer
import transformers
import torch

model = "Alpaca69B/llama-2-7b-absa-semeval-2016"
tokenizer = AutoTokenizer.from_pretrained(model)
pipeline = transformers.pipeline(
    "text-generation",
    model=model,
    torch_dtype=torch.float16,
    device_map="auto",
)

def process_user_prompt(input_sentence):
    sequences = pipeline(
        f'### Human: {input_sentence} ### Assistant: aspect: ',
        do_sample=True,
        top_k=10,
        num_return_sequences=1,
        eos_token_id=tokenizer.eos_token_id,
        max_length=200,
    )
    result_dict = process_output(sequences[0]['generated_text'])
    return result_dict

def process_output(output):
    result_dict = {}

    # Extract user_prompt
    user_prompt_start = output.find("### Human:")
    user_prompt_end = output.find("aspect: ") + len("aspect: ")
    result_dict['user_prompt'] = output[user_prompt_start:user_prompt_end].strip()

    # Extract cleared_generated_output
    cleared_output_end = output.find(")")
    result_dict['cleared_generated_output'] = output[:cleared_output_end+1].strip()

    # Extract review
    human_start = output.find("Human:") + len("Human:")
    assistant_start = output.find("### Assistant:")
    result_dict['review'] = output[human_start:assistant_start].strip()

    # Extract aspect and sentiment
    aspect_start = output.find("aspect: ") + len("aspect: ")
    sentiment_start = output.find("sentiment: ")
    aspect_text = output[aspect_start:sentiment_start].strip()
    result_dict['aspect'] = aspect_text

    sentiment_end = output[sentiment_start:].find(")") + sentiment_start
    sentiment_text = output[sentiment_start+len("sentiment:"):sentiment_end].strip()
    result_dict['sentiment'] = sentiment_text

    return result_dict


output = process_user_prompt('the first thing that attracts attention is the warm reception and the smiling receptionists.')
print(output)
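
The returned dictionary contains the keys user_prompt, cleared_generated_output, review, aspect, and sentiment; aspect holds the extracted aspect term and sentiment its predicted polarity.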

Fine-Tuning Details

Details of the fine-tuning process are available in the fine-tuning Colab notebook.

Note: Ensure that the required dependencies (transformers, torch) are installed and that sufficient GPU memory is available before running the model.
