# Model Card

## Model Details

- **Model Name:** Alpaca69B/llama-2-7b-absa-semeval-2016
- **Base Model:** NousResearch/Llama-2-7b-chat-hf
- **Fine-Tuned On:** Alpaca69B/semeval2016-full-absa-reviews-english-translated-resampled-train
- **Fine-Tuning Techniques:** LoRA attention, 4-bit precision base-model loading, gradient checkpointing, and related low-resource techniques (detailed below)
- **Training Resources:** Low resource usage (4-bit quantized base model with LoRA adapters on a single GPU)

## Model Description

This model is an aspect-based sentiment analysis (ABSA) model, fine-tuned from the Llama-2-7b-chat model on an adjusted SemEval-2016 reviews dataset. Given a review sentence, it generates the aspect being discussed and the sentiment expressed toward it.
## Fine-Tuning Techniques

### LoRA Attention
- LoRA attention dimension (rank): 64
- Alpha parameter for LoRA scaling: 16
- Dropout probability for LoRA layers: 0.1

As an illustration, these values map onto a `peft.LoraConfig` roughly as sketched below.
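The sketch is an illustration only, not the author's training code; the `bias` setting and `task_type` are assumed standard defaults for causal-LM LoRA fine-tuning.

```python
from peft import LoraConfig

# Illustrative LoRA configuration using the values listed above.
# bias="none" and task_type="CAUSAL_LM" are assumed defaults, not stated in the card.
peft_config = LoraConfig(
    r=64,              # LoRA attention dimension (rank)
    lora_alpha=16,     # alpha parameter for LoRA scaling
    lora_dropout=0.1,  # dropout probability for LoRA layers
    bias="none",
    task_type="CAUSAL_LM",
)
```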
### bitsandbytes (4-bit precision)
- 4-bit precision base-model loading: enabled
- Compute dtype for 4-bit base models: "float16"
- Quantization type: "nf4"
- Nested quantization for 4-bit base models: disabled

A hedged sketch of the corresponding `BitsAndBytesConfig` follows.
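The snippet assumes the base model is loaded through `transformers` with this quantization config; it illustrates the listed settings rather than reproducing the author's notebook.

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# Illustrative 4-bit quantization settings matching the list above.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                      # 4-bit base-model loading
    bnb_4bit_compute_dtype=torch.float16,   # compute dtype for 4-bit base models
    bnb_4bit_quant_type="nf4",              # quantization type
    bnb_4bit_use_double_quant=False,        # nested quantization disabled
)

# Hypothetical loading of the base model with the 4-bit config.
base_model = AutoModelForCausalLM.from_pretrained(
    "NousResearch/Llama-2-7b-chat-hf",
    quantization_config=bnb_config,
)
```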
### TrainingArguments
- Output directory: "./results"
- Number of training epochs: 1
- fp16/bf16 mixed-precision training: disabled
- Batch size per GPU for training: 4
- Batch size per GPU for evaluation: 4
- Gradient accumulation steps: 1
- Gradient checkpointing: enabled
- Maximum gradient norm (gradient clipping): 0.3
- Initial learning rate: 2e-4
- Weight decay: 0.001
- Optimizer: paged_adamw_32bit
- Learning rate scheduler: cosine
- Maximum training steps: -1 (not set, so num_train_epochs determines the run length)
- Ratio of steps for linear warmup: 0.03
- Group sequences into batches with the same length: True
- Save checkpoint every X update steps: 0 (disabled)
- Log every X update steps: 25

A `transformers.TrainingArguments` sketch with these values follows.
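This is a direct, illustrative mapping of the bullet points onto the standard Hugging Face argument names; it is not copied from the author's notebook.

```python
from transformers import TrainingArguments

# Illustrative TrainingArguments mirroring the values listed above.
training_args = TrainingArguments(
    output_dir="./results",
    num_train_epochs=1,
    fp16=False,
    bf16=False,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    gradient_accumulation_steps=1,
    gradient_checkpointing=True,
    max_grad_norm=0.3,
    learning_rate=2e-4,
    weight_decay=0.001,
    optim="paged_adamw_32bit",
    lr_scheduler_type="cosine",
    max_steps=-1,          # not set, so num_train_epochs drives the run length
    warmup_ratio=0.03,
    group_by_length=True,
    save_steps=0,          # checkpoint saving disabled
    logging_steps=25,
)
```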
### SFT (Supervised Fine-Tuning)
- Maximum sequence length: not specified
- Packing multiple short examples in the same input sequence: False
- The entire model is loaded on GPU 0

A sketch of how these pieces might be wired together with `trl`'s `SFTTrainer` follows.
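The sketch uses the older `trl` `SFTTrainer` signature (newer releases move several of these options into `SFTConfig`), and the dataset text column name, pad-token handling, and split name are assumptions rather than details taken from the card.

```python
import torch
from datasets import load_dataset
from peft import LoraConfig
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          BitsAndBytesConfig, TrainingArguments)
from trl import SFTTrainer

base_model = "NousResearch/Llama-2-7b-chat-hf"

# 4-bit quantized base model, loaded entirely on GPU 0 (see the sections above).
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.float16,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=False,
)
model = AutoModelForCausalLM.from_pretrained(
    base_model,
    quantization_config=bnb_config,
    device_map={"": 0},  # load the entire model on GPU 0
)

tokenizer = AutoTokenizer.from_pretrained(base_model)
tokenizer.pad_token = tokenizer.eos_token  # Llama-2 ships without a pad token

dataset = load_dataset(
    "Alpaca69B/semeval2016-full-absa-reviews-english-translated-resampled-train",
    split="train",  # split name assumed
)

trainer = SFTTrainer(
    model=model,
    train_dataset=dataset,
    peft_config=LoraConfig(r=64, lora_alpha=16, lora_dropout=0.1,
                           bias="none", task_type="CAUSAL_LM"),
    dataset_text_field="text",  # assumed column name, not stated in the card
    max_seq_length=None,        # maximum sequence length not specified
    tokenizer=tokenizer,
    args=TrainingArguments(
        output_dir="./results",
        num_train_epochs=1,
        per_device_train_batch_size=4,
        learning_rate=2e-4,
        optim="paged_adamw_32bit",
        logging_steps=25,
    ),
    packing=False,              # no packing of short examples
)
trainer.train()
```

The full procedure is in the fine-tuning notebook linked in the Fine-Tuning Details section below.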
## Evaluation

The model's performance and usage can be observed in the provided [Google Colab notebook](https://colab.research.google.com/drive/1ArLpQFfXJiHcAT3VuYZndDqvskM6SkOM?usp=sharing).

## Model Usage

To use the model, follow the provided code snippet:
```python
from transformers import AutoTokenizer
import transformers
import torch

model = "Alpaca69B/llama-2-7b-absa-semeval-2016"
tokenizer = AutoTokenizer.from_pretrained(model)
pipeline = transformers.pipeline(
    "text-generation",
    model=model,
    torch_dtype=torch.float16,
    device_map="auto",
)

def process_user_prompt(input_sentence):
    # Prompt the model in the format used during fine-tuning and parse the result.
    sequences = pipeline(
        f'### Human: {input_sentence} ### Assistant: aspect: ',
        do_sample=True,
        top_k=10,
        num_return_sequences=1,
        eos_token_id=tokenizer.eos_token_id,
        max_length=200,
    )
    result_dict = process_output(sequences[0]['generated_text'])
    return result_dict

def process_output(output):
    # ... (provided code for processing output)
    # Placeholder so the snippet runs as-is; see the parsing sketch below.
    return output

output = process_user_prompt('the first thing that attracts attention is the warm reception and the smiling receptionists.')
print(output)
```
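The card elides the body of `process_output`. One possible implementation is sketched below purely as an illustration: it assumes the model answers in an `aspect: <aspect> sentiment: <sentiment>` format after the `### Assistant:` marker, which is not confirmed by the card.

```python
def process_output(output):
    # Hypothetical parser; the response format is an assumption, not the author's code.
    assistant_part = output.split('### Assistant:', 1)[-1]
    result = {}
    if 'aspect:' in assistant_part:
        after_aspect = assistant_part.split('aspect:', 1)[1]
        if 'sentiment:' in after_aspect:
            aspect, sentiment = after_aspect.split('sentiment:', 1)
            result['aspect'] = aspect.strip()
            result['sentiment'] = sentiment.strip().split()[0] if sentiment.strip() else ''
        else:
            result['aspect'] = after_aspect.strip()
    return result
```

For the post-processing actually used with this model, refer to the evaluation notebook linked in the Evaluation section.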
## Fine-Tuning Details

Details of the fine-tuning process are available in the [fine-tuning Colab notebook](https://colab.research.google.com/drive/1PQfBsDyM8TSSBchL6PPA4o6rOyLFLUnu?usp=sharing).

**Note:** Ensure that the necessary dependencies (e.g., `transformers`, `torch`, and `accelerate` for `device_map="auto"`) and sufficient GPU resources are available before running the model.