# Model Card

## Model Details

- **Model Name:** Alpaca69B/llama-2-7b-absa-semeval-2016
- **Base Model:** NousResearch/Llama-2-7b-chat-hf
- **Fine-Tuned On:** Alpaca69B/semeval2016-full-absa-reviews-english-translated-resampled-train
- **Fine-Tuning Techniques:** LoRA attention, 4-bit precision base model loading, gradient checkpointing, etc.
- **Training Resources:** Low; the 4-bit quantized base model and LoRA adapters fit on a single GPU

## Model Description

This model is an aspect-based sentiment analysis (ABSA) model fine-tuned from the Llama-2-7b-chat model on an adjusted SemEval-2016 dataset: given a review sentence, it predicts the aspect discussed and the sentiment expressed toward it.

## Fine-Tuning Techniques

### LoRA Attention
- LoRA attention dimension: 64
- Alpha parameter for LoRA scaling: 16
- Dropout probability for LoRA layers: 0.1

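Mapped onto a `peft` `LoraConfig`, these values would look roughly as follows (a minimal sketch; the `bias` and `task_type` settings are common choices for causal-LM LoRA fine-tuning and are assumptions, not taken from the original training script):

```python
# Sketch of a LoRA configuration matching the values above (assumes the `peft` library).
from peft import LoraConfig

peft_config = LoraConfig(
    r=64,                   # LoRA attention dimension
    lora_alpha=16,          # alpha parameter for LoRA scaling
    lora_dropout=0.1,       # dropout probability for LoRA layers
    bias="none",            # assumption: typical default, not confirmed by the card
    task_type="CAUSAL_LM",  # assumption: typical for Llama-2 fine-tuning
)
```
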
### bitsandbytes (4-bit precision)
- 4-bit precision base model loading: Enabled
- Compute dtype for 4-bit base models: "float16"
- Quantization type: "nf4"
- Nested quantization for 4-bit base models: Disabled

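In code, these settings correspond to a `BitsAndBytesConfig` passed when loading the base model (a minimal sketch; the variable names are illustrative):

```python
# Sketch of the 4-bit quantization settings above, using the
# BitsAndBytesConfig integration in transformers.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                     # 4-bit precision base model loading
    bnb_4bit_compute_dtype=torch.float16,  # compute dtype for 4-bit base models
    bnb_4bit_quant_type="nf4",             # quantization type
    bnb_4bit_use_double_quant=False,       # nested quantization disabled
)

base_model = AutoModelForCausalLM.from_pretrained(
    "NousResearch/Llama-2-7b-chat-hf",
    quantization_config=bnb_config,
    device_map={"": 0},  # load the entire model on GPU 0 (see SFT section below)
)
```
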
### TrainingArguments
- Output directory: "./results"
- Number of training epochs: 1
- fp16/bf16 training: Disabled
- Batch size per GPU for training: 4
- Batch size per GPU for evaluation: 4
- Gradient accumulation steps: 1
- Gradient checkpointing: Enabled
- Maximum gradient norm (gradient clipping): 0.3
- Initial learning rate: 2e-4
- Weight decay: 0.001
- Optimizer: paged_adamw_32bit
- Learning rate scheduler: cosine
- Maximum training steps: -1 (disabled, so the number of epochs determines training length)
- Ratio of steps for linear warmup: 0.03
- Group sequences of the same length into batches: Enabled
- Save a checkpoint every X update steps: 0 (disabled)
- Log every X update steps: 25

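The list above translates into a `transformers.TrainingArguments` object roughly as follows (a sketch; unlisted arguments are left at their defaults):

```python
# Sketch of the training arguments listed above.
from transformers import TrainingArguments

training_arguments = TrainingArguments(
    output_dir="./results",
    num_train_epochs=1,
    fp16=False,
    bf16=False,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    gradient_accumulation_steps=1,
    gradient_checkpointing=True,
    max_grad_norm=0.3,
    learning_rate=2e-4,
    weight_decay=0.001,
    optim="paged_adamw_32bit",
    lr_scheduler_type="cosine",
    max_steps=-1,        # disabled; num_train_epochs controls training length
    warmup_ratio=0.03,
    group_by_length=True,
    save_steps=0,        # checkpoint saving disabled
    logging_steps=25,
)
```
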
### SFT (Supervised Fine-Tuning)
- Maximum sequence length: Not specified
- Packing multiple short examples into the same input sequence: Disabled
- Device placement: the entire model is loaded on GPU 0

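Putting the pieces together, the supervised fine-tuning run would be driven by TRL's `SFTTrainer` roughly as sketched below. This assumes the `base_model`, `peft_config`, and `training_arguments` objects from the sketches above, the older TRL API (`tokenizer`/`dataset_text_field` arguments), and an assumed dataset split and text column name:

```python
# Sketch of the supervised fine-tuning setup; builds on the objects
# defined in the sketches above (base_model, peft_config, training_arguments).
from datasets import load_dataset
from transformers import AutoTokenizer
from trl import SFTTrainer

tokenizer = AutoTokenizer.from_pretrained("NousResearch/Llama-2-7b-chat-hf")
tokenizer.pad_token = tokenizer.eos_token  # assumption: common padding choice

dataset = load_dataset(
    "Alpaca69B/semeval2016-full-absa-reviews-english-translated-resampled-train",
    split="train",  # assumption: split name not stated in the card
)

trainer = SFTTrainer(
    model=base_model,            # 4-bit quantized base model
    train_dataset=dataset,
    peft_config=peft_config,     # LoRA settings
    dataset_text_field="text",   # assumption: column name not stated in the card
    max_seq_length=None,         # maximum sequence length not specified
    tokenizer=tokenizer,
    args=training_arguments,
    packing=False,               # no packing of multiple short examples
)
trainer.train()
```
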
## Evaluation

The model's performance and usage can be observed in the provided [Google Colab notebook](https://colab.research.google.com/drive/1ArLpQFfXJiHcAT3VuYZndDqvskM6SkOM?usp=sharing).

## Model Usage

To use the model, follow the provided code snippet:

```python
from transformers import AutoTokenizer
import transformers
import torch

model = "Alpaca69B/llama-2-7b-absa-semeval-2016"
tokenizer = AutoTokenizer.from_pretrained(model)
pipeline = transformers.pipeline(
    "text-generation",
    model=model,
    torch_dtype=torch.float16,
    device_map="auto",
)

def process_user_prompt(input_sentence):
    # Generate a completion in the fine-tuning prompt format and parse it.
    sequences = pipeline(
        f'### Human: {input_sentence} ### Assistant: aspect: ',
        do_sample=True,
        top_k=10,
        num_return_sequences=1,
        eos_token_id=tokenizer.eos_token_id,
        max_length=200,
    )
    result_dict = process_output(sequences[0]['generated_text'])
    return result_dict

def process_output(output):
    # ... (output-parsing code provided in the linked Colab notebook)
    ...

output = process_user_prompt('the first thing that attracts attention is the warm reception and the smiling receptionists.')
print(output)
```

## Fine-Tuning Details

Details of the fine-tuning process are available in the [fine-tuning Colab notebook](https://colab.research.google.com/drive/1PQfBsDyM8TSSBchL6PPA4o6rOyLFLUnu?usp=sharing).

**Note:** Ensure that you have the necessary dependencies and resources before running the model.