tags:
- tuned
- sst
---

1. Overview

This repository showcases the fine-tuning of the Llama-3.2-1B model on the SST-2 dataset. The objective is binary sentiment classification: identifying whether a given sentence expresses positive or negative sentiment. The fine-tuning process focuses on task-specific optimization, adapting the pre-trained Llama model into an effective sentiment classifier.

2. Model Information

Model Used: meta-llama/Llama-3.2-1B

Pre-trained Parameters: The model comprises approximately 1.03 billion parameters, confirmed through code inspection and consistent with the official documentation.

Fine-tuned Parameters: The parameter count remains unchanged during fine-tuning, as the task updates existing model weights without adding new layers or parameters.

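A quick way to confirm the parameter count is to sum the sizes of all weight tensors in the loaded checkpoint. The snippet below is a minimal sketch, not part of the original training code, and assumes access to the gated meta-llama/Llama-3.2-1B checkpoint on the Hugging Face Hub.

```python
from transformers import AutoModelForCausalLM

# Load the pre-trained base model (gated repository; requires accepting the Llama license)
base = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-3.2-1B")

# Sum the element counts of every weight tensor (~1.03 billion for Llama-3.2-1B)
total_params = sum(p.numel() for p in base.parameters())
print(f"Total parameters: {total_params:,}")
```
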
3. Dataset and Task Details

Dataset: SST-2

The Stanford Sentiment Treebank (SST-2) dataset is widely used for binary sentiment classification tasks.

The dataset consists of sentences labeled with either positive or negative sentiment.

Task Objective

Train the model to classify sentences into the appropriate sentiment category based on contextual cues.

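For reference, SST-2 can be loaded directly from the Hugging Face Hub with the datasets library. The sketch below is illustrative (not taken from the original training code) and uses the GLUE distribution of SST-2, where each example carries a "sentence" field and a binary "label" (0 = negative, 1 = positive).

```python
from datasets import load_dataset

# SST-2 as distributed through the GLUE benchmark on the Hugging Face Hub
dataset = load_dataset("glue", "sst2")

# Each example has a "sentence" string and an integer "label" (0 = negative, 1 = positive)
print(dataset["train"][0])
print(dataset["train"].features["label"].names)
```
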
4. Fine-Tuning Approach

Train-Test Split: The dataset was split in an 80:20 ratio using stratified sampling to ensure balanced representation of the sentiment classes.

Tokenization: Input text was tokenized with padding and truncation to a maximum length of 128 tokens.

Model Training: Fine-tuning involved updating task-specific weights over three epochs with a learning rate of 2e-5.

Hardware: Training was performed on GPU-enabled hardware for accelerated computations.

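The training script itself is not included in this repository. The following sketch shows how the settings above (stratified 80:20 split, 128-token truncation, three epochs, learning rate 2e-5) could be wired together with the Hugging Face Trainer; the batch size and random seed are illustrative assumptions, not values taken from this README.

```python
from datasets import load_dataset
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
)

base_model = "meta-llama/Llama-3.2-1B"
tokenizer = AutoTokenizer.from_pretrained(base_model)
tokenizer.pad_token = tokenizer.eos_token  # Llama tokenizers ship without a pad token

# Stratified 80:20 split of the SST-2 training data (seed chosen here for illustration only)
split = load_dataset("glue", "sst2")["train"].train_test_split(
    test_size=0.2, seed=42, stratify_by_column="label"
)

def tokenize(batch):
    # Pad/truncate each sentence to the 128-token maximum described above
    return tokenizer(batch["sentence"], truncation=True, padding="max_length", max_length=128)

train_ds = split["train"].map(tokenize, batched=True)
eval_ds = split["test"].map(tokenize, batched=True)

model = AutoModelForSequenceClassification.from_pretrained(base_model, num_labels=2)
model.config.pad_token_id = tokenizer.pad_token_id  # required so padded positions are ignored

args = TrainingArguments(
    output_dir="sst2-llama-finetuned",
    num_train_epochs=3,              # three epochs, as noted above
    learning_rate=2e-5,              # learning rate used for fine-tuning
    per_device_train_batch_size=8,   # batch size not stated in this README; illustrative value
)

Trainer(model=model, args=args, train_dataset=train_ds, eval_dataset=eval_ds).train()
```
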
5. Results and Observations

Zero-shot vs. Fine-tuned Performance: The pre-trained Llama model in its zero-shot state exhibited moderate performance on SST-2. After fine-tuning, the model classified sentiments markedly more accurately.

Fine-tuning Benefits: Task-specific training allowed the model to better capture contextual nuances in the data, resulting in enhanced sentiment classification capabilities.

Model Parameters: The total number of parameters did not change during fine-tuning, indicating that the improvement in performance is attributable solely to the updated weights.

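The evaluation script behind this comparison is not included here. As one way to reproduce the fine-tuned numbers, accuracy can be measured on the official SST-2 validation split; the sketch below is illustrative only and reuses the placeholder repository id from the usage section further down.

```python
import torch
from datasets import load_dataset
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_name = "<your-huggingface-repo>/sst2-llama-finetuned"  # placeholder id, as in the usage example below
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name).eval()

validation = load_dataset("glue", "sst2")["validation"]

correct = 0
for example in validation:
    inputs = tokenizer(example["sentence"], return_tensors="pt", truncation=True, max_length=128)
    with torch.no_grad():
        pred = model(**inputs).logits.argmax(dim=-1).item()
    correct += int(pred == example["label"])

print(f"Validation accuracy: {correct / len(validation):.3f}")
```
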
6. How to Use the Fine-Tuned Model

Install Necessary Libraries:

```bash
pip install transformers datasets torch
```

Load the Fine-Tuned Model:

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_name = "<your-huggingface-repo>/sst2-llama-finetuned"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)
```

Make Predictions:

```python
import torch

text = "The movie was absolutely fantastic!"
inputs = tokenizer(text, return_tensors="pt", truncation=True, padding=True)

# Run inference without tracking gradients
with torch.no_grad():
    outputs = model(**inputs)

# Label 1 corresponds to positive sentiment, label 0 to negative
sentiment = "Positive" if outputs.logits.argmax(dim=-1).item() == 1 else "Negative"
print(f"Predicted Sentiment: {sentiment}")
```

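Optionally, the same checkpoint can be wrapped in the pipeline API for one-line inference. This is a convenience sketch, not part of the original instructions; note that the raw labels appear as LABEL_0/LABEL_1 unless id2label is set in the model config.

```python
from transformers import pipeline

# Placeholder repository id, as above
classifier = pipeline("text-classification", model="<your-huggingface-repo>/sst2-llama-finetuned")
print(classifier("The movie was absolutely fantastic!"))
```
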
7. Key Takeaways

Fine-tuning the Llama model for SST-2 significantly enhances its performance on binary sentiment classification tasks.

The parameter count of the model remains constant during fine-tuning, demonstrating that improvements are achieved by optimizing existing weights.

This work highlights the adaptability of Llama for downstream NLP tasks when fine-tuned on task-specific datasets.

8. Acknowledgments

Hugging Face Transformers library for facilitating model fine-tuning.

Stanford Sentiment Treebank for providing a robust dataset for sentiment classification.