vaishnavkoka committed
Commit 85b65d3
1 Parent(s): 527f37d

Update README.md

Files changed (1): README.md (+51, -0)
README.md CHANGED

tags:
- tuned
- sst
---

1. Overview
This repository documents fine-tuning of the Llama-3.2-1B model on the SST-2 dataset. The objective is binary sentiment classification: deciding whether a given sentence expresses positive or negative sentiment. Fine-tuning applies task-specific optimization to adapt the pre-trained Llama model into an effective sentiment classifier.

2. Model Information
- Model Used: meta-llama/Llama-3.2-1B
- Pre-trained Parameters: approximately 1.03 billion parameters, confirmed through code inspection (see the sketch below) and consistent with the official documentation.
- Fine-tuned Parameters: the parameter count remains unchanged during fine-tuning, since the task updates existing model weights without adding new layers or parameters.
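
As a quick check, the parameter count can be reproduced by loading the checkpoint and summing tensor sizes. A minimal sketch, assuming access to the gated meta-llama/Llama-3.2-1B repository has been granted:

```python
# Sketch: count the parameters of the pre-trained checkpoint.
from transformers import AutoModelForCausalLM

base = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-3.2-1B")
n_params = sum(p.numel() for p in base.parameters())
print(f"Pre-trained parameters: {n_params:,}")
```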

3. Dataset and Task Details
Dataset: SST-2
- The Stanford Sentiment Treebank (SST-2) dataset is widely used for binary sentiment classification.
- Each sentence is labeled as expressing either positive or negative sentiment (a loading sketch is shown below).

Task Objective
- Train the model to classify sentences into the correct sentiment category based on contextual cues.
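
For reference, SST-2 can be loaded directly from the Hugging Face Hub. A small sketch using the GLUE copy of the dataset (which mirror was actually used for this work is an assumption):

```python
# Sketch: load SST-2 via the GLUE benchmark and inspect one example.
from datasets import load_dataset

sst2 = load_dataset("glue", "sst2")
print(sst2["train"][0])  # fields: sentence, label (0 = negative, 1 = positive), idx
```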

4. Fine-Tuning Approach
- Train-Test Split: the data was split 80:20 using stratified sampling to keep the sentiment classes balanced.
- Tokenization: input text was tokenized with padding and truncation to a maximum length of 128 tokens.
- Model Training: fine-tuning updated the task-specific weights over three epochs with a learning rate of 2e-5.
- Hardware: training was performed on GPU-enabled hardware for accelerated computation. A sketch of the full recipe follows this list.
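
The exact training script is not part of this repository; the sketch below reconstructs the recipe from the points above. The batch sizes, seed, and output directory are illustrative assumptions, and the pad-token handling is needed because the Llama tokenizer ships without a padding token:

```python
# Sketch of the fine-tuning recipe described above. Hyperparameters other than
# epochs, learning rate, and max length are illustrative assumptions.
# Environment: pip install transformers datasets accelerate torch
from datasets import load_dataset
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
)

model_id = "meta-llama/Llama-3.2-1B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
tokenizer.pad_token = tokenizer.eos_token  # Llama has no pad token by default

model = AutoModelForSequenceClassification.from_pretrained(model_id, num_labels=2)
model.config.pad_token_id = tokenizer.pad_token_id

# 80:20 stratified split of the SST-2 training data
raw = load_dataset("glue", "sst2")["train"]
split = raw.train_test_split(test_size=0.2, seed=42, stratify_by_column="label")

def tokenize(batch):
    # Padding/truncation to a maximum of 128 tokens, as described above
    return tokenizer(batch["sentence"], padding="max_length", truncation=True, max_length=128)

tokenized = split.map(tokenize, batched=True)

args = TrainingArguments(
    output_dir="sst2-llama-finetuned",  # hypothetical output path
    num_train_epochs=3,
    learning_rate=2e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=32,
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=tokenized["train"],
    eval_dataset=tokenized["test"],
)
trainer.train()
print(trainer.evaluate())
```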

5. Results and Observations
- Zero-shot vs. Fine-tuned Performance: in its zero-shot state, the pre-trained Llama model showed only moderate performance on SST-2; after fine-tuning, its ability to classify sentiment accurately improved significantly (a way to reproduce this comparison is sketched below).
- Fine-tuning Benefits: task-specific training let the model capture contextual nuances in the data, improving its sentiment classification.
- Model Parameters: the total number of parameters did not change during fine-tuning, so the performance gain is attributable solely to the updated weights.
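
No exact accuracy figures are reported here; the zero-shot vs. fine-tuned comparison can be quantified by running the same accuracy loop over the SST-2 validation split before and after training. A minimal, unbatched sketch, assuming `model` and `tokenizer` are already loaded as in the usage section below:

```python
# Sketch: accuracy on the SST-2 validation split for an already-loaded
# `model` and `tokenizer` (unbatched, so slow but easy to follow).
import torch
from datasets import load_dataset

validation = load_dataset("glue", "sst2")["validation"]
correct = 0
model.eval()
with torch.no_grad():
    for example in validation:
        inputs = tokenizer(example["sentence"], return_tensors="pt",
                           truncation=True, max_length=128)
        pred = model(**inputs).logits.argmax(dim=-1).item()
        correct += int(pred == example["label"])
print(f"Validation accuracy: {correct / len(validation):.3f}")
```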

6. How to Use the Fine-Tuned Model
Install the necessary libraries:

```bash
pip install transformers datasets torch
```

Load the fine-tuned model:

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_name = "<your-huggingface-repo>/sst2-llama-finetuned"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)
```

Make predictions:

```python
import torch

text = "The movie was absolutely fantastic!"
inputs = tokenizer(text, return_tensors="pt", truncation=True, padding=True)
with torch.no_grad():
    outputs = model(**inputs)
sentiment = "Positive" if outputs.logits.argmax(dim=-1).item() == 1 else "Negative"
print(f"Predicted Sentiment: {sentiment}")
```
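
Equivalently, the transformers pipeline API can wrap the same model for one-line inference. The label strings it returns (for example LABEL_0/LABEL_1) depend on how id2label was set in the saved config:

```python
# Optional: the same prediction through the text-classification pipeline.
from transformers import pipeline

classifier = pipeline("text-classification", model=model, tokenizer=tokenizer)
print(classifier("The movie was absolutely fantastic!"))
```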

7. Key Takeaways
- Fine-tuning the Llama model on SST-2 significantly improves its performance on binary sentiment classification.
- The model's parameter count stays constant during fine-tuning, showing that the improvements come from optimizing existing weights.
- This work highlights the adaptability of Llama to downstream NLP tasks when fine-tuned on task-specific datasets.

8. Acknowledgments
- Hugging Face Transformers library, for facilitating model fine-tuning.
- Stanford Sentiment Treebank (SST-2), for providing a robust dataset for sentiment classification.