# Model Card: Llama-2-chat-finetuned

## Model Details
- Model Name: Llama-2-chat-finetuned
- Base Model: NousResearch/Llama-2-7b-chat-hf
- Fine-Tuned By: HiTruong
- Fine-Tuning Method: LoRA (Low-Rank Adaptation)
- Dataset: Movie-related dataset
- Evaluation Metric: BLEU Score
- BLEU Score Before Fine-Tuning: 33.26
- BLEU Score After Fine-Tuning: 77.53
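The card does not say which BLEU implementation produced these scores. As a rough illustration only, scores on the same 0-100 scale can be computed with the Hugging Face `evaluate` library; the example pairs below are hypothetical, since the evaluation set is not published here.

```python
import evaluate

# Hypothetical prediction/reference pair; the actual movie-related eval set is not shown in this card.
predictions = ["Christopher Nolan directed Inception."]
references = [["Inception was directed by Christopher Nolan."]]

bleu = evaluate.load("sacrebleu")  # sacreBLEU reports scores on a 0-100 scale
result = bleu.compute(predictions=predictions, references=references)
print(f"BLEU: {result['score']:.2f}")
```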
## Model Description
This model is a fine-tuned version of NousResearch/Llama-2-7b-chat-hf, optimized for movie-related conversations. The fine-tuning process was performed using LoRA to efficiently adapt the model while keeping computational requirements manageable. It is designed to improve conversational understanding and response generation for movie-related queries.
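As a minimal sketch of what LoRA adaptation looks like in code (assuming the `peft` library; the hyperparameter values are taken from the Training Details below, not from the original training script):

```python
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

base = AutoModelForCausalLM.from_pretrained("NousResearch/Llama-2-7b-chat-hf")

# LoRA freezes the base weights and injects small trainable low-rank matrices,
# so only a tiny fraction of parameters is updated during fine-tuning.
lora_config = LoraConfig(r=64, lora_alpha=16, lora_dropout=0.05, bias="none", task_type="CAUSAL_LM")
model = get_peft_model(base, lora_config)
model.print_trainable_parameters()
```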
## Training Details
- Hardware Used: Kaggle GPU (T4 x2)
- Fine-Tuning Framework: Hugging Face Transformers + LoRA
- Output Folder: `./results`
- Number of Epochs: 2
- Batch Size:
  - Per Device Train: 4
  - Per Device Eval: 4
- Gradient Accumulation Steps: 1
- Gradient Checkpointing: Enabled
- Max Gradient Norm: 0.3
- Mixed Precision: `fp16=False`, `bf16=False`
- Optimizer: `paged_adamw_32bit`
- Learning Rate: 2e-5
- Weight Decay: 0.001
- LR Scheduler Type: cosine
- Warmup Ratio: 0.03
- Max Steps: -1 (determined by epochs)
- Quantization Settings:
  - `use_4bit = True`
  - `bnb_4bit_compute_dtype = float16`
  - `bnb_4bit_quant_type = nf4`
  - `use_nested_quant = False`
- LoRA Hyperparameters:
  - `lora_r = 64`
  - `lora_alpha = 16`
  - `lora_dropout = 0.05`
- Sequence Length: Dynamic (`max_seq_length=None`)
- Packing: Disabled (`packing=False`)
- Device Map: `{"": 0}`
## Capabilities
- Answers movie-related questions with improved accuracy.
- Understands movie genres, actors, directors, and plots.
- Provides recommendations based on user preferences.
## Limitations
- May generate incorrect or biased information.
- Limited to the knowledge present in the training dataset.
- Does not have real-time access to new movie releases.
## Usage

You can load and use the model with the following code:
```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "HiTruong/Llama-2-chat-finetuned"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

def generate_answer(question):
    # Wrap the question in the Llama-2 chat prompt format.
    prompt = f"<s>[INST] {question} [/INST]"
    inputs = tokenizer(prompt, return_tensors="pt", truncation=True, max_length=100).to(model.device)
    with torch.no_grad():
        # max_new_tokens bounds the continuation; the original max_length=75
        # could be shorter than the prompt itself and yield no new tokens.
        output = model.generate(**inputs, max_new_tokens=75, eos_token_id=tokenizer.eos_token_id)
    response = tokenizer.decode(output[0], skip_special_tokens=True, clean_up_tokenization_spaces=True)
    # Strip the echoed prompt and keep only the first sentence of the answer.
    return response.replace(f"[INST] {question} [/INST]", "").strip().split('.')[0]

input_text = "What are some great sci-fi movies?"
print(generate_answer(input_text))
```
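If GPU memory is tight, the model can also be loaded in 4-bit precision for inference. This is a sketch assuming `bitsandbytes` is installed and a CUDA GPU is available; it is not required by the card.

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(load_in_4bit=True, bnb_4bit_compute_dtype=torch.float16, bnb_4bit_quant_type="nf4")
model = AutoModelForCausalLM.from_pretrained(
    "HiTruong/Llama-2-chat-finetuned",
    quantization_config=bnb_config,
    device_map="auto",  # place layers on the available GPU(s)
)
```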