# Model Card for vishanoberoi/Llama-2-7b-chat-hf-finedtuned-to-GGUF

This model is a fine-tuned version of Llama-2-7b-chat, trained on company-specific question-answer data. It is quantized for efficient inference while preserving output quality, and is intended for conversational AI applications.

## Model Details

The model was fine-tuned with QLoRA using PEFT. After fine-tuning, the adapters were merged into the base model, and the merged model was converted to a quantized GGUF file.

- **Developed by:** Vishan Oberoi and Dev Chandan
- **Model type:** Transformer-based large language model
- **Language(s) (NLP):** English
- **License:** MIT
- **Finetuned from model:** https://huggingface.co/meta-llama/Llama-2-7b-chat-hf

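The adapter-merging step mentioned under Model Details can be sketched roughly as follows. This is a minimal illustration, not the authors' actual script: the function name `merge_adapter` and its arguments are hypothetical, and it assumes the `torch`, `transformers`, and `peft` packages are installed. The GGUF conversion and quantization that follow the merge are handled by the tooling in the llama.cpp repository and are not shown here.

```python
def merge_adapter(base_model_id: str, adapter_dir: str, output_dir: str) -> None:
    """Merge a PEFT (LoRA/QLoRA) adapter into its base model and save the result.

    All three arguments are illustrative; the model card does not state the
    paths that were actually used.
    """
    # Imported lazily so this module can be loaded without the heavy dependencies.
    import torch
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # Load the base model in half precision; merging is commonly done on
    # unquantized weights rather than on the 4-bit weights used for training.
    base = AutoModelForCausalLM.from_pretrained(base_model_id, torch_dtype=torch.float16)

    # Attach the trained adapter, then fold its weights into the base layers.
    merged = PeftModel.from_pretrained(base, adapter_dir).merge_and_unload()
    merged.save_pretrained(output_dir)

    # Save the tokenizer alongside so the output directory is self-contained.
    AutoTokenizer.from_pretrained(base_model_id).save_pretrained(output_dir)
```

A call might look like `merge_adapter("meta-llama/Llama-2-7b-chat-hf", "./qlora-adapter", "./merged-model")`, where both local paths are placeholders. The resulting directory is a regular Hugging Face checkpoint that can then be passed to a GGUF conversion tool.
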
### Model Sources

- **Repository:** [vishanoberoi/Llama-2-7b-chat-hf-finedtuned-to-GGUF](https://huggingface.co/vishanoberoi/Llama-2-7b-chat-hf-finedtuned-to-GGUF)
- **Links:**
  - LLaMA: [LLaMA paper](https://arxiv.org/abs/2302.13971)
  - QLoRA: [QLoRA paper](https://arxiv.org/abs/2305.14314)
  - llama.cpp: [llama.cpp repository](https://github.com/ggerganov/llama.cpp)

## Uses

This model is intended for direct use in conversational AI, particularly for generating responses based on company-specific data. Typical applications include customer service bots, FAQ bots, and other settings that require accurate, contextually relevant answers.

## Usage notebook

https://colab.research.google.com/drive/1885wYoXeRjVjJzHqL9YXJr5ZjUQOSI-w?authuser=4#scrollTo=TZIoajzYYkrg

#### Example with `ctransformers`:

```python
from ctransformers import AutoModelForCausalLM

# Load the quantized GGUF weights from the Hugging Face Hub.
llm = AutoModelForCausalLM.from_pretrained(
    "vishanoberoi/Llama-2-7b-chat-hf-finedtuned-to-GGUF",
    model_file="finetuned.gguf",
    model_type="llama",
    gpu_layers=50,  # layers to offload to the GPU; use 0 for CPU-only inference
    max_new_tokens=2000,
    temperature=0.2,
    top_k=40,
    top_p=0.6,
    context_length=6000,  # note: Llama 2's native context window is 4096 tokens
)

system_prompt = '''<<SYS>>
You are a useful bot
<</SYS>>

'''

user_prompt = "Tell me about your company"

# Combine system prompt with user prompt
full_prompt = f"{system_prompt}\n[INST]{user_prompt}[/INST]"

# Generate the response
response = llm(full_prompt)

# Print the response
print(response)
```
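
When the model is queried with many different questions, the prompt assembly from the example can be wrapped in a small helper. The sketch below assumes the same prompt layout as the example above (a `<<SYS>>` block followed by the `[INST]`-wrapped question); the function name `build_prompt` and its default system message are illustrative and not part of `ctransformers`.

```python
def build_prompt(user_prompt: str, system_message: str = "You are a useful bot") -> str:
    """Return a prompt laid out like `full_prompt` in the ctransformers example."""
    # System block: the message wrapped in <<SYS>> tags, followed by a blank line.
    system_block = f"<<SYS>>\n{system_message}\n<</SYS>>\n\n"
    # The user question is wrapped in [INST] tags, as in the example.
    return f"{system_block}\n[INST]{user_prompt}[/INST]"
```

With a loaded model, a response could then be generated with `llm(build_prompt("Tell me about your company"))`, and the system message can be swapped per application without editing the template by hand.
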