---
license: apache-2.0
language:
- en
base_model:
- meta-llama/Llama-3.1-8B-Instruct
pipeline_tag: text-generation
---
# Cat1.0

## Overview
This repository provides a fine-tuned version of the Llama-3.1-8B-Instruct base model, optimized for roleplaying, logic, and reasoning tasks. Through iterative fine-tuning on self-generated chat logs, the model delivers engaging and coherent conversational experiences.
## Model Specifications
- Parameters: 8 Billion (8B)
- Precision: bf16 (Brain Floating Point 16-bit)
- Fine-Tuning Method: LoRA (Low-Rank Adaptation)
- Datasets Used:
  - Roleplay Dataset
  - Reasoning and Logic Dataset
- Fine-Tuning Approach: Iterative fine-tuning using self-chat logs
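The exact LoRA hyperparameters used for this model are not published with the card. Purely as an illustration of the method, a typical `peft` configuration for a Llama-style causal LM might look like the sketch below; the rank, alpha, dropout, and target modules shown are assumptions, not the values actually used for Cat1.0:

```python
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

# Illustrative values only -- the actual Cat1.0 hyperparameters are not published.
lora_config = LoraConfig(
    r=16,               # rank of the low-rank adapter matrices
    lora_alpha=32,      # scaling factor applied to the adapter output
    lora_dropout=0.05,  # dropout on adapter inputs during training
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # attention projections
    task_type="CAUSAL_LM",
)

base = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-3.1-8B-Instruct")
model = get_peft_model(base, lora_config)  # wrap the base model with trainable adapters
model.print_trainable_parameters()         # only the LoRA weights are trainable
```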
## Recommended Settings
To achieve optimal performance with this model, we recommend the following settings:
- Minimum Probability (`min_p`): `0.05`
- Temperature: `1.1` or higher
Note: Due to the nature of the fine-tuning, setting the temperature to `1.1` or higher helps prevent the model from repeating itself and encourages more creative and coherent responses.
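As a sketch of how these settings map onto the Hugging Face `transformers` generate API (recent `transformers` releases support `min_p` sampling directly; the repository id below is a placeholder, not the actual model path):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "your-namespace/Cat1.0"  # placeholder -- substitute the real repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)

inputs = tokenizer("Ryan: Hey, how's it going Natalie?\nNatalie:", return_tensors="pt")
outputs = model.generate(
    **inputs,
    do_sample=True,
    temperature=1.1,  # recommended: 1.1 or higher to curb repetition
    min_p=0.05,       # recommended minimum-probability cutoff
    max_new_tokens=128,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```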
## Usage Instructions
We recommend using the [oobabooga text-generation-webui](https://github.com/oobabooga/text-generation-webui) for an optimal experience. Load the model in `bf16` precision and enable `flash-attention2` for improved performance.
### Installation Steps
1. Clone the WebUI repository:

   ```bash
   git clone https://github.com/oobabooga/text-generation-webui
   cd text-generation-webui
   ```

2. Install dependencies:

   ```bash
   pip install -r requirements.txt
   ```

3. Download the model:

   Download the fine-tuned model from Hugging Face and place it in the `models` directory.

4. Launch the WebUI:

   ```bash
   python server.py --bf16 --flash-attention
   ```
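If you'd rather script against the model directly than use the WebUI, the same bf16 / FlashAttention-2 combination can be requested through the `transformers` loader. A minimal sketch, assuming the `flash-attn` package is installed and again using a placeholder repository id:

```python
import torch
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "your-namespace/Cat1.0",                  # placeholder -- substitute the real repo id
    torch_dtype=torch.bfloat16,               # bf16 precision, as recommended above
    attn_implementation="flash_attention_2",  # requires the flash-attn package and a supported GPU
    device_map="auto",
)
```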
## Sample Prompt Formats
You can interact with the model using either the chat format or the chat-instruct format. Here's an example:

```
Ryan is a computer engineer who works at Intel.

Ryan: Hey, how's it going Natalie?
Natalie: Good, how are things going with you, Ryan?
Ryan: Great, I'm doing just great.
```
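Because the base model is Llama-3.1-8B-Instruct, a chat-instruct prompt can also be built with the tokenizer's chat template. A minimal sketch, assuming the tokenizer ships the standard Llama 3.1 template (placeholder repo id again):

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("your-namespace/Cat1.0")  # placeholder repo id

messages = [
    {"role": "system", "content": "Ryan is a computer engineer who works at Intel."},
    {"role": "user", "content": "Hey, how's it going Natalie?"},
]

# Render the conversation in the Llama 3.1 instruct format and append
# the assistant header that cues the model to respond.
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
print(prompt)
```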
### Text Generation Example
## Model Capabilities

Below are some examples showcasing the model's performance in various tasks:

### Instruct Log Examples
## Limitations and Tips
While this model excels in chat and roleplaying scenarios, it isn't perfect. If you notice the model repeating itself or providing less coherent responses:

- Increase the temperature: setting the temperature higher (≥ `1.1`) can help generate more diverse and creative outputs.
- Adjust the `min_p` setting: keeping `min_p` at or above `0.05` filters out very unlikely tokens that a high temperature would otherwise sample, enhancing response quality.
## Acknowledgments

- [oobabooga text-generation-webui](https://github.com/oobabooga/text-generation-webui): a powerful interface for running and interacting with language models.
- [Hugging Face](https://huggingface.co): for hosting the model and providing a platform for collaboration.
## License

This model is released under the Apache 2.0 license, as declared in the model card metadata.
For any issues or questions, please open an issue in this repository.