---
license: other
language:
- en
pipeline_tag: conversational
inference: false
tags:
- AI
- ConversationalAI
---
<h1 style="text-align: center">LLmRa-2.7B</h1>
<h2 style="text-align: center">A conversational Open Pre-trained Transformer Language Model fine-tune.</h2>
**LLmRa 2.7B** is a proof-of-concept fine-tune of [facebook/opt-2.7b](https://huggingface.co/facebook/opt-2.7b), optimized for dialogue.
**Disclaimer:** NSFW data was included in the fine-tuning of this model. Although SFW inputs will usually result in SFW outputs, you are advised to **chat at your own risk.**
**Warning:** This model is **NOT** suitable for use by minors. **It will output X-rated content under certain circumstances.**
**This model was fine-tuned on a small test dataset; version 2, or a higher-parameter model, will be trained on the full dataset.**
---
## Usage Format
To effectively utilize the model, follow this structured format for engaging text-based conversations:
**1. Initialization**
Here is how you can define the personality of the language model:
```
<|system|>[Persona]
```
- **Persona**: You can define a specific persona or context for the AI, but it's optional. It can be a character, a role, or just a style of interaction.
**2. User Input**
```
<|user|>[User input]<|model|>
```
- Users start the conversation by entering their message after `<|user|>` and closing with `<|model|>`, which cues the model to respond.
---
### Example Usage:
Here's an example of how to start a conversation with the AI:
```
<|system|>I'm here to provide information and assistance on a wide range of topics.
<|model|>Hello! Welcome to our AI-powered assistant. How can I assist you today?
<|user|>Tell me about the history of artificial intelligence.
<|model|>
```
Continue the conversation as needed. This structured format helps maintain a smooth and engaging interaction with the AI.
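For longer conversations, append each completed exchange to the prompt before the new `<|user|>` turn. A sketch of what this looks like (the model reply shown is illustrative):
```
<|system|>I'm here to provide information and assistance on a wide range of topics.
<|user|>Tell me about the history of artificial intelligence.<|model|>AI research began as an academic field in the 1950s...
<|user|>Who coined the term?<|model|>
```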
You are not required to include `User`; you can change it to your preferred name or leave it blank. You may also add the AI's name, for example:
```
<|user|>YourNameHere: Hello.<|model|>CharacterName:
```
You can also use this instruct prompt example:
```
<|system|>What is one plus one?<|model|>
```
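If you build prompts programmatically, here is a minimal sketch of a formatting helper (the `build_prompt` function is illustrative only, not part of the model or this repository):
```Python
# Hypothetical helper: assembles a prompt in the tag format described above.
def build_prompt(persona, history, user_input):
    prompt = f"<|system|>{persona}"
    for user_turn, model_turn in history:  # history: list of (user, model) pairs
        prompt += f"<|user|>{user_turn}<|model|>{model_turn}"
    prompt += f"<|user|>{user_input}<|model|>"
    return prompt

print(build_prompt(
    persona="I'm here to provide information and assistance on a wide range of topics.",
    history=[("Hello.", "Hello! How can I assist you today?")],
    user_input="Tell me about the history of artificial intelligence.",
))
```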
## Loading The Model
To use the model and interact with it, use the Python code below:
```Python
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline

model = AutoModelForCausalLM.from_pretrained('L-R/LLmRa-2.7B')
tokenizer = AutoTokenizer.from_pretrained('L-R/LLmRa-2.7B')

pipe = pipeline(task="text-generation", model=model, tokenizer=tokenizer, max_length=100)

input_question = 'QUESTION HERE'
question_formatted = f'<|system|>{input_question}<|model|>'

result = pipe(question_formatted)

# Strip the prompt from the generated text, keeping only the model's reply.
print(f"[model]: {result[0]['generated_text'][len(question_formatted):]}")
```
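For multi-turn chat with the same pipeline, one approach (a sketch assuming the tag format above, not the only way) is to keep a running transcript and re-feed it each turn:
```Python
# Sketch: multi-turn chat, re-feeding the transcript each turn.
# Re-create the pipeline without a fixed max_length so the prompt can grow.
chat_pipe = pipeline(task="text-generation", model=model, tokenizer=tokenizer)

transcript = "<|system|>I'm here to provide information and assistance on a wide range of topics."

for user_input in ["Hello.", "Tell me about the history of artificial intelligence."]:
    transcript += f"<|user|>{user_input}<|model|>"
    result = chat_pipe(transcript, max_new_tokens=100, do_sample=True)
    reply = result[0]["generated_text"][len(transcript):]
    reply = reply.split("<|user|>")[0].strip()  # stop if the model starts a new user turn
    print(f"[model]: {reply}")
    transcript += reply
```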
Or the more complex one:
```Python
import os
import random
import sys
import time
import json

import torch
from transformers import (AutoTokenizer,
                          AutoModelForCausalLM,
                          BitsAndBytesConfig,
                          set_seed)

# Distributed/runtime environment info (not used directly below).
local_rank = int(os.getenv('LOCAL_RANK', '0'))
world_size = int(os.getenv('WORLD_SIZE', '1'))
local_tokenizer = os.getenv('TOKENIZERS_PARALLELISM', 'false').lower() == 'true'


class Chatbot:
    def __init__(self, config):
        self.device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
        self.tokenizer = None
        self.config = config
        self.persona = None
        self.model = None
        self.history = []
        self.load_model()

    def create_persona(self, persona_data):
        required_keys = ['name', 'description', 'greeting']
        if not all(key in persona_data for key in required_keys):
            raise ValueError(
                "Missing required keys in persona_data. Please provide 'name', 'description', and 'greeting'.")
        new_persona_id = str(max(int(key) for key in self.config["personas"].keys()) + 1)
        self.config["personas"][new_persona_id] = persona_data
        return new_persona_id

    def load_model(self):
        model_path = self.config["model_path"]
        tokenizer_path = self.config["tokenizer_path"]

        if not model_path or not tokenizer_path:
            raise ValueError('model_path or tokenizer_path not found! Define both.')

        if self.config['load_model_4bit'] and self.config['load_model_8bit']:
            raise ValueError("You can't load the model in 8 bits and 4 bits at the same time!")

        if not self.config['user_name']:
            print('You have not selected a name! No name will be sent to the model.')

        # Build the quantization config only for the mode that was requested.
        if self.config['load_model_4bit']:
            quantization_config = BitsAndBytesConfig(
                load_in_4bit=True,
                bnb_4bit_quant_type='nf4',
                bnb_4bit_compute_dtype=torch.float16,
                bnb_4bit_use_double_quant=True,
            )
        elif self.config['load_model_8bit']:
            quantization_config = BitsAndBytesConfig(load_in_8bit=True)
        else:
            quantization_config = None

        print(f"\nLoading model: {model_path}")

        if torch.cuda.is_available():
            self.model = AutoModelForCausalLM.from_pretrained(
                model_path,
                use_auth_token=self.config['model_token'],
                quantization_config=quantization_config)
            if quantization_config is None:
                # Quantized models are placed on the GPU automatically;
                # a full-precision model has to be moved explicitly.
                self.model = self.model.to(self.device)
            if torch.cuda.device_count() > 1:
                self.model = torch.nn.DataParallel(self.model)
                model_running_on = f'{torch.cuda.device_count()} GPUs'
            else:
                model_running_on = '1 GPU'
        else:
            # bitsandbytes quantization requires a GPU; load in full precision on CPU.
            self.model = AutoModelForCausalLM.from_pretrained(
                model_path,
                use_auth_token=self.config['model_token']).to(
                self.device
            )
            model_running_on = 'CPU'

        print(f'Model is running on: {model_running_on}')

        self.tokenizer = AutoTokenizer.from_pretrained(tokenizer_path, use_auth_token=self.config['model_token'])

    def load_persona(self, persona_id):
        personas = self.config["personas"]
        if persona_id in personas:
            self.persona = personas[persona_id]
        else:
            raise ValueError("Invalid persona ID")

    def formatting_question(self, user_input, history):
        config_user = self.config['use_names']['user']
        config_model = self.config['use_names']['model']
        config_question = self.config['use_question_template']
        if config_question:
            formatted_answer = f'<|system|>{user_input}<|model|>'
        else:
            m_ = self.persona["description"]
            g_ = self.persona["greeting"]
            n_ = self.persona["name"]
            un_ = self.config["user_name"]
            if config_user and config_model:
                formatted_answer = (
                    f'<|system|>{m_}<|model|>{n_}: {g_}{history}<|user|>{un_}: {user_input}<|model|>{n_}:'
                )
            elif config_user:
                formatted_answer = (
                    f'<|system|>{m_}<|model|>{g_}{history}<|user|>{un_}: {user_input}<|model|>'
                )
            elif config_model:
                formatted_answer = (
                    f'<|system|>{m_}<|model|>{n_}: {g_}{history}<|user|>{user_input}<|model|>{n_}:'
                )
            else:
                formatted_answer = (
                    f'<|system|>{m_}<|model|>{g_}{history}<|user|>{user_input}<|model|>'
                )
        return formatted_answer

    def history_formatting(self, last_input, last_output):
        config_user = self.config['use_names']['user']
        config_model = self.config['use_names']['model']
        n_ = self.persona["name"]
        un_ = self.config["user_name"]
        if config_user and config_model:
            formatted_answer = f'<|user|>{un_}: {last_input}<|model|>{n_}: {last_output}'
        elif config_user:
            formatted_answer = f'<|user|>{un_}: {last_input}<|model|>{last_output}'
        elif config_model:
            formatted_answer = f'<|user|>{last_input}<|model|>{n_}: {last_output}'
        else:
            formatted_answer = f'<|user|>{last_input}<|model|>{last_output}'
        return formatted_answer

    def reply(self, user_input):
        config_question = self.config['use_question_template']
        set_seed(random.randint(1, 1000))
        user_input = " ".join(user_input.split())

        # Truncate the conversation history to the configured window.
        if len(self.history) > self.config["history_length"]:
            model_history = "\n".join([str(item) for item in self.history[-self.config["history_length"]:]])
        else:
            model_history = "\n".join([str(item) for item in self.history])

        input_ai = self.formatting_question(user_input, model_history).strip()
        tokenized_input_ai = self.tokenizer.encode(input_ai, return_tensors="pt")

        # DataParallel does not expose generate(); use the wrapped module if needed.
        model = self.model.module if isinstance(self.model, torch.nn.DataParallel) else self.model
        output_ids = model.generate(
            input_ids=tokenized_input_ai.to(self.device),
            max_length=self.config["max_generation_length"] + len(tokenized_input_ai[0]),
            no_repeat_ngram_size=self.config["no_repeat_ngram_size"],
            repetition_penalty=self.config["repetition_penalty"],
            length_penalty=self.config["length_penalty"],
            pad_token_id=self.tokenizer.eos_token_id,
            temperature=self.config["temperature"],
            top_k=self.config["top_k"],
            top_p=self.config["top_p"],
            early_stopping=True,
            use_cache=True,
            do_sample=True,
        )

        # Skip the prompt (plus the decoded BOS token) to keep only the new reply.
        ai_reply = self.tokenizer.decode(
            output_ids[0],
            skip_special_tokens=False)[len(input_ai) + 4:]

        if not config_question:
            self.history.append(self.history_formatting(user_input, ai_reply))
        return ai_reply.strip()

    def reset_conversation(self):
        self.history = []


class UserInterface:
    def __init__(self, chatbot):
        self.chatbot = chatbot

    def run(self):
        persona_id = self.chatbot.config["default_persona"]
        self.chatbot.load_persona(persona_id)
        print("\nChosen Persona:", self.chatbot.persona["name"])
        print("Your Chosen Name:", self.chatbot.config["user_name"])
        print(f'\n{self.chatbot.persona["name"]}: {self.chatbot.persona["greeting"]}')
        self.chatbot.history.append(f'{self.chatbot.persona["name"]}: {self.chatbot.persona["greeting"]}')

        while True:
            user_input = input(f"\n>> {self.chatbot.config['user_name']}: ")

            if user_input.lower() == "reset_app":
                self.chatbot.reset_conversation()
                print("\nConversation history has been reset.\n")
                self.chatbot.history.append(f'{self.chatbot.persona["name"]}: {self.chatbot.persona["greeting"]}')
                print(f'{self.chatbot.persona["name"]}: {self.chatbot.persona["greeting"]}')
                continue

            if user_input.lower().startswith("create_persona"):
                # Example of use: create_persona
                # {"name": "CustomPersona",
                #  "description": "This is a custom persona created by the user.",
                #  "greeting": "Hello! I am CustomPersona, nice to meet you!"}
                try:
                    persona_data = json.loads(' '.join(user_input.split()[1:]))
                    new_persona_id = self.chatbot.create_persona(persona_data)
                    print(f"Persona created with ID: {new_persona_id}")
                except json.JSONDecodeError:
                    print("Invalid JSON input. Please provide a valid JSON string containing 'name', 'description', and 'greeting'.")
                except ValueError as e:
                    print(e)
                continue

            # Command to change the persona.
            if user_input.lower().startswith("change_persona"):
                try:
                    new_persona_id = user_input.split()[1]
                    self.chatbot.load_persona(new_persona_id)
                    self.chatbot.reset_conversation()
                    print("\nPersona changed to:", self.chatbot.persona["name"])
                    print(f'\n{self.chatbot.persona["name"]}: {self.chatbot.persona["greeting"]}')
                    self.chatbot.history.append(f'{self.chatbot.persona["name"]}: {self.chatbot.persona["greeting"]}')
                    continue
                except (IndexError, ValueError):
                    print("Invalid command or persona ID. Please use 'change_persona [ID]'.")
                    continue

            if user_input.lower() == "exit_app":
                print("Goodbye!")
                break

            reply = self.chatbot.reply(user_input)

            def typewriter_effect(sentence, type_delay):
                for char in sentence:
                    sys.stdout.write(char)
                    sys.stdout.flush()
                    time.sleep(type_delay)

            # Type faster for longer replies.
            reply_length = len(reply)
            type_delay_ranges = {
                (100, 200): 0.03,
                (200, 300): 0.02,
                (300, 400): 0.01,
                (400, 500): 0.005
            }
            default_type_delay = 0.04
            for length_range, delay in type_delay_ranges.items():
                if length_range[0] < reply_length <= length_range[1]:
                    type_delay = delay
                    break
            else:
                type_delay = default_type_delay

            if self.chatbot.config['use_typing_effect']:
                typewriter_effect(f'{self.chatbot.persona["name"]}: {reply}', type_delay)
            else:
                print(f'{self.chatbot.persona["name"]}: {reply}')


def main():
    config = {
        "user_name": "Jack",  # The user's name, which is set to "Jack" in this case.
        "model_path": "L-R/LLmRa-2.7B",  # Path to the model used for generating responses.
        "tokenizer_path": "L-R/LLmRa-2.7B",  # Path to the tokenizer associated with the model.
        "model_token": None,  # Hugging Face token for loading the model. (Not required, but included.)
        "load_model_4bit": True,  # Whether to load the model with 4-bit precision.
        "load_model_8bit": False,  # Whether to load the model with 8-bit precision.
        "use_typing_effect": True,  # Whether to simulate a typing effect when displaying responses.
        "use_names": {
            "model": False,  # Whether the model's name should be used in question formatting.
            "user": False,  # Whether the user's name should be used in question formatting.
        },
        "use_question_template": False,  # Whether to use the instruct-style question template.
        "personas": {
            # A dictionary of personas, with descriptions and greetings, for use in conversations.
            "1": {
                "name": "LLmRa",
                "description": "Description of the LLmRa persona. It provides background and characteristics of the persona.",
                "greeting": "The greeting message when the LLmRa persona is active in a conversation."
            },
            "2": {
                "name": "Hikari",
                "description": "Description of the Hikari persona. It provides background and characteristics of the persona.",
                "greeting": "The greeting message when the Hikari persona is active in a conversation."
            }
        },
        "max_generation_length": 450,  # The maximum number of new tokens in generated responses.
        "default_persona": "1",  # The default persona to use when starting a conversation.
        "history_length": 6,  # The maximum number of previous messages kept in the conversation history.
        "top_k": 40,  # Top-k sampling parameter for text generation.
        "top_p": .55,  # Top-p sampling parameter for text generation.
        "temperature": .55,  # Temperature parameter for controlling the randomness of generated text.
        "length_penalty": 0.65,  # Penalty factor for generating longer or shorter responses.
        "no_repeat_ngram_size": 4,  # Parameter to avoid repeating n-grams in generated text.
        "repetition_penalty": 1.25,  # Penalty factor for avoiding repeated phrases in generated text.
    }

    # Initialize chatbot and user interface
    chatbot = Chatbot(config)
    ui = UserInterface(chatbot)

    # Run the user interface
    ui.run()


if __name__ == "__main__":
    main()
```
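Running the script above presumably requires, besides `transformers` and `torch`, the `accelerate` and `bitsandbytes` packages for the 4-bit/8-bit loading path (exact versions may vary):
```
pip install torch transformers accelerate bitsandbytes
```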
## Known issues
The model sometimes fails to follow instructions.
# [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)
Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_L-R__LLmRa-2.7B).
| Metric | Value |
|-----------------------|---------------------------|
| Avg. | 32.16 |
| ARC (25-shot) | 37.03 |
| HellaSwag (10-shot) | 60.65 |
| MMLU (5-shot) | 25.58 |
| TruthfulQA (0-shot) | 35.23 |
| Winogrande (5-shot) | 61.56 |
| GSM8K (5-shot) | 0.3 |
| DROP (3-shot) | 4.76 |