|
--- |
|
license: other |
|
language: |
|
- en |
|
pipeline_tag: conversational |
|
inference: false |
|
tags: |
|
- AI |
|
- ConversationalAI |
|
--- |
|
|
|
<h1 style="text-align: center">LLmRa-2.7B</h1> |
|
<h2 style="text-align: center">A conversational Open Pre-trained Transformer Language Model fine-tune.</h2> |
|
|
|
**LLmRa 2.7B** is a proof-of-concept fine-tune of [facebook/opt-2.7b](https://huggingface.co/facebook/opt-2.7b), optimized for dialogue.
|
|
|
**Disclaimer:** NSFW data was included in the fine-tuning of this model. Although SFW inputs will usually result in SFW outputs, you are advised to **chat at your own risk. This model is not suitable for use by minors.** |
|
|
|
**Warning:** This model is **NOT** suitable for use by minors. **It will output X-rated content under certain circumstances.** |
|
|
|
**This model was fine-tuned on a small test dataset; version 2, or a higher-parameter model, will be trained on the full dataset.**
|
|
|
--- |
|
|
|
## Usage Format |
|
|
|
To effectively utilize the model, follow this structured format for engaging text-based conversations: |
|
|
|
**1. Initialization** |
|
|
|
Here is how you can define the personality of the language model: |
|
|
|
``` |
|
<|system|>[Persona] |
|
``` |
|
|
|
- **Persona**: You can define a specific persona or context for the AI, but it's optional. It can be a character, a role, or just a style of interaction. |
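
For example, a filled-in persona line might look like this (the description here is purely illustrative):

```
<|system|>Hikari is a cheerful assistant who answers questions about astronomy in a casual, friendly tone.
```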
|
|
|
**2. AI Introduction** |
|
|
|
``` |
|
<|user|>[User input]<|model|> |
|
``` |
|
- Users can start the conversation by entering their message after the `<|user|>` tag and closing it with `<|model|>`.
|
|
|
--- |
|
|
|
### Example Usage: |
|
|
|
Here's an example of how to start a conversation with the AI: |
|
|
|
``` |
|
<|system|>I'm here to provide information and assistance on a wide range of topics. |
|
<|model|>Hello! Welcome to our AI-powered assistant. How can I assist you today? |
|
<|user|>Tell me about the history of artificial intelligence. |
|
<|model|> |
|
``` |
|
|
|
Continue the conversation as needed. This structured format helps maintain a smooth and engaging interaction with the AI. |
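
For multi-turn conversations, append each completed exchange to the prompt before the next `<|user|>` turn. Continuing the example above (the bracketed reply is a placeholder for whatever the model generated, and the follow-up question is illustrative):

```
<|system|>I'm here to provide information and assistance on a wide range of topics.
<|model|>Hello! Welcome to our AI-powered assistant. How can I assist you today?
<|user|>Tell me about the history of artificial intelligence.
<|model|>[Previous model reply]
<|user|>Which of those milestones came first?<|model|>
```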
|
|
|
You are not required to include `User`; you can change it to your preferred name or leave it blank. You may also add the AI's name, for example:
|
|
|
``` |
|
<|user|>YourNameHere: Hello.<|model|>CharacterName: |
|
``` |
|
|
|
You can also use this instruct prompt example: |
|
|
|
``` |
|
<|system|>What is one plus one?<|model|> |
|
``` |
|
|
|
## Loading The Model |
|
|
|
To use the model and interact with it, use the Python code below: |
|
|
|
```Python |
|
from transformers import (AutoModelForCausalLM, |
|
AutoTokenizer, |
|
pipeline, |
|
) |
|
|
|
model = AutoModelForCausalLM.from_pretrained('L-R/LLmRa-2.7B') |
|
tokenizer = AutoTokenizer.from_pretrained('L-R/LLmRa-2.7B') |
|
|
|
pipe = pipeline(task="text-generation", model=model, tokenizer=tokenizer, max_length=100) |
|
|
|
input_question = 'QUESTION HERE' |
|
|
|
question_formatted = f'<|system|>{input_question}<|model|>' |
|
|
|
result = pipe(question_formatted) |
|
|
|
print(f"[model]: {result[0]['generated_text'][len(question_formatted):]}") |
|
``` |
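
If you want finer control over sampling than `pipeline` offers, here is a minimal sketch that calls `generate` directly. The sampling values mirror the chatbot configuration further below; the 128-token budget is an arbitrary choice:

```Python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained('L-R/LLmRa-2.7B')
tokenizer = AutoTokenizer.from_pretrained('L-R/LLmRa-2.7B')

prompt = '<|system|>What is one plus one?<|model|>'
input_ids = tokenizer.encode(prompt, return_tensors='pt')

# Sample a completion; max_length counts the prompt tokens as well,
# so this allows up to 128 newly generated tokens.
output_ids = model.generate(
    input_ids,
    max_length=len(input_ids[0]) + 128,
    do_sample=True,
    temperature=0.55,
    top_k=40,
    top_p=0.55,
    repetition_penalty=1.25,
    pad_token_id=tokenizer.eos_token_id,
)

# Decode only the newly generated tokens.
print(f"[model]: {tokenizer.decode(output_ids[0][len(input_ids[0]):], skip_special_tokens=True)}")
```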
|
|
|
Or use the more complex, interactive chatbot script below:
|
|
|
```Python |
|
import os |
|
import random |
|
import sys |
|
import time |
|
import json |
|
import torch |
|
|
|
from transformers import (AutoTokenizer, |
|
AutoModelForCausalLM, |
|
BitsAndBytesConfig, |
|
set_seed) |
|
|
|
local_rank = int(os.getenv('LOCAL_RANK', '0')) |
|
world_size = int(os.getenv('WORLD_SIZE', '1')) |
|
local_tokenizer = os.getenv('TOKENIZERS_PARALLELISM', 'false').lower() == 'true'
|
|
|
|
|
class Chatbot: |
|
def __init__(self, config): |
|
|
|
self.device = torch.device("cuda" if torch.cuda.is_available() else "cpu") |
|
self.tokenizer = None |
|
self.config = config |
|
self.persona = None |
|
self.model = None |
|
self.history = [] |
|
|
|
self.load_model() |
|
|
|
def create_persona(self, persona_data): |
|
required_keys = ['name', 'description', 'greeting'] |
|
if not all(key in persona_data for key in required_keys): |
|
raise ValueError( |
|
"Missing required keys in persona_data. Please provide 'name', 'description', and 'greeting'.") |
|
|
|
new_persona_id = str(max(int(key) for key in self.config["personas"].keys()) + 1) |
|
|
|
self.config["personas"][new_persona_id] = persona_data |
|
return new_persona_id |
|
|
|
def load_model(self): |
|
model_path = self.config["model_path"] |
|
tokenizer_path = self.config["tokenizer_path"] |
|
|
|
        # Note: the bnb_4bit_* options only take effect when load_in_4bit=True,
        # and 8-bit loading takes no extra bnb_* arguments, so the config can be
        # built unconditionally.
        quantization_config = BitsAndBytesConfig(
            load_in_4bit=self.config['load_model_4bit'],
            bnb_4bit_quant_type='nf4',
            bnb_4bit_compute_dtype=torch.float16,
            bnb_4bit_use_double_quant=True,
            load_in_8bit=self.config['load_model_8bit'],
        )
|
|
|
if not model_path or not tokenizer_path: |
|
            raise ValueError('model_path or tokenizer_path not set! Define both.')
|
|
|
if self.config['load_model_4bit'] and self.config['load_model_8bit']: |
|
raise ValueError("You can't load the model in 8 bits and 4 bits at the same time!") |
|
|
|
if not self.config['user_name']: |
|
            print('You have not selected a name! No name will be sent to the model.')
|
|
|
print(f"\nLoading model: {model_path}") |
|
|
|
if torch.cuda.is_available(): |
|
|
|
self.model = AutoModelForCausalLM.from_pretrained( |
|
model_path, |
|
|
|
use_auth_token=self.config['model_token'], |
|
quantization_config=quantization_config,) |
|
|
|
            # Note: a bitsandbytes-quantized model is already placed on GPU by
            # from_pretrained, and torch.nn.DataParallel would hide .generate(),
            # so multi-GPU setups are only reported here, not re-wrapped.
            if torch.cuda.device_count() > 1:
                model_running_on = f'{torch.cuda.device_count()} GPUs'
            else:
                model_running_on = '1 GPU'
|
else: |
|
            # bitsandbytes quantization requires CUDA, so load in full
            # precision when falling back to the CPU.
            self.model = AutoModelForCausalLM.from_pretrained(
                model_path,
                use_auth_token=self.config['model_token']).to(self.device)
            model_running_on = 'CPU'
|
|
|
print(f'Model is running on: {model_running_on}') |
|
|
|
self.tokenizer = AutoTokenizer.from_pretrained(tokenizer_path, use_auth_token=self.config['model_token']) |
|
|
|
|
|
|
def load_persona(self, persona_id): |
|
personas = self.config["personas"] |
|
if persona_id in personas: |
|
self.persona = personas[persona_id] |
|
else: |
|
raise ValueError("Invalid persona ID") |
|
|
|
|
|
def formatting_question(self, user_input, history): |
|
|
|
config_user = self.config['use_names']['user'] |
|
config_model = self.config['use_names']['model'] |
|
config_question = self.config['use_question_template'] |
|
|
|
if config_question: |
|
formatted_answer = ( |
|
f'<|system|>{user_input}<|model|>' |
|
) |
|
else: |
|
m_ = self.persona["description"] |
|
g_ = self.persona["greeting"] |
|
n_ = self.persona["name"] |
|
un_ = self.config["user_name"] |
|
|
|
if config_user and config_model: |
|
formatted_answer = ( |
|
f'<|system|>{m_}<|model|>{n_}: {g_}{history}<|user|>{un_}: {user_input}<|model|>{n_}:' |
|
) |
|
elif config_user: |
|
formatted_answer = ( |
|
f'<|system|>{m_}<|model|>{g_}{history}<|user|>{un_}: {user_input}<|model|>' |
|
) |
|
elif config_model: |
|
formatted_answer = ( |
|
f'<|system|>{m_}<|model|>{n_}: {g_}{history}<|user|>{user_input}<|model|>{n_}:' |
|
) |
|
else: |
|
formatted_answer = ( |
|
f'<|system|>{m_}<|model|>{g_}{history}<|user|>{user_input}<|model|>' |
|
) |
|
|
|
return formatted_answer |
|
|
|
def history_formatting(self, last_input, last_output): |
|
|
|
config_user = self.config['use_names']['user'] |
|
config_model = self.config['use_names']['model'] |
|
|
|
n_ = self.persona["name"] |
|
un_ = self.config["user_name"] |
|
|
|
if config_user and config_model: |
|
formatted_answer = ( |
|
f'<|user|>{un_}: {last_input}<|model|>{n_}: {last_output}' |
|
) |
|
elif config_user: |
|
formatted_answer = ( |
|
f'<|user|>{un_}: {last_input}<|model|>{last_output}' |
|
) |
|
elif config_model: |
|
formatted_answer = ( |
|
f'<|user|>{last_input}<|model|>{n_}: {last_output}' |
|
) |
|
else: |
|
formatted_answer = ( |
|
f'<|user|>{last_input}<|model|>{last_output}' |
|
) |
|
|
|
return formatted_answer |
|
|
|
def reply(self, user_input): |
|
|
|
config_question = self.config['use_question_template'] |
|
set_seed(random.randint(1, 1000)) |
|
user_input = " ".join(user_input.split()) |
|
|
|
if len(self.history) > self.config["history_length"]: |
|
model_history = "\n".join([str(item) for item in self.history[-self.config["history_length"]:]]) |
|
else: |
|
model_history = "\n".join([str(item) for item in self.history]) |
|
|
|
input_ai = self.formatting_question(user_input, model_history).strip() |
|
tokenized_input_ai = self.tokenizer.encode(input_ai, return_tensors="pt") |
|
|
|
output_ids = self.model.generate( |
|
max_length=self.config["max_generation_length"] + len(tokenized_input_ai[0]), |
|
no_repeat_ngram_size=self.config["no_repeat_ngram_size"], |
|
repetition_penalty=self.config["repetition_penalty"], |
|
length_penalty=self.config["length_penalty"], |
|
input_ids=tokenized_input_ai.to(self.device), |
|
pad_token_id=self.tokenizer.eos_token_id, |
|
temperature=self.config["temperature"], |
|
top_k=self.config["top_k"], |
|
top_p=self.config["top_p"], |
|
early_stopping=True, |
|
use_cache=True, |
|
do_sample=True, |
|
) |
|
|
|
        # Skip the echoed prompt plus the leading '</s>' (4 characters) that the
        # OPT tokenizer prepends when decoding with special tokens kept.
        ai_reply = self.tokenizer.decode(
            output_ids[0],
            skip_special_tokens=False)[len(input_ai) + 4:]
|
|
|
if not config_question: |
|
self.history.append(self.history_formatting(user_input, ai_reply)) |
|
|
|
return ai_reply.strip() |
|
|
|
def reset_conversation(self): |
|
|
|
self.history = [] |
|
|
|
class UserInterface: |
|
def __init__(self, chatbot): |
|
self.chatbot = chatbot |
|
|
|
def run(self): |
|
|
|
persona_id = self.chatbot.config["default_persona"] |
|
self.chatbot.load_persona(persona_id) |
|
|
|
print("\nChosen Persona:", self.chatbot.persona["name"]) |
|
print("Your Chosen Name:", self.chatbot.config["user_name"]) |
|
|
|
print(f'\n{self.chatbot.persona["name"]}: {self.chatbot.persona["greeting"]}') |
|
self.chatbot.history.append(f'{self.chatbot.persona["name"]}: {self.chatbot.persona["greeting"]}') |
|
|
|
while True: |
|
user_input = input(f"\n>> {self.chatbot.config['user_name']}: ") |
|
if user_input.lower() == "reset_app" or user_input == "reset_app": |
|
self.chatbot.reset_conversation() |
|
print("\nConversation history has been reset.\n") |
|
self.chatbot.history.append(f'{self.chatbot.persona["name"]}: {self.chatbot.persona["greeting"]}') |
|
print(f'{self.chatbot.persona["name"]}: {self.chatbot.persona["greeting"]}') |
|
continue |
|
|
|
if user_input.lower().startswith("create_persona"): |
|
|
|
# Example of use: create_persona |
|
|
|
# {"name": "CustomPersona", |
|
# "description": "This is a custom persona created by the user.", |
|
# "greeting": "Hello! I am CustomPersona, nice to meet you!"} |
|
|
|
                try:
                    persona_data = json.loads(' '.join(user_input.split()[1:]))
                    new_persona_id = self.chatbot.create_persona(persona_data)
                    print(f"Persona created with ID: {new_persona_id}")
                except json.JSONDecodeError:
                    print("Invalid JSON input. Please provide a valid JSON string containing 'name', 'description', and 'greeting'.")
                except ValueError as e:
                    print(e)
                continue  # Skip sending the create_persona command itself to the model.
|
|
|
# Add a command to change the persona |
|
if user_input.lower().startswith("change_persona"): |
|
try: |
|
new_persona_id = user_input.split()[1] |
|
self.chatbot.load_persona(new_persona_id) |
|
self.chatbot.reset_conversation() |
|
print("\nPersona changed to:", self.chatbot.persona["name"]) |
|
print(f'\n{self.chatbot.persona["name"]}: {self.chatbot.persona["greeting"]}') |
|
self.chatbot.history.append(f'{self.chatbot.persona["name"]}: {self.chatbot.persona["greeting"]}') |
|
continue |
|
except (IndexError, ValueError): |
|
print("Invalid command or persona ID. Please use 'change_persona [ID]'.") |
|
continue |
|
|
|
if user_input.lower() == "exit_app" or user_input == "exit_app": |
|
print("Goodbye!") |
|
break |
|
|
|
reply = self.chatbot.reply(user_input) |
|
|
|
def typewriter_effect(sentence, type_delay): |
|
|
|
for char in sentence: |
|
sys.stdout.write(char) |
|
sys.stdout.flush() |
|
time.sleep(type_delay) |
|
|
|
reply_length = len(reply) |
|
type_delay_ranges = { |
|
(100, 200): 0.03, |
|
(200, 300): 0.02, |
|
(300, 400): 0.01, |
|
(400, 500): 0.005 |
|
} |
|
|
|
default_type_delay = 0.04 |
|
|
|
for length_range, delay in type_delay_ranges.items(): |
|
if length_range[0] < reply_length <= length_range[1]: |
|
type_delay = delay |
|
break |
|
else: |
|
type_delay = default_type_delay |
|
|
|
if self.chatbot.config['use_typing_effect']: |
|
typewriter_effect(f'{self.chatbot.persona["name"]}: {reply}', type_delay) |
|
else: |
|
print(f'{self.chatbot.persona["name"]}: {reply}') |
|
|
|
def main(): |
|
|
|
config = { |
|
"user_name": "Jack", # The user's name, which is set to "Jack" in this case. |
|
|
|
"model_path": "L-R/LLmRa-2.7B", # Path to the model used for generating responses. |
|
"tokenizer_path": "L-R/LLmRa-2.7B", # Path to the tokenizer associated with the model. |
|
"model_token": None, # If you want to load the model using your huggingface token. (Not required, but included) |
|
|
|
"load_model_4bit": True, # Whether to load the model with 4-bit precision. |
|
"load_model_8bit": False, # Whether to load the model with 8-bit precision. |
|
|
|
"use_typing_effect": True, # Whether to simulate a typing effect when displaying responses. |
|
|
|
"use_names": { |
|
"model": False, # Whether the model's name should be used in question formatting. |
|
"user": False, # Whether the user's name should be used in question formatting. |
|
}, |
|
|
|
"use_question_template": False, # Whether to use predefined question templates in conversations. |
|
|
|
"personas": { |
|
# A dictionary of personas with their descriptions and greetings for use in conversations. |
|
"1": { |
|
"name": "LLmRa", |
|
"description": "Description of the LLmRa persona. It provides background and characteristics of the persona.", |
|
"greeting": "The greeting message when the LLmRa persona is active in a conversation." |
|
}, |
|
"2": { |
|
"name": "Hikari", |
|
"description": "Description of the Hikari persona. It provides background and characteristics of the persona.", |
|
"greeting": "The greeting message when the Hikari persona is active in a conversation." |
|
} |
|
}, |
|
|
|
"max_generation_length": 450, # The maximum length for generated responses. |
|
|
|
"default_persona": "1", # The default persona to use when starting a conversation. |
|
|
|
"history_length": 6, # The maximum number of previous messages to consider in the conversation history. |
|
|
|
"top_k": 40, # Top-k sampling parameter for text generation. |
|
"top_p": .55, # Top-p sampling parameter for text generation. |
|
"temperature": .55, # Temperature parameter for controlling the randomness of generated text. |
|
"length_penalty": 0.65, # Penalty factor for generating longer or shorter responses. |
|
"no_repeat_ngram_size": 4, # Parameter to avoid repeating n-grams in generated text. |
|
"repetition_penalty": 1.25, # Penalty factor for avoiding repeated phrases in generated text. |
|
} |
|
|
|
# Initialize chatbot and user interface |
|
chatbot = Chatbot(config) |
|
ui = UserInterface(chatbot) |
|
|
|
# Run the user interface |
|
ui.run() |
|
|
|
|
|
if __name__ == "__main__": |
|
main() |
|
``` |
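
The script above runs as an interactive loop, but the `Chatbot` class can also be driven programmatically. A minimal sketch, assuming the `config` dictionary from `main()` is in scope (the persona JSON is the same illustrative example used in the `create_persona` comment):

```Python
chatbot = Chatbot(config)

# Register and activate a custom persona, then ask a single question.
new_id = chatbot.create_persona({
    "name": "CustomPersona",
    "description": "This is a custom persona created by the user.",
    "greeting": "Hello! I am CustomPersona, nice to meet you!",
})
chatbot.load_persona(new_id)

print(chatbot.reply("Hi! Who are you?"))
```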
|
|
|
## Known issues |
|
|
|
The model does not always follow instructions.
|
# [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard) |
|
Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_L-R__LLmRa-2.7B) |
|
|
|
| Metric | Value | |
|
|-----------------------|---------------------------| |
|
| Avg. | 32.16 | |
|
| ARC (25-shot) | 37.03 | |
|
| HellaSwag (10-shot) | 60.65 | |
|
| MMLU (5-shot) | 25.58 | |
|
| TruthfulQA (0-shot) | 35.23 | |
|
| Winogrande (5-shot) | 61.56 | |
|
| GSM8K (5-shot) | 0.3 | |
|
| DROP (3-shot) | 4.76 | |
|
|