Model Overview
- Model Name: Le-Empereur_70-Base
Model Description:
The pruned model was fine-tuned on the FineTome-100K dataset to partially restore convergence.
Inference Script:
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer


def generate_response(model_name, input_text, max_new_tokens=50):
    # Load the tokenizer and model from the Hugging Face Hub
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForCausalLM.from_pretrained(model_name)
    model.eval()

    # Tokenize the input text (returns input_ids and attention_mask)
    inputs = tokenizer(input_text, return_tensors="pt")

    # Generate a response without tracking gradients
    with torch.no_grad():
        generated_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)

    # Decode the generated tokens (prompt included) into text
    return tokenizer.decode(generated_ids[0], skip_special_tokens=True)


if __name__ == "__main__":
    # Set the model name from the Hugging Face Hub
    model_name = "AINovice2005/Le-Empereur_70-Base"
    input_text = "Hello, how are you?"

    # Generate and print the model's response
    output = generate_response(model_name, input_text)
    print(f"Input: {input_text}")
    print(f"Output: {output}")
Results:
First, training the model on a dataset required a higher learning rate, and training methods such as SFT and ORPO with PEFT failed to restore convergence. Second, training using only PEFT helped restore partial model convergence.
Finally, future experiments to restore model convergence will require a systematic training and evaluation strategy.
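One possible starting point for such a strategy is to record the held-out loss of every run and compare runs on the same footing. The sketch below is illustrative only and is not the procedure used for this model: `RunResult`, `has_converged`, `best_run`, the plateau window, and the tolerance are all hypothetical names and values, and "converged" is assumed here to mean that the evaluation loss has stopped changing over the last few checkpoints.

```python
import math
from dataclasses import dataclass, field


@dataclass
class RunResult:
    """Record of one fine-tuning run (hypothetical structure)."""
    method: str                 # label for the run, e.g. "peft" or "orpo+peft"
    learning_rate: float
    # Mean eval cross-entropy (nats per token), one entry per checkpoint
    eval_losses: list = field(default_factory=list)


def perplexity(mean_loss: float) -> float:
    """Convert a mean cross-entropy loss in nats per token to perplexity."""
    return math.exp(mean_loss)


def has_converged(losses, window=3, tolerance=0.01):
    """Assumed criterion: the last `window` eval losses are finite and
    differ from each other by at most `tolerance`."""
    if len(losses) < window:
        return False
    recent = losses[-window:]
    if any(not math.isfinite(x) for x in recent):
        return False
    return max(recent) - min(recent) <= tolerance


def best_run(runs):
    """Return the converged run with the lowest final eval loss, or None."""
    converged = [r for r in runs if has_converged(r.eval_losses)]
    if not converged:
        return None
    return min(converged, key=lambda r: r.eval_losses[-1])
```

Each training configuration (method and learning rate) would append its evaluation loss at every checkpoint, and `best_run` would then pick among the runs that plateaued. A plateau alone does not mean the model recovered its quality, so the final loss or perplexity should still be compared against the unpruned model.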
Model tree for AINovice2005/LeEmpereur_70-Base
- Base model: mistralai/Mistral-7B-v0.1
- Finetuned: alignment-handbook/zephyr-7b-sft-full
- Finetuned: argilla/notus-7b-v1
- Finetuned: AINovice2005/LeEmpereur_70