OpenLlama-Stable-7B
This is a merge of pre-trained language models created using LazyMergekit, combining the foundational capabilities of OpenLM Research's OpenLLaMA with Stability AI's StableBeluga via spherical linear interpolation (SLERP).
About Me
I'm David Soeiro-Vuong, a third-year Computer Science student working as an apprentice at TW3 Partners, a company specializing in Generative AI. Passionate about artificial intelligence and language model optimization, I focus on creating efficient model merges that balance performance and capabilities.
Merge Details
Merge Method
This model uses SLERP (Spherical Linear Interpolation) with per-layer interpolation factors tuned to balance the strengths of both parents (a simplified sketch of the interpolation follows this list):
- Attention Layers: 0.7 interpolation value favoring StableBeluga's strong instruction-following capabilities
- MLP Layers: 0.5 interpolation value creating an equal blend for balanced reasoning
- Other Parameters: 0.6 interpolation value slightly favoring StableBeluga's refinements
- Format: bfloat16 precision for efficient memory usage
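For intuition, here is a minimal sketch of what SLERP does to a single pair of weight tensors and how the per-filter factors above map onto it. The helper names (slerp, t_for) are illustrative only; this is not mergekit's actual implementation:

import torch

def slerp(w0: torch.Tensor, w1: torch.Tensor, t: float, eps: float = 1e-8) -> torch.Tensor:
    """Spherical linear interpolation: t=0 returns w0 (OpenLLaMA), t=1 returns w1 (StableBeluga)."""
    v0, v1 = w0.flatten().float(), w1.flatten().float()
    # Angle between the two flattened weight vectors
    cos_omega = torch.clamp(torch.dot(v0, v1) / (v0.norm() * v1.norm() + eps), -1.0, 1.0)
    omega = torch.arccos(cos_omega)
    if omega.abs() < eps:
        # Nearly colinear tensors: fall back to plain linear interpolation
        return (1.0 - t) * w0 + t * w1
    sin_omega = torch.sin(omega)
    scale0 = torch.sin((1.0 - t) * omega) / sin_omega
    scale1 = torch.sin(t * omega) / sin_omega
    return (scale0 * w0.float() + scale1 * w1.float()).to(w0.dtype)

def t_for(param_name: str) -> float:
    """Interpolation factor per tensor, mirroring the configuration below."""
    if "self_attn" in param_name:
        return 0.7  # attention layers favor StableBeluga
    if "mlp" in param_name:
        return 0.5  # MLP layers are an equal blend
    return 0.6      # everything else slightly favors StableBeluga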
Models Merged
- openlm-research/open_llama_7b - An open-source reproduction of Meta's LLaMA that offers strong base capabilities
- stabilityai/StableBeluga-7B - StabilityAI's instruction-tuned variant offering improved instruction following and coherence
Configuration
slices:
  - sources:
      - model: openlm-research/open_llama_7b
        layer_range: [0, 32]
      - model: stabilityai/StableBeluga-7B
        layer_range: [0, 32]
merge_method: slerp
base_model: openlm-research/open_llama_7b
parameters:
  t:
    # Attention layers: favor StableBeluga (0.7)
    - filter: self_attn
      value: 0.7
    # MLP layers: balanced
    - filter: mlp
      value: 0.5
    # Everything else
    - value: 0.6
dtype: bfloat16
Model Capabilities
This merge combines:
- Open Llama's strong foundational knowledge and reasoning
- StableBeluga's improved instruction following and coherence
- Fully open architecture with no usage restrictions
The resulting model provides enhanced performance on tasks requiring both strong reasoning and good instruction following, such as:
- Detailed explanations of complex concepts
- Creative writing with coherent structure
- Problem-solving with step-by-step reasoning
- Balanced factual responses with nuanced perspectives
Usage
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

model_id = "david-sv/OpenLlama-Stable-7B"  # Replace with your actual HF username

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,
    device_map="auto"
)

# For chat completions
prompt = """<human>: Explain the concept of spherical linear interpolation (SLERP) and why it's useful for merging language models.
<assistant>:"""

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(
    inputs["input_ids"],
    max_new_tokens=512,
    do_sample=True,  # required for temperature/top_p sampling to take effect
    temperature=0.7,
    top_p=0.9,
    repetition_penalty=1.1
)
print(tokenizer.decode(output[0], skip_special_tokens=True))
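Alternatively, the same generation can be run through the transformers text-generation pipeline. This is a minimal sketch using the same model_id as above; adjust the prompt and sampling parameters as needed:

from transformers import pipeline
import torch

generator = pipeline(
    "text-generation",
    model="david-sv/OpenLlama-Stable-7B",  # replace with the actual repository name
    torch_dtype=torch.float16,
    device_map="auto",
)

result = generator(
    "<human>: Summarize the benefits of SLERP merging in two sentences.\n<assistant>:",
    max_new_tokens=128,
    do_sample=True,
    temperature=0.7,
    top_p=0.9,
    repetition_penalty=1.1,
)
print(result[0]["generated_text"])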
Limitations
- Inherits limitations from both base models
- May exhibit inconsistent behavior for certain complex reasoning tasks
- No additional alignment or fine-tuning beyond the base models' training
- Model was created through parameter merging without additional training data
License
This model is released under the Apache 2.0 license, consistent with the underlying models' licenses.