Model Card for TrelisSmolLM-instruct
This model is a fine-tuned version of TrelisSmolLM-base, optimized for instruction following and conversational tasks using the WebInstructSub dataset.
To purchase the training scripts used for this model, visit: https://trelis.com/advanced-fine-tuning-scripts/
Model Details
Model Description
TrelisLM-80M-SFT is an 80 million parameter language model derived from SmolLM-360M through pruning and distillation, and then fine-tuned on the WebInstructSub dataset for improved instruction following capabilities.
- Developed by: Trelis AI
- Model type: Causal Language Model
- Language(s): English
- License: [More Information Needed]
- Finetuned from model: Trelis/80M-0.0090-cosmopedia
Model Sources
Uses
Direct Use
This model is designed for instruction following and conversational tasks. It can be used for:
- Generating responses to user prompts or questions
- Engaging in task-oriented dialogues
- Assisting with general language understanding and generation tasks
Out-of-Scope Use
This model should not be used for:
- Production systems without thorough testing and evaluation
- Tasks requiring domain-specific expertise without additional fine-tuning
- Any applications where errors could lead to harmful consequences
Training Details
Training Data
The model was fine-tuned on the TIGER-Lab/WebInstructSub dataset, which consists of instruction-response pairs. The training process used:
- 50,000 initial rows for the main training phase
- 10,000 additional rows for an annealing phase
- 10,000 randomly selected rows for evaluation
Training Procedure
- Preprocessing: The dataset was formatted into a conversational structure with user and assistant messages.
- Training type: Supervised Fine-Tuning (SFT)
- Training regime: BFloat16 mixed precision
Training Hyperparameters
- Batch size: 8
- Gradient Accumulation steps: 4
- Learning rate: 1e-3
- Number of epochs: 1
- Max sequence length: 2048
- Warmup steps: 20
The training used a custom learning rate scheduler with an initial constant phase followed by cosine annealing.
Software and Hardware
- Software: Transformers, TRL (Transformer Reinforcement Learning), Accelerate
- Hardware: [More Information Needed]
Evaluation
Evaluation was performed on a randomly selected subset of 10,000 rows from the WebInstructSub dataset.
Metrics
[More Information Needed]
Limitations and Bias
As this model is fine-tuned on the WebInstructSub dataset, it may inherit biases present in that dataset. Additionally, as a smaller language model, it may have limitations in handling complex or highly specialized tasks compared to larger models.
Recommendations
- Thoroughly test the model's outputs before using it in any sensitive applications.
- Be aware that the model's knowledge is limited to its training data and it may produce incorrect or biased information.
- For critical applications, consider using this model in conjunction with other sources of information or larger, more comprehensive models.
How to Get Started with the Model
You can use this model with the Transformers library:
from transformers import AutoModelForCausalLM, AutoTokenizer
model = AutoModelForCausalLM.from_pretrained("Trelis/80M-2percent-corpus-SFT")
tokenizer = AutoTokenizer.from_pretrained("Trelis/80M-2percent-corpus-SFT")
# Example usage
input_text = "What is the capital of France?"
input_ids = tokenizer.encode(input_text, return_tensors="pt")
output = model.generate(input_ids, max_length=50)
response = tokenizer.decode(output[0], skip_special_tokens=True)
print(response)
- Downloads last month
- 8
Model tree for Trelis/TrelisSmolLM-instruct
Base model
HuggingFaceTB/SmolLM-360M