Tags: Text Generation · PEFT · GGUF · English · conversational

Xenith-3B

Xenith-3B is a language model fine-tuned from microsoft/Phi-3-mini-4k-instruct. It was trained on the AlignmentLab-AI/alpaca-cot-collection dataset, which focuses on chain-of-thought reasoning and instruction following.

Model Overview

  • Model Name: Xenith-3B
  • Base Model: microsoft/Phi-3-mini-4k-instruct
  • Fine-Tuned On: AlignmentLab-AI/alpaca-cot-collection
  • Model Size: ~3.8 billion parameters (3.82B in the GGUF metadata)
  • Architecture: Transformer-based LLM

Training Details

  • Objective: Fine-tune the base model to enhance its performance on tasks requiring complex reasoning and multi-step problem-solving.
  • Training Duration: 10 epochs
  • Batch Size: 8
  • Learning Rate: 3e-5
  • Optimizer: AdamW
  • Hardware Used: 2x NVIDIA L4 GPUs
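As a rough illustration of the optimizer named above, here is a minimal pure-Python sketch of a single AdamW update step using the card's learning rate of 3e-5. The parameter and gradient values are made up for the example; the actual fine-tuning used a full training framework, not this toy loop.

```python
import math

def adamw_step(params, grads, state, lr=3e-5, beta1=0.9, beta2=0.999,
               eps=1e-8, weight_decay=0.01):
    """One AdamW update over flat lists of parameters and gradients."""
    state["t"] += 1
    t = state["t"]
    new_params = []
    for i, (p, g) in enumerate(zip(params, grads)):
        # Update biased first- and second-moment estimates.
        state["m"][i] = beta1 * state["m"][i] + (1 - beta1) * g
        state["v"][i] = beta2 * state["v"][i] + (1 - beta2) * g * g
        # Bias correction.
        m_hat = state["m"][i] / (1 - beta1 ** t)
        v_hat = state["v"][i] / (1 - beta2 ** t)
        # Decoupled weight decay (the "W" in AdamW), then the Adam step.
        p = p - lr * weight_decay * p
        p = p - lr * m_hat / (math.sqrt(v_hat) + eps)
        new_params.append(p)
    return new_params

# Toy example: two parameters, one step.
params = [0.5, -0.25]
grads = [0.1, -0.2]
state = {"t": 0, "m": [0.0, 0.0], "v": [0.0, 0.0]}
updated = adamw_step(params, grads, state)
```

The key property shown here is that weight decay is applied directly to the parameters rather than folded into the gradient, which is what distinguishes AdamW from Adam with L2 regularization.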

Performance

Xenith-3B excels in tasks that require:

  • Chain-of-thought reasoning
  • Instruction following
  • Contextual understanding
  • Complex problem-solving

The model has shown significant improvements in these areas compared to the base model.
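Because Xenith-3B is derived from Phi-3-mini-4k-instruct, prompts for the GGUF files generally follow the Phi-3 chat format. The sketch below assumes the fine-tune kept the base model's template unchanged; verify against the tokenizer's chat template before relying on it.

```python
def format_phi3_prompt(user_message, system_message=None):
    """Assemble a prompt in the Phi-3 chat format used by the base model.

    Assumption: Xenith-3B keeps the Phi-3-mini-4k-instruct special tokens
    (<|system|>, <|user|>, <|assistant|>, <|end|>).
    """
    parts = []
    if system_message:
        parts.append(f"<|system|>\n{system_message}<|end|>")
    parts.append(f"<|user|>\n{user_message}<|end|>")
    parts.append("<|assistant|>")  # the model's reply continues from here
    return "\n".join(parts)

# Example chain-of-thought style prompt.
prompt = format_phi3_prompt(
    "A train travels 60 km in 45 minutes. What is its speed in km/h? "
    "Think step by step."
)
```

The trailing `<|assistant|>` turn is left open so that generation continues with the model's answer.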
GGUF Details

  • Model size: 3.82B parameters
  • Architecture: llama (as reported in the GGUF metadata)
  • Available precisions: 8-bit, 16-bit, 32-bit
