
Model Card: Llama3-8B-SFT-Tatsu-Lab-Alpaca-BnB-4bit

Model Information

  • Model Name: Llama3-8B-SFT-Tatsu-Lab-Alpaca-BnB-4bit
  • Model Version: 1.0
  • Model Type: Transformer-based Language Model
  • Framework/Libraries Used: bitsandbytes (4-bit quantization), supervised fine-tuning (SFT)

Model Description

This variant of Meta-Llama-3-8B has been quantized to 4-bit precision using the bitsandbytes library and instruction-tuned using supervised fine-tuning (SFT). The fine-tuning data is the tatsu-lab/alpaca dataset.
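A minimal loading sketch with the transformers library. The repo id is the one this card describes; the specific 4-bit settings (nf4 quantization type, bf16 compute dtype) are assumptions, since the card does not state them:

```python
# Sketch: loading this checkpoint with 4-bit bitsandbytes quantization.
# The nf4/bf16 settings below are assumed, not confirmed by the model card.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

quant_config = BitsAndBytesConfig(
    load_in_4bit=True,                      # store weights in 4-bit
    bnb_4bit_quant_type="nf4",              # assumed: NormalFloat4 data type
    bnb_4bit_compute_dtype=torch.bfloat16,  # assumed: matmuls run in bf16
)

model_id = "thesven/llama3-8B-sft-tatsu-lab-alpaca-bnb-4bit"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=quant_config,
    device_map="auto",  # place layers on available GPU(s)
)
```

Loading requires a CUDA-capable GPU with bitsandbytes installed.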

Intended Use

The model is designed for instruction-following tasks: understanding natural-language instructions, generating responses to them, and related instruction-based language processing in a variety of contexts.
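Because the model was fine-tuned on Alpaca-formatted data, inference prompts should likely follow the same template. A sketch of the standard Alpaca prompt template, assuming the fine-tune used it unmodified:

```python
def build_alpaca_prompt(instruction: str, context: str = "") -> str:
    """Assemble the standard Alpaca prompt (assumed to match training)."""
    if context:
        # Variant used when a record carries extra input context.
        return (
            "Below is an instruction that describes a task, paired with an "
            "input that provides further context. Write a response that "
            "appropriately completes the request.\n\n"
            f"### Instruction:\n{instruction}\n\n"
            f"### Input:\n{context}\n\n"
            "### Response:\n"
        )
    # Variant used when the instruction stands alone.
    return (
        "Below is an instruction that describes a task. Write a response "
        "that appropriately completes the request.\n\n"
        f"### Instruction:\n{instruction}\n\n"
        "### Response:\n"
    )

prompt = build_alpaca_prompt("Summarize the text.", "Llama 3 is an open LLM.")
```

The resulting string is what you would tokenize and pass to `model.generate`.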

Training Data

  • Dataset Name: tatsu-lab/alpaca
  • Dataset Description: tatsu-lab/alpaca is a collection of roughly 52,000 instruction-following demonstrations in natural language, designed for instruction-tuning and evaluating language models on instruction-based tasks.
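Each record in the dataset has three string fields: an instruction, an optional input providing context, and the target output. An illustrative sketch of the schema (field names match the published dataset; the values below are made up):

```python
# Illustrative records in the tatsu-lab/alpaca schema; values are invented.
records = [
    {
        "instruction": "Give three tips for staying healthy.",
        "input": "",  # many records carry no extra context
        "output": "1. Eat a balanced diet. 2. Exercise. 3. Sleep well.",
    },
    {
        "instruction": "Summarize the passage.",
        "input": "Llama 3 is an open-weight language model family from Meta.",
        "output": "Llama 3 is Meta's open-weight LLM family.",
    },
]

# Whether "input" is empty determines which Alpaca prompt template applies.
with_context = [r for r in records if r["input"]]
without_context = [r for r in records if not r["input"]]
```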

Model Performance

  • Evaluation Metrics: TBD (to be determined based on model evaluation)
  • Benchmark Results: TBD (to be determined based on benchmarking against relevant tasks/datasets)

Ethical Considerations

  • Bias Evaluation: TBD (to be determined based on bias analysis)
  • Safety Considerations: TBD (to be determined based on safety analysis)
  • Privacy Considerations: TBD (to be determined based on privacy analysis)
  • Fairness Considerations: TBD (to be determined based on fairness analysis)

Limitations and Considerations

  • The model's performance may vary depending on the specific task and domain.
  • Quantization and fine-tuning techniques may introduce trade-offs in accuracy and efficiency.
  • Evaluation on diverse datasets and tasks is recommended for comprehensive assessment.
Model Details

  • Model size: 8.03B parameters
  • Tensor type: BF16
  • Weights format: Safetensors

Dataset used to train thesven/llama3-8B-sft-tatsu-lab-alpaca-bnb-4bit

  • tatsu-lab/alpaca