Model Card for FalconAlpaca

FalconAlpaca is Falcon-7B fine-tuned on the Stanford Alpaca Dataset.

Model Details

This model is an attempt to adapt the outputs of Falcon-7B so that they are more information-rich and focused. It was trained with Lit-GPT and took about 2 hours on a single 4xA6000 node.

Model Description

  • License: Apache 2.0
  • Finetuned from model: Falcon-7B

Model Sources

Stanford Alpaca Dataset

Out-of-Scope Use

This model is intended for testing purposes only. No attempt has been made to control or remove bias, toxicity, or any other form of potentially dangerous or harmful output.

Bias, Risks, and Limitations

No effort was made to remove incorrect or harmful information from Falcon-7B or the Alpaca dataset, so any risks and limitations of that model and dataset carry over to this project as well.

How to Get Started with the Model

Download and install the libraries required for Lit-GPT, then run the adapter generation script:

python generate/adapter_v2.py \
    --adapter_path path/to/model/lit_model_adapter_finetuned.pth \
    --checkpoint_dir path/to/model \
    --prompt "What temperature should I cook pork at to ensure it is safe?"

This uses around 14 GB of VRAM. To reduce VRAM usage, you can add one of the following parameters:

--quantize llm.int8

or

--quantize gptq.int4
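
For example, combining the flags above, a lower-VRAM run might look like the following (the paths are placeholders for your local checkpoint and adapter files):

python generate/adapter_v2.py \
    --adapter_path path/to/model/lit_model_adapter_finetuned.pth \
    --checkpoint_dir path/to/model \
    --quantize llm.int8 \
    --prompt "What temperature should I cook pork at to ensure it is safe?"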

Training Data

Stanford Alpaca Dataset
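
To reproduce the data preparation, Lit-GPT includes a script that downloads and tokenizes the Alpaca dataset. A minimal sketch of the invocation is shown below; the checkpoint directory is a placeholder and assumes the standard Lit-GPT layout for the Falcon-7B weights:

python scripts/prepare_alpaca.py \
    --checkpoint_dir checkpoints/tiiuae/falcon-7b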

Training Hyperparameters

The default hyperparameters were as follows:

learning_rate = 9e-3
batch_size = 32
micro_batch_size = 2
gradient_accumulation_iters = 16
epoch_size = 50000
num_epochs = 5
max_iters = 125000
weight_decay = 0.02
warmup_iters = 50000
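
For reference, a fine-tuning run with these settings would be launched through Lit-GPT's adapter_v2 script. The sketch below assumes the script's data_dir, checkpoint_dir, and out_dir arguments and uses placeholder paths; the hyperparameters listed above are set inside the script itself rather than passed on the command line:

python finetune/adapter_v2.py \
    --data_dir data/alpaca \
    --checkpoint_dir checkpoints/tiiuae/falcon-7b \
    --out_dir out/adapter_v2/falcon-alpaca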

More Information

HeitechSoft
