candidate-Llaza-MS2-FT-1BT-instruct-difflr-v2

Training checkpoint from zip2zip-core. This is a candidate model (not production-ready).

Training Config

Field              Value
model_config       1B
init_from          meta-llama/Llama-3.2-1B-Instruct
max_subtokens      2
max_codebook_size  4096
seq_len            4096
lr                 5e-05
max_tokens         1000000000
step               7630
data               zip2zip-1B-sft-8shards-resharded
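
As a rough consistency check, the step count can be related to max_tokens and seq_len. The sketch below assumes 32 sequences per optimizer step. That batch size is not stated anywhere in this card; it is only a guess that happens to reproduce the listed step, so treat it as illustrative rather than as the actual training setup.

```shell
# Values copied from the Training Config table.
max_tokens=1000000000
seq_len=4096

# ASSUMPTION (not stated in this card): 32 sequences per optimizer step.
seqs_per_step=32

tokens_per_step=$((seq_len * seqs_per_step))

# Ceiling division: steps needed to consume at least max_tokens tokens.
steps_needed=$(( (max_tokens + tokens_per_step - 1) / tokens_per_step ))

echo "tokens_per_step=$tokens_per_step steps_needed=$steps_needed"
# prints: tokens_per_step=131072 steps_needed=7630
```

Under that assumption the token budget is used up at step 7630, which would make this the final checkpoint of the run.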

Usage

This is a training checkpoint in torchtitan format. To use it for inference, first export it to Hugging Face format:

python scripts/zip2zip_hf/export_to_zip2zip.py \
    --ckpt_dir <local_path>/step_7630 \
    --output_dir <export_dir> \
    --base_model meta-llama/Llama-3.2-1B-Instruct \
    --model_config 1B
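
Before running the export, it may help to confirm that the checkpoint directory was downloaded completely. The helper below is a minimal sketch: check_ckpt_dir is a hypothetical name, not part of zip2zip-core, and it only checks that the directory exists and is non-empty. It does not validate the checkpoint contents.

```shell
# Hypothetical helper (not part of zip2zip-core): succeed only if the
# given directory exists and contains at least one entry.
check_ckpt_dir() {
    dir="$1"
    if [ ! -d "$dir" ]; then
        echo "missing checkpoint directory: $dir" >&2
        return 1
    fi
    if [ -z "$(ls -A "$dir")" ]; then
        echo "checkpoint directory is empty: $dir" >&2
        return 1
    fi
    return 0
}
```

For example, calling check_ckpt_dir with the same path passed to --ckpt_dir and running the export only when it succeeds avoids starting an export against a missing or empty directory.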