candidate-Llaza-MS2-FT-1BT-instruct-difflr-v2

Training checkpoint from zip2zip-core. This is a candidate model (not production-ready).

Training Config

Field	Value
model_config	`1B`
init_from	`meta-llama/Llama-3.2-1B-Instruct`
max_subtokens	2
max_codebook_size	4096
seq_len	4096
lr	5e-05
max_tokens	1000000000
step	7630
data	`zip2zip-1B-sft-8shards-resharded`
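
As a rough sanity check on the table, the values imply about 131k tokens per optimizer step, close to 32 sequences of 4096 tokens. This sketch assumes that step 7630 is the last step of the `max_tokens` budget. The card does not state that, and it does not state the batch size, so treat the result as an estimate rather than a documented setting.

```python
# Values copied from the Training Config table above.
max_tokens = 1_000_000_000
step = 7630
seq_len = 4096

# Assumption (not stated in the card): step 7630 is the final step
# of the max_tokens budget, so tokens are spread evenly over 7630 steps.
tokens_per_step = max_tokens / step
seqs_per_step = tokens_per_step / seq_len

print(f"approx. tokens per step:    {tokens_per_step:,.0f}")
print(f"approx. sequences per step: {seqs_per_step:.2f}")
```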

Usage

This is a training checkpoint in torchtitan format. To use it for inference, first export it to Hugging Face format:

python scripts/zip2zip_hf/export_to_zip2zip.py \
    --ckpt_dir <local_path>/step_7630 \
    --output_dir <export_dir> \
    --base_model meta-llama/Llama-3.2-1B-Instruct \
    --model_config 1B
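
If the export is driven from a script, the same command can be assembled as an argument list. The following is a minimal sketch: the helper name `build_export_command` and the `checkpoints` and `exported` directories are hypothetical placeholders, and only the script path and flags come from the command above.

```python
import shlex
from pathlib import Path


def build_export_command(ckpt_dir, output_dir,
                         base_model="meta-llama/Llama-3.2-1B-Instruct",
                         model_config="1B"):
    """Return the export command from this card as an argv list."""
    return [
        "python", "scripts/zip2zip_hf/export_to_zip2zip.py",
        "--ckpt_dir", str(ckpt_dir),
        "--output_dir", str(output_dir),
        "--base_model", base_model,
        "--model_config", model_config,
    ]


# Hypothetical local paths; replace with your own download and export locations.
cmd = build_export_command(Path("checkpoints") / "step_7630", Path("exported"))

# Print the command in copy-pasteable form without running it.
print(shlex.join(cmd))
```

To execute the command, pass `cmd` to `subprocess.run(cmd, check=True)` from the root of the zip2zip-core repository, since the script path is relative.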
