candidate-Llaza-MS2-FT-1BT-instruct-difflr-v2
Training checkpoint from zip2zip-core. This is a candidate model (not production-ready).
Training Config
| Field | Value |
|---|---|
| model_config | 1B |
| init_from | meta-llama/Llama-3.2-1B-Instruct |
| max_subtokens | 2 |
| max_codebook_size | 4096 |
| seq_len | 4096 |
| lr | 5e-05 |
| max_tokens | 1000000000 |
| step | 7630 |
| data | zip2zip-1B-sft-8shards-resharded |
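
The `max_tokens`, `seq_len`, and `step` values can be cross-checked with a little arithmetic. The sketch below is only a sanity check, not something taken from the training code: the global batch size is not listed in the table, so the value of 32 sequences per step is an assumption, chosen because it happens to reproduce the listed step count. The function name `steps_for_budget` is hypothetical.

```python
import math


def steps_for_budget(max_tokens: int, seq_len: int, seqs_per_step: int) -> int:
    """Number of optimizer steps needed to consume a token budget.

    seqs_per_step is the global batch size in sequences. It is NOT given
    in the config table above; any value passed here is an assumption.
    """
    tokens_per_step = seq_len * seqs_per_step
    # Round up: the final step may overshoot the budget slightly.
    return math.ceil(max_tokens / tokens_per_step)


# Values from the table: max_tokens = 1e9, seq_len = 4096.
# Assumed (not from the table): 32 sequences per step.
steps = steps_for_budget(1_000_000_000, 4096, 32)
print(steps)
```

If the assumed batch size is right, the result matches the `step` field, which would mean the checkpoint was saved at the end of the token budget rather than partway through.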
Usage
This is a training checkpoint in torchtitan format. To use it for inference, first export it to HuggingFace format:

    python scripts/zip2zip_hf/export_to_zip2zip.py \
        --ckpt_dir <local_path>/step_7630 \
        --output_dir <export_dir> \
        --base_model meta-llama/Llama-3.2-1B-Instruct \
        --model_config 1B
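
When exporting several checkpoints, it can help to assemble the same command programmatically. The following is a minimal sketch: the script path and the four flags are the ones shown above, while the helper name `build_export_command` and the example directories `checkpoints` and `exported` are placeholders invented for illustration.

```python
from pathlib import Path


def build_export_command(ckpt_root, step, export_dir,
                         base_model="meta-llama/Llama-3.2-1B-Instruct",
                         model_config="1B"):
    """Build the argument list for the export script shown above.

    ckpt_root and export_dir are placeholders for your own local paths.
    """
    # Checkpoints are stored in a step_<N> subdirectory, as in step_7630.
    ckpt_dir = Path(ckpt_root) / f"step_{step}"
    return [
        "python", "scripts/zip2zip_hf/export_to_zip2zip.py",
        "--ckpt_dir", str(ckpt_dir),
        "--output_dir", str(export_dir),
        "--base_model", base_model,
        "--model_config", model_config,
    ]


# Hypothetical local paths; replace with your own.
cmd = build_export_command("checkpoints", 7630, "exported")
print(" ".join(cmd))
```

The list can then be passed to `subprocess.run(cmd, check=True)` from the root of the repository that contains the export script.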