Instructions to use yuxiaoyang/opsd-llama31-8b-instruct-origin-gen1024-step200-jsdclip006-20260515 with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- PEFT
How to use yuxiaoyang/opsd-llama31-8b-instruct-origin-gen1024-step200-jsdclip006-20260515 with PEFT:
from peft import PeftModel from transformers import AutoModelForCausalLM base_model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-3.1-8B-Instruct") model = PeftModel.from_pretrained(base_model, "yuxiaoyang/opsd-llama31-8b-instruct-origin-gen1024-step200-jsdclip006-20260515") - Notebooks
- Google Colab
- Kaggle
opsd-llama31-8b-instruct-origin-gen1024-step200-jsdclip006-20260515
This public repository contains LoRA adapter checkpoints from an OPSD training run.
Method
- Base model:
meta-llama/Llama-3.1-8B-Instruct - Method: OPSD origin fixed-teacher full-vocabulary JSD with per-token clipping
- Teacher: fixed base policy with LoRA adapters disabled during teacher forward passes
- Loss: full-vocabulary forward KL/JSD beta=0
- Per-token JSD clipping:
0.06 - Student/teacher thinking flags:
False / True - Dataset:
siyanzhao/Openthoughts_math_30k_opsd - Train budget:
max_steps=200,max_completion_length=1024 - Batch:
per_device_train_batch_size=1,gradient_accumulation_steps=2, effective batch8 - vLLM:
colocate, GPU memory utilization0.35 - GPUs: 4
Only adapter/checkpoint artifacts and logs are uploaded; optimizer states are intentionally omitted.
- Downloads last month
- 19
Inference Providers NEW
This model isn't deployed by any Inference Provider. 🙋 Ask for provider support
Model tree for yuxiaoyang/opsd-llama31-8b-instruct-origin-gen1024-step200-jsdclip006-20260515
Base model
meta-llama/Llama-3.1-8B Finetuned
meta-llama/Llama-3.1-8B-Instruct