TOFU SFT Alpaca
Models fine-tuned on the Alpaca dataset with the TOFU objective.
TOFU-SFT/Mistral-Nemo-Base-2407-4bit-alpaca-sft-tofu • Updated 24 days ago
TOFU-SFT/phi-4-4bit-alpaca-sft-tofu • Text Generation • Updated 24 days ago
TOFU-SFT/pythia-12b-4bit-alpaca-sft-tofu • Updated 24 days ago
TOFU-SFT/Llama-3.1-8B-4bit-alpaca-sft-tofu • Text Generation • Updated 24 days ago

Quantized Models
Quantized versions of models used in the experiments.
TOFU-SFT/Meta-Llama-3-70B-Instruct-4bit • Text Generation • 71B • Updated 21 days ago • 100
TOFU-SFT/Mistral-Nemo-Base-2407-4bit • 12B • Updated 24 days ago • 85
TOFU-SFT/OLMo-2-1124-13B-4bit • 14B • Updated 24 days ago • 62
TOFU-SFT/pythia-12b-4bit • 12B • Updated 24 days ago • 78