
smol_llama-220M-GQA-32k-theta-sft

Experimental model intended to serve as the draft model for long-context speculative decoding.
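As a rough illustration of that use, the sketch below pairs this model with a larger target model through the `assistant_model` argument of `generate` in Transformers (assisted generation). The helper name `generate_with_draft` and the `target_model_id` parameter are hypothetical; the card does not name a target model, and the sketch assumes the target uses the same Llama tokenizer as this model.

```python
# Model id of this card, used here as the draft (assistant) model.
DRAFT_MODEL_ID = "Doctor-Shotgun/smol_llama-220M-GQA-32k-theta-sft"


def generate_with_draft(target_model_id: str, prompt: str, max_new_tokens: int = 128) -> str:
    """Generate with a larger target model while this model proposes draft tokens.

    `target_model_id` is a placeholder for any larger Llama-family model that
    shares the draft model's tokenizer (an assumption, not stated in the card).
    """
    # Imported lazily so that defining this helper does not load the heavy dependencies.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(target_model_id)
    target = AutoModelForCausalLM.from_pretrained(target_model_id)
    draft = AutoModelForCausalLM.from_pretrained(DRAFT_MODEL_ID)

    inputs = tokenizer(prompt, return_tensors="pt")
    # The target model verifies the tokens drafted by the assistant model.
    output_ids = target.generate(
        **inputs,
        assistant_model=draft,
        max_new_tokens=max_new_tokens,
    )
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Whether this gives a speedup depends on how often the target model accepts the drafted tokens, which in turn depends on the chosen target and the prompt.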

Created by finetuning Doctor-Shotgun/smol_llama-220M-GQA-32k-theta at a context length of 32768 tokens on several instruction datasets.

This variant uses the rope theta (rope frequency base) method for context extension.
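To show why raising the rope frequency base helps with longer contexts, here is a small sketch that computes the wavelength of each rotary frequency pair for two bases. The head dimension of 64 and the larger base of 1,000,000 are illustrative assumptions, not values taken from this model's config; 10,000 is the usual Llama default.

```python
import math


def rope_wavelengths(head_dim: int, theta: float) -> list[float]:
    """Wavelength, in token positions, of each RoPE frequency pair."""
    # Pair i rotates by pos * theta**(-2i / head_dim) radians at position pos,
    # so one full rotation takes 2*pi * theta**(2i / head_dim) positions.
    return [2 * math.pi * theta ** (2 * i / head_dim) for i in range(head_dim // 2)]


HEAD_DIM = 64                # hypothetical head dimension, for illustration only
DEFAULT_THETA = 10_000.0     # common Llama default base
LARGER_THETA = 1_000_000.0   # illustrative larger base, not this model's actual value

default_longest = rope_wavelengths(HEAD_DIM, DEFAULT_THETA)[-1]
larger_longest = rope_wavelengths(HEAD_DIM, LARGER_THETA)[-1]

print(f"longest wavelength at theta={DEFAULT_THETA:.0f}: {default_longest:.0f} positions")
print(f"longest wavelength at theta={LARGER_THETA:.0f}: {larger_longest:.0f} positions")
```

A larger base stretches the slow-rotating pairs over many more positions, so distant tokens still receive distinguishable rotary phases. The actual base used by this model can be read from the `rope_theta` field of its config.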

The trained instruction format is Alpaca:

### Instruction:
{{instruction}}

### Input:
{{user input}}

### Response:
{{model response}}
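
A minimal sketch of building a prompt in the format above. The helper name `build_alpaca_prompt` is hypothetical, and two details are assumptions rather than anything this card specifies: the Input block is omitted when there is no user input (a common Alpaca convention), and the prompt ends with a newline after `### Response:`.

```python
def build_alpaca_prompt(instruction: str, user_input: str = "") -> str:
    """Format an instruction (and optional input) as an Alpaca-style prompt."""
    parts = [f"### Instruction:\n{instruction}\n"]
    if user_input:
        # Assumption: the Input block is only included when input is provided.
        parts.append(f"### Input:\n{user_input}\n")
    # The model's response is expected to follow this header.
    parts.append("### Response:\n")
    return "\n".join(parts)


prompt = build_alpaca_prompt(
    "Summarize the text in one sentence.",
    "Rotary embeddings encode token positions as rotations.",
)
print(prompt)
```

The returned string can be tokenized and passed to the model as-is; generation then continues from the `### Response:` header.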