Shrink Llama - V1 Collection
Truncated versions of Meta's Llama 2 models, cut down and then further trained. CoreX means the first X decoder layers were kept.
CoreX models are Llama models in which only the first X decoder layers are kept; the truncated model is then fine-tuned on 1 billion tokens from some dataset. The base models stem from Llama2-7b, the medium models from Llama2-13b, and the xl models from Llama2-70b.
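As a rough illustration of the truncation step (not the exact recipe used for these checkpoints), the sketch below keeps the first X decoder layers of a Llama 2 model with Hugging Face `transformers`. The layer count `X = 8` and the output path are hypothetical, and access to the gated `meta-llama/Llama-2-7b-hf` repo is assumed; fine-tuning on ~1B tokens would follow separately.

```python
# Minimal sketch: build a "CoreX"-style model by keeping only the
# first X decoder layers of a Llama 2 checkpoint.
from transformers import AutoModelForCausalLM, AutoTokenizer

X = 8  # hypothetical number of decoder layers to keep

model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-hf")
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-hf")

# Truncate the decoder stack to its first X layers and update the config
# so the saved checkpoint reports the new depth.
model.model.layers = model.model.layers[:X]
model.config.num_hidden_layers = X

# The truncated model would then be fine-tuned (e.g. on ~1B tokens)
# before being saved like any other checkpoint.
model.save_pretrained("llama2-7b-core8")   # hypothetical output path
tokenizer.save_pretrained("llama2-7b-core8")
```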