---
base_model:
- refuelai/Llama-3-Refueled
library_name: transformers
tags:
- mergekit
- merge
license: llama3
datasets:
- yahma/alpaca-cleaned
language:
- en
---

### Pruning Details

This is a prune of [Llama 3 Refueled](https://www.huggingface.co/refuelai/llama-3-refueled) using [mergekit](https://github.com/cg123/mergekit) and [PruneMe](https://www.github.com/arcee-ai/PruneMe).

The model is only lightly tested and still needs debugging, particularly around conversion to GGUF, which I am working on.

Note: the [yahma/alpaca-cleaned dataset](https://www.huggingface.co/yahma/alpaca-cleaned) was used only to evaluate which layers should be pruned. This model was **NOT** finetuned.

### Performance

I have run only one test so far, because of limited compute and very long inference times on my 3060 Ti (8GB), but it does show some interesting results. Here's the response to the prompt "Hi!" using the [example from Meta](https://llama.meta.com/docs/model-cards-and-prompt-formats/meta-llama-3).

```model_response
vel tips and recommendations.user Hi!assistant Hi! I can help you find the best travel tips and recommendations for your next trip. Where you most interested to travel and what kind of activities you most to to the 9e sure, we can start and letiing 10e 11e 12e 13e 14e 15e 16e 17e 18e 19e 20e 21e 23e 24e 5e 6e 7e 8e 9e 10e 11e 12e 13e 14e 15e
```

Even without finetuning, the model still shows some degree of instruction following, although the output degenerates into repeated tokens after the first couple of sentences.

Finetuning was planned, but it is no longer in progress due to issues with unsloth. However, I am working on a project that will hopefully make pruning models easier.

### Configuration

The following YAML configuration was used to produce this model:

```yaml
slices:
  - sources:
    - model: refuelai/Llama-3-Refueled
      layer_range: [0, 19]
  - sources:
    - model: refuelai/Llama-3-Refueled
      layer_range: [29, 32]

merge_method: passthrough
dtype: bfloat16
```
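
To make the configuration above easier to read, here is a minimal Python sketch that works out which decoder layers the two slices keep and which they drop. It assumes that mergekit treats `layer_range` as a half-open interval `[start, end)` and that the base model has 32 decoder layers (as the upper bound of the second slice suggests). The names `kept_layers` and `TOTAL_LAYERS` are illustrative only and are not part of mergekit.

```python
# Slice boundaries copied from the YAML configuration above.
slices = [(0, 19), (29, 32)]

# Assumption: the base model has 32 decoder layers.
TOTAL_LAYERS = 32


def kept_layers(layer_ranges):
    """Expand half-open [start, end) ranges into a flat list of layer indices."""
    layers = []
    for start, end in layer_ranges:
        layers.extend(range(start, end))
    return layers


kept = kept_layers(slices)
kept_set = set(kept)
dropped = [i for i in range(TOTAL_LAYERS) if i not in kept_set]

print(f"kept {len(kept)} layers, dropped {len(dropped)} layers: {dropped}")
```

Under those assumptions, the passthrough merge keeps 22 of the 32 layers and removes the contiguous block of layers 19 through 28.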