---
base_model:
- refuelai/Llama-3-Refueled
library_name: transformers
tags:
- mergekit
- merge
license: llama3
datasets:
- yahma/alpaca-cleaned
language:
- en
---
### Pruning Details
This is a prune of [Llama 3 Refueled](https://www.huggingface.co/refuelai/llama-3-refueled) made with [mergekit](https://github.com/cg123/mergekit) and [PruneMe](https://www.github.com/arcee-ai/PruneMe).
The model is only lightly tested and still needs some debugging, notably around converting to GGUF; I am working on that.
Note: the [dataset](https://www.huggingface.co/yahma/alpaca-cleaned) was used only to evaluate which layers to prune. This model was **NOT** finetuned.
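For context, here is a minimal sketch (my own illustration, not PruneMe's actual code) of the kind of layer-similarity analysis PruneMe performs: run sample instructions through the base model and score each block of 10 consecutive layers by how little it changes the hidden representation; the block with the smallest change is the best pruning candidate. The block size and hardcoded prompts are assumptions chosen to match this prune.
```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL = "refuelai/Llama-3-Refueled"
BLOCK = 10  # number of consecutive layers to prune (19-28 in the config below)

tok = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForCausalLM.from_pretrained(
    MODEL, torch_dtype=torch.bfloat16, output_hidden_states=True
)
model.eval()

# A couple of instruction-style prompts; the real analysis iterates over a
# dataset (here, yahma/alpaca-cleaned) rather than a hardcoded list.
prompts = [
    "Give three tips for staying healthy.",
    "Explain why the sky is blue.",
]

n_layers = model.config.num_hidden_layers
scores = torch.zeros(n_layers - BLOCK + 1)

with torch.no_grad():
    for text in prompts:
        ids = tok(text, return_tensors="pt")
        # hidden_states[i] is the input to layer i (index 0 = embeddings),
        # so hidden_states[start + BLOCK] is the output of the whole block.
        hs = model(**ids).hidden_states
        for start in range(n_layers - BLOCK + 1):
            a = hs[start][0, -1].float()          # block input, last token
            b = hs[start + BLOCK][0, -1].float()  # block output, last token
            scores[start] += 1 - torch.nn.functional.cosine_similarity(a, b, dim=0)

best = int(scores.argmin())
print(f"most redundant block: layers [{best}, {best + BLOCK})")
```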
### Performance
I have only run one test so far, owing to limited compute and painfully long inference times on my 3060 Ti (8GB), but it does show some interesting results.
Here's the response to the prompt "Hi!", formatted per the [example from Meta](https://llama.meta.com/docs/model-cards-and-prompt-formats/meta-llama-3).
```model_response
vel tips and recommendations.user
Hi!assistant
Hi! I can help you find the best travel tips and recommendations for your next trip. Where you most interested to travel and what kind of activities you most to to the 9e sure, we can start and letiing 10e 11e 12e 13e 14e 15e 16e 17e 18e 19e 20e 21e 23e 24e 5e 6e 7e 8e 9e 10e 11e 12e 13e 14e 15e
```
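For reference, a response like the one above can be generated with `transformers` roughly as follows. This is a sketch, not the exact script used; the model path is a placeholder for this repo's id, and the tokenizer's chat template applies Meta's Llama 3 prompt format.
```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL = "path/to/this-pruned-model"  # placeholder: replace with this repo's id

tok = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForCausalLM.from_pretrained(
    MODEL, torch_dtype=torch.bfloat16, device_map="auto"
)

# The chat template inserts the <|start_header_id|>/<|eot_id|> markers
# shown in Meta's prompt-format example.
messages = [{"role": "user", "content": "Hi!"}]
input_ids = tok.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

out = model.generate(input_ids, max_new_tokens=128)
print(tok.decode(out[0][input_ids.shape[-1]:], skip_special_tokens=True))
```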
Even without finetuning, the model still exhibits some degree of instruction following.
Fine-tuning was planned, but it is no longer in progress due to issues with Unsloth. However, I am working on a project that will hopefully make pruning models easier.
### Configuration
The following YAML configuration was used to produce this model:
```yaml
slices:
- sources:
- model: refuelai/Llama-3-Refueled
layer_range: [0, 19]
- sources:
- model: refuelai/Llama-3-Refueled
layer_range: [29, 32]
merge_method: passthrough
dtype: bfloat16
```
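To reproduce the prune, this config can be applied with mergekit. Below is a sketch using mergekit's Python API (names follow mergekit's README; double-check them against the version you install). The config file path is a placeholder for wherever you save the YAML above.
```python
import yaml
from mergekit.config import MergeConfiguration
from mergekit.merge import MergeOptions, run_merge

# Load the YAML config shown above.
with open("prune-config.yaml", "r", encoding="utf-8") as fp:
    config = MergeConfiguration.model_validate(yaml.safe_load(fp))

# Write the pruned model (with a copy of the tokenizer) to ./pruned-model.
run_merge(
    config,
    out_path="./pruned-model",
    options=MergeOptions(copy_tokenizer=True),
)
```
The `mergekit-yaml` CLI does the same thing in one command. Since mergekit layer ranges are half-open, the slices keep layers 0-18 and 29-31 of the original 32, dropping the 10 layers (19-28) that the similarity analysis flagged as most redundant.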