---
base_model:
- refuelai/Llama-3-Refueled
library_name: transformers
tags:
- mergekit
- merge
license: llama3
datasets:
- yahma/alpaca-cleaned
language:
- en
---
### Pruning Details
This is a prune of [Llama 3 Refueled](https://www.huggingface.co/refuelai/llama-3-refueled) made with [mergekit](https://github.com/cg123/mergekit) and [PruneMe](https://www.github.com/arcee-ai/PruneMe).
The model is only lightly tested and still needs some debugging, notably around converting to GGUF; I am working on that.
Note: the [dataset](https://www.huggingface.co/yahma/alpaca-cleaned) was used only to evaluate which layers to prune. This model was **NOT** finetuned.
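For context, here is a minimal sketch (my own illustration, not PruneMe's actual code) of the kind of layer-similarity analysis PruneMe performs: run sample instructions through the base model and score each block of 10 consecutive layers by how little it changes the hidden representation; the block with the smallest change is the best pruning candidate. The block size and hardcoded prompts are assumptions chosen to match this prune.
```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL = "refuelai/Llama-3-Refueled"
BLOCK = 10  # number of consecutive layers to prune (19-28 in the config below)

tok = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForCausalLM.from_pretrained(
    MODEL, torch_dtype=torch.bfloat16, output_hidden_states=True
)
model.eval()

# A couple of instruction-style prompts; the real analysis iterates over a
# dataset (here, yahma/alpaca-cleaned) rather than a hardcoded list.
prompts = [
    "Give three tips for staying healthy.",
    "Explain why the sky is blue.",
]

n_layers = model.config.num_hidden_layers
scores = torch.zeros(n_layers - BLOCK + 1)

with torch.no_grad():
    for text in prompts:
        ids = tok(text, return_tensors="pt")
        # hidden_states[i] is the input to layer i (index 0 = embeddings),
        # so hidden_states[start + BLOCK] is the output of the whole block.
        hs = model(**ids).hidden_states
        for start in range(n_layers - BLOCK + 1):
            a = hs[start][0, -1].float()          # block input, last token
            b = hs[start + BLOCK][0, -1].float()  # block output, last token
            scores[start] += 1 - torch.nn.functional.cosine_similarity(a, b, dim=0)

best = int(scores.argmin())
print(f"most redundant block: layers [{best}, {best + BLOCK})")
```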
### Performance
I have only run one test so far, owing to limited compute and painfully long inference times on my 3060 Ti (8GB), but it does show some interesting results.
Here's the response to the prompt "Hi!", formatted per the [example from Meta](https://llama.meta.com/docs/model-cards-and-prompt-formats/meta-llama-3).
```model_response
vel tips and recommendations.user
Hi!assistant
Hi! I can help you find the best travel tips and recommendations for your next trip. Where you most interested to travel and what kind of activities you most to to the 9e sure, we can start and letiing 10e 11e 12e 13e 14e 15e 16e 17e 18e 19e 20e 21e 23e 24e 5e 6e 7e 8e 9e 10e 11e 12e 13e 14e 15e
```
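For reference, a response like the one above can be generated with `transformers` roughly as follows. This is a sketch, not the exact script used; the model path is a placeholder for this repo's id, and the tokenizer's chat template applies Meta's Llama 3 prompt format.
```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL = "path/to/this-pruned-model"  # placeholder: replace with this repo's id

tok = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForCausalLM.from_pretrained(
    MODEL, torch_dtype=torch.bfloat16, device_map="auto"
)

# The chat template inserts the <|start_header_id|>/<|eot_id|> markers
# shown in Meta's prompt-format example.
messages = [{"role": "user", "content": "Hi!"}]
input_ids = tok.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

out = model.generate(input_ids, max_new_tokens=128)
print(tok.decode(out[0][input_ids.shape[-1]:], skip_special_tokens=True))
```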
Even without finetuning, the model still exhibits some degree of instruction following.
Fine-tuning was planned, but it is no longer in progress due to issues with Unsloth. However, I am working on a project that will hopefully make pruning models easier.
### Configuration
The following YAML configuration was used to produce this model:
```yaml
slices:
- sources:
- model: refuelai/Llama-3-Refueled
layer_range: [0, 19]
- sources:
- model: refuelai/Llama-3-Refueled
layer_range: [29, 32]
merge_method: passthrough
dtype: bfloat16
```
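To reproduce the prune, this config can be applied with mergekit. Below is a sketch using mergekit's Python API (names follow mergekit's README; double-check them against the version you install). The config file path is a placeholder for wherever you save the YAML above.
```python
import yaml
from mergekit.config import MergeConfiguration
from mergekit.merge import MergeOptions, run_merge

# Load the YAML config shown above.
with open("prune-config.yaml", "r", encoding="utf-8") as fp:
    config = MergeConfiguration.model_validate(yaml.safe_load(fp))

# Write the pruned model (with a copy of the tokenizer) to ./pruned-model.
run_merge(
    config,
    out_path="./pruned-model",
    options=MergeOptions(copy_tokenizer=True),
)
```
The `mergekit-yaml` CLI does the same thing in one command. Since mergekit layer ranges are half-open, the slices keep layers 0-18 and 29-31 of the original 32, dropping the 10 layers (19-28) that the similarity analysis flagged as most redundant.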