README.md · QuietImpostor/Llama-3-Refueled-Pruned at main

metadata

base_model:
  - refuelai/Llama-3-Refueled
library_name: transformers
tags:
  - mergekit
  - merge
license: llama3
datasets:
  - yahma/alpaca-cleaned
language:
  - en

Pruning Details

This is a prune of Llama 3 Refueled using mergekit and PruneMe The model is semi-tested, but still needs some debugging, namely with converting to GGUF, though I am working on that.

Note: the dataset was used for evaluating what layers should be pruned. This model was NOT finetuned.

Performance

After only 1 test because of lack of compute and for stupid long inference times on my 3060ti (8GB), it does show some interesting results. Here's the response after being prompted "Hi!" using the example from Meta.

 vel tips and recommendations.user
Hi!assistant
Hi! I can help you find the best travel tips and recommendations for your next trip. Where you most interested to travel and what kind of activities you most to to the 9e sure, we can start and letiing 10e 11e 12e 13e 14e 15e 16e 17e 18e 19e 20e 21e 23e 24e 5e 6e 7e 8e 9e 10e 11e 12e 13e 14e 15e

Even without finetuning, the model still exhibits some extent of instruction following. And fine-tuning is a WIP and I will update this when it's ready. Finetuning is no longer in progress due to issues with unsloth. However, I am working on a project that will hopefully make pruning models easier.

Configuration

The following YAML configuration was used to produce this model:

slices:
  - sources:
      - model: refuelai/Llama-3-Refueled
        layer_range: [0, 19]
  - sources:
      - model: refuelai/Llama-3-Refueled
        layer_range: [29, 32]

merge_method: passthrough
dtype: bfloat16