How did you fine-tune this model?

by celsowm - opened May 16

Discussion

celsowm

May 16

Hi!
I read on Reddit that this model used a new technique to inject new domain knowledge.
Could you explain it?

lcw99

Owner May 16

•

edited May 16

You can easily create additional layers using mergekit (https://github.com/arcee-ai/mergekit) with the following settings. After that, it is a simple task to unfreeze and train only the added layers.

slices:
  - sources:
    - model: meta-llama/Meta-Llama-3-8B-Instruct
      layer_range: [0, 20]
  - sources:
    - model: meta-llama/Meta-Llama-3-8B-Instruct
      layer_range: [12, 32]
merge_method: passthrough
dtype: bfloat16
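
As a rough sketch of the "unfreeze and train only the added layers" step: the config above stacks 20 + 20 = 40 layers, and source layers 12 to 19 appear twice. The Python below works out which positions in the merged model are the repeats and turns gradients on only for those. Two assumptions are mine, not confirmed in this thread: that `layer_range` is half-open (start included, end excluded), and that the "added" layers are the second copies (positions 20 to 27). The helper names are hypothetical, and the parameter naming pattern `layers.<index>.` follows the usual Hugging Face Llama layout.

```python
import re

# Slices copied from the mergekit config above.
# ASSUMPTION: layer_range is half-open, i.e. [start, end).
SLICES = [(0, 20), (12, 32)]

# Matches the layer index in names like "model.layers.21.mlp.up_proj.weight".
LAYER_RE = re.compile(r"\blayers\.(\d+)\.")


def merged_layer_sources(slices):
    """For each layer position in the merged model, the source layer it was copied from."""
    return [src for start, end in slices for src in range(start, end)]


def duplicated_positions(sources):
    """Merged-model positions whose source layer already appeared earlier in the stack.

    ASSUMPTION: these second copies are what the post calls the "added" layers.
    """
    seen, repeats = set(), []
    for pos, src in enumerate(sources):
        if src in seen:
            repeats.append(pos)
        seen.add(src)
    return repeats


def unfreeze_only(model, trainable_positions):
    """Freeze every parameter, then enable gradients only for the given layer positions.

    `model` is any object exposing named_parameters(), e.g. a torch.nn.Module.
    Returns the number of parameter tensors left trainable.
    """
    wanted = set(trainable_positions)
    count = 0
    for name, param in model.named_parameters():
        match = LAYER_RE.search(name)
        trainable = match is not None and int(match.group(1)) in wanted
        param.requires_grad = trainable
        count += int(trainable)
    return count
```

With a merged checkpoint loaded through transformers, usage would be something like `unfreeze_only(model, duplicated_positions(merged_layer_sources(SLICES)))` before handing the model to your trainer. Embeddings, the final norm, and the LM head stay frozen in this sketch; whether that matches the original training setup is not stated here.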
