Excellent Approach

by 1littlecoder - opened Jun 2, 2024

Discussion

1littlecoder

Jun 2, 2024

Thanks for sharing this. How's the performance vibe (other than the benchmarks)?

pszemraj

Owner Jun 3, 2024

Thanks! I'd say it holds up pretty well, but IMO it's best to think of this as a tradeoff between speed and output quality. You get some memory savings from the reduced number of params, but the main benefit is faster training and inference from removing 6 (~20%) of the layers.

Note that this is the base model, not the instruct version, so it needs some fine-tuning before practical use.
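
For readers who want to see what "removing 6 of the layers" means mechanically, here is a minimal sketch in plain Python. It is not the code used to build this model: the 32-layer depth, the starting index of the removed block, and the helper names `prune_layers` and `fraction_removed` are all assumptions made up for illustration. The thread only states that 6 layers (~20%) were removed.

```python
def prune_layers(layers, start, count):
    """Return a new list with `count` consecutive layers dropped, starting at index `start`."""
    if start < 0 or count < 0 or start + count > len(layers):
        raise ValueError("pruned block must lie inside the layer stack")
    return layers[:start] + layers[start + count:]


def fraction_removed(total_layers, removed_layers):
    """Fraction of the layer stack that was dropped."""
    return removed_layers / total_layers


# Hypothetical stack: 32 placeholder layers (the real depth is not given in the thread).
layers = [f"layer_{i}" for i in range(32)]

# Assumed choice: drop 6 consecutive layers starting at index 22.
pruned = prune_layers(layers, start=22, count=6)

print(len(layers), "->", len(pruned), "layers")
print("fraction removed:", fraction_removed(len(layers), 6))
```

Every forward pass now runs through fewer layers, which is where the faster training and inference comes from. The memory savings are smaller in relative terms because embeddings and the output head are untouched.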

Amrrs

Jun 4, 2024

This comment has been hidden
