@poedator Do you have an example script for sequence packing in SFT (supervised fine-tuning) training? If so, could you share it? Thank you very much.
Don yang Lin
dylin7
AI & ML interests
Large language models
Recent Activity
Commented on an article about 1 month ago: 4D masks support in Transformers
Commented on an article about 1 month ago: Improving Hugging Face Training Efficiency Through Packing with Flash Attention
Liked a Space 4 months ago: akhaliq/anycoder