Alignment Heads for Large / Medium models

#25
by lasmith - opened

Do you guys have any plans to add the alignment heads to the generation_config (or could you)?

It seems this has been done for the small model but not for the medium or large models.

I did find this repo, which seems to run the analysis, but TBH I was not sure which dataset should be used to calculate them. The latest transformers requires the alignment heads to generate token-level timestamps. See here. So without the alignment heads set, word-level timestamps are not possible...
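For anyone who does compute the heads themselves, here is a minimal sketch of how I understand they would be wired in. It assumes the heads are stored as a list of [layer, head] pairs on `generation_config.alignment_heads`, and that the medium decoder has 24 layers with 16 heads each. The helper name `build_alignment_heads` and the pair values are hypothetical placeholders, not real alignment heads for any checkpoint.

```python
from typing import Iterable, List, Tuple


def build_alignment_heads(
    pairs: Iterable[Tuple[int, int]], num_layers: int, num_heads: int
) -> List[List[int]]:
    """Validate (layer, head) pairs and return them as sorted, de-duplicated lists.

    Hypothetical helper for illustration; it is not part of transformers.
    """
    seen = set()
    for layer, head in pairs:
        if not 0 <= layer < num_layers:
            raise ValueError(f"layer {layer} outside 0..{num_layers - 1}")
        if not 0 <= head < num_heads:
            raise ValueError(f"head {head} outside 0..{num_heads - 1}")
        seen.add((layer, head))
    return [[layer, head] for layer, head in sorted(seen)]


# PLACEHOLDER values only: replace with the heads produced by the analysis.
PLACEHOLDER_PAIRS = [(13, 15), (9, 4), (13, 15)]

# Assumed decoder shape for the medium model: 24 layers, 16 heads per layer.
alignment_heads = build_alignment_heads(PLACEHOLDER_PAIRS, num_layers=24, num_heads=16)
print(alignment_heads)

# With transformers installed and a Whisper checkpoint loaded as `model`
# (not done here), the list would then be attached before generating:
#   model.generation_config.alignment_heads = alignment_heads
#   outputs = model.generate(input_features, return_token_timestamps=True)
```

The validation step is only there to catch pairs that do not fit the decoder shape before they reach `generate`; the open question of which dataset to compute the heads on still stands.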

lasmith changed discussion title from Attention Heads for Large / Med models to Alignment Heads for Large / Medium models
