Extract attention from model

#15
by kaustabanv - opened
  1. Is there a way to extract the attention values at every BertLayer? Access to the attention values would be useful for interpretability.
  2. I'm new to transformers. Can the hidden layer outputs be used for explainability in the same way attention is?

Thanks for your time!

https://huggingface.co/zhihan1996/DNABERT-2-117M/discussions/24

After reading this, I was finally able to extract attention. I hope it helps everyone else too :D
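
For anyone finding this later, here is a minimal sketch of the generic Hugging Face pattern for getting per-layer attentions and hidden states (shown with `bert-base-uncased` as a stand-in, since I know that checkpoint supports these flags out of the box). DNABERT-2 ships custom remote code, so you may still need the changes described in the linked discussion before `output_attentions` works the same way there.

```python
import torch
from transformers import AutoModel, AutoTokenizer

# Sketch only: bert-base-uncased is a stand-in; DNABERT-2's custom code
# (trust_remote_code=True) may need the workaround from the linked thread.
model_name = "bert-base-uncased"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModel.from_pretrained(model_name)
model.eval()

inputs = tokenizer("ACGTACGTACGT", return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs, output_attentions=True, output_hidden_states=True)

# One attention tensor per BertLayer: (batch, num_heads, seq_len, seq_len)
attentions = outputs.attentions
# Embedding output plus one tensor per layer: (batch, seq_len, hidden_size)
hidden_states = outputs.hidden_states

print(len(attentions), attentions[0].shape)
print(len(hidden_states), hidden_states[0].shape)
```

The hidden states can be probed or visualized per layer much like the attention maps, but they are contextual token representations rather than token-to-token weights, so they answer a different interpretability question.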
