Attention head
#8 by sapkal - opened
Can anyone help me with how to get the output of each and every attention head in the attention mechanism? Whenever I try to print them, I just get the output of the hidden layers instead.
Thank you.
@sapkal
Hi, I have updated the model code to support output_attentions.
import torch
from transformers import AutoTokenizer, AutoModel
tokenizer = AutoTokenizer.from_pretrained('Alibaba-NLP/gte-base-en-v1.5')
model = AutoModel.from_pretrained('Alibaba-NLP/gte-base-en-v1.5', attn_implementation='eager', trust_remote_code=True)
inputs = tokenizer(['We can output attention probs'], padding=True, return_tensors='pt')
with torch.no_grad():
    output = model(**inputs, output_attentions=True)
# tuple of per-layer tensors, each of shape (batch_size, num_heads, seq_len, seq_len)
print(output.attentions)
output:
tensor([[[[5.0676e-01, 2.7286e-02, 4.1410e-02, 1.6986e-02, 2.7379e-02,
2.3072e-02, 1.7634e-02, 3.3948e-01],
[2.0693e-01, 1.0816e-02, 2.7383e-02, 3.7302e-01, 1.0329e-01,
2.4032e-02, 6.6588e-02, 1.8795e-01],
[4.1867e-01, 1.7610e-02, 8.1962e-03, 1.8299e-01, 7.0752e-02,
4.7986e-03, 3.5255e-02, 2.6173e-01],
......
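Since output.attentions is a tuple with one tensor per layer, each of shape (batch_size, num_heads, seq_len, seq_len), you can index a single head directly. A minimal sketch building on the snippet above (the layer and head indices here are arbitrary examples):

# pick out one head's attention map
layer_idx, head_idx = 0, 3  # arbitrary example indices
head_probs = output.attentions[layer_idx][0, head_idx]  # (seq_len, seq_len)
print(head_probs)

# or iterate over every head in every layer
for l, layer_attn in enumerate(output.attentions):
    for h in range(layer_attn.shape[1]):
        print(f'layer {l}, head {h}:', layer_attn[0, h].shape)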