EmmanuelCasarrubias committed
Commit d885235
1 Parent(s): b704567

Implement encoder layer and attention mechanism functions


This commit introduces the essential functions for building an encoder layer and implementing the attention mechanism. The encoder_layer function takes the matrices Q, K-transpose, and V, together with the projection weight matrices Wq, Wk, and Wv. It projects Q, K-transpose, and V with the corresponding weight matrices and then applies scaled dot-product attention via scaled_dot_product_attention_2D, returning the attention scores computed for the input.
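
In matrix notation, the projections described here (and implemented in attention.py below) amount to

Q_proj = Q W_q^T,   K_proj = (K^T)^T W_k^T = K W_k^T,   V_proj = V W_v^T,

after which the layer returns scaled_dot_product_attention_2D(Q_proj, K_proj, V_proj).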

The function calcular_pesos_proyeccion(attention_matrix) calculates projection weight matrices Wq, Wk, and Wv based on the dimensions of the attention matrix. These weight matrices are initialized such that each diagonal element is set to 1, ensuring that each letter's representation is projected independently during the attention computation.

The scaled_dot_product_attention_2D function performs the scaled dot-product attention operation on the Q, K-transpose, and V matrices. For each row of Q it computes the dot-product scores against K-transpose, applies a numerically stable softmax to obtain attention weights, and then forms the weighted sum of the rows of V using those weights.
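
Written out, for each row q_i of Q the loop in the code below computes

out_i = softmax(q_i K^T) V,

with the softmax shifted by the row maximum for numerical stability; note that the 1/sqrt(d_k) scaling of standard scaled dot-product attention does not appear in this version of the code.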

These functions are fundamental components of transformer-based architectures used in natural language processing tasks, particularly in sequence-to-sequence models for tasks like machine translation and text summarization. They facilitate effective attention-based information aggregation and feature extraction from input sequences.
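
A minimal usage sketch, assuming attention.py is importable and using a small toy matrix A (the matrix and its size are illustrative assumptions, not part of the commit), might look like this:

import numpy as np
from attention import encoder_layer, calcular_pesos_proyeccion

# Toy 3x3 matrix standing in for the attention/input matrix (assumed for illustration).
A = np.array([[1.0, 0.0, 1.0],
              [0.0, 2.0, 0.0],
              [1.0, 1.0, 1.0]])

# Diagonal projection weights sized from A, as produced by calcular_pesos_proyeccion.
Wq, Wk, Wv = calcular_pesos_proyeccion(A)

# Q and V are A; the second argument is K transposed, here A.T.
output = encoder_layer(A, A.T, A, Wq, Wk, Wv)
print(output.shape)  # (3, 3)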

Files changed (1)
  1. attention.py +27 -0
attention.py ADDED
@@ -0,0 +1,27 @@
+ import numpy as np
+
+ def encoder_layer(Q, K_transpose, V, Wq, Wk, Wv):
+     Q_proj = np.dot(Q, Wq.T)
+     K_proj = np.dot(K_transpose.T, Wk.T)
+     V_proj = np.dot(V, Wv.T)
+     attention_scores = scaled_dot_product_attention_2D(Q_proj, K_proj, V_proj)
+     return attention_scores
+
+ def calcular_pesos_proyeccion(attention_matrix):
+     num_letras = attention_matrix.shape[0]
+     Wq = np.zeros((num_letras, attention_matrix.shape[1]))
+     Wk = np.zeros((num_letras, attention_matrix.shape[1]))
+     Wv = np.zeros((num_letras, attention_matrix.shape[1]))
+     for i in range(num_letras):
+         Wq[i, i] = 1
+         Wk[i, i] = 1
+         Wv[i, i] = 1
+     return Wq, Wk, Wv
+
+ def scaled_dot_product_attention_2D(Q, K_transpose, V):
+     out = np.zeros((Q.shape[0], V.shape[1]))
+     for i in range(Q.shape[0]):
+         scores = np.dot(Q[i, :], K_transpose)
+         softmax_scores = np.exp(scores - np.max(scores)) / np.sum(np.exp(scores - np.max(scores)))
+         out[i, :] = np.dot(softmax_scores, V)
+     return out