# Multipack

4k context, batch size (bsz) = 4.
Each character represents 256 tokens.
`X` represents a padding token.

```
   0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5
[[ A A A A A A A A A A A ]
 [ B B B B B B ]
 [ C C C C C C C ]
 [ D D D D ]]

[[ E E E E E E E E ]
 [ F F F F ]
 [ G G G ]
 [ H H H H ]]

[[ I I I ]
 [ J J J ]
 [ K K K K K ]
 [ L L L ]]
```

After padding to the longest input in each step:
```
   0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5
[[ A A A A A A A A A A A ]
 [ B B B B B B X X X X X ]
 [ C C C C C C C X X X X ]
 [ D D D D X X X X X X X ]]

[[ E E E E E E E E ]
 [ F F F F X X X X ]
 [ G G G X X X X X ]
 [ H H H H X X X X ]]

[[ I I I X X ]
 [ J J J X X ]
 [ K K K K K ]
 [ L L L X X ]]
```
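
The cost of padding each step to its longest input can be tallied with a few lines of Python. This is only an illustrative sketch: the sequence lengths are read off the unpadded diagram (in units of 256 tokens), and the names `steps` and `padded_cost` are made up for this example rather than taken from any library.

```python
# Sequence lengths from the unpadded diagram, in units of 256 tokens.
UNIT = 256
steps = [
    [11, 6, 7, 4],  # A, B, C, D
    [8, 4, 3, 4],   # E, F, G, H
    [3, 3, 5, 3],   # I, J, K, L
]


def padded_cost(step):
    """Return (real, padding) token counts for one step padded to its longest sequence."""
    longest = max(step)
    real = sum(step) * UNIT
    total = longest * len(step) * UNIT
    return real, total - real


for i, step in enumerate(steps):
    real, pad = padded_cost(step)
    print(f"step {i}: {real} real tokens, {pad} padding tokens")
```

Under these assumptions, a sizeable share of every padded step is spent on `X` tokens rather than on real data.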

With packing (note that the effective number of tokens per step is the same, but the true bsz is 1):
```
   0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5
[[ A A A A A A A A A A A B B B B B
   B C C C C C C C D D D D E E E E
   E E E E F F F F G G G H H H H I
   I I J J J K K K K K L L L X X X ]]
```
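
As a rough sketch of what packing amounts to, the snippet below concatenates the sequences in order into a single row and records where each one starts. The lengths come from the unpadded diagram (in units of 256 tokens). The names `pack`, `CAPACITY`, and `offsets` are hypothetical, and the choice to return cumulative start offsets is an assumption about what a packing-aware attention implementation would need to tell sequences apart, not a description of any particular library's API.

```python
UNIT = 256
CAPACITY = 4 * 4096  # same token budget per step as bsz = 4 at 4k context

# Sequence lengths from the unpadded diagram, in units of 256 tokens.
lengths = {
    "A": 11, "B": 6, "C": 7, "D": 4,
    "E": 8, "F": 4, "G": 3, "H": 4,
    "I": 3, "J": 3, "K": 5, "L": 3,
}


def pack(lengths, capacity):
    """Concatenate sequences in order into one row.

    Returns (offsets, padding): offsets[i] is the start of sequence i,
    offsets[-1] is the end of the last sequence, and padding is the
    number of unused tokens at the end of the row.
    """
    offsets = [0]
    for n in lengths.values():
        offsets.append(offsets[-1] + n * UNIT)
    used = offsets[-1]
    if used > capacity:
        raise ValueError("sequences do not fit in one packed row")
    return offsets, capacity - used


offsets, padding = pack(lengths, CAPACITY)
print("sequence boundaries:", offsets)
print("padding tokens:", padding)
```

This greedy in-order concatenation is the simplest possible strategy; a real packer would typically also decide which sequences to group together so that each packed row ends up close to full.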