Lavanya KV
lkv
AI & ML interests
None yet
Organizations
lkv's activity
Gemma 2 - 2B planning
3
#20 opened 9 days ago by baohuynhbk14
Very high loss compared to keras
6
#46 opened 5 months ago by tanimazsin130
Shutting down servers during fine-tuning
2
#73 opened 4 months ago by yjok0220
What is the max sequence length that model can compute if I use flash attention?
1
#20 opened 3 months ago by halfmoon039