Saraansh Tandon
Satandon1999
Satandon1999's activity
Out of resource: shared memory
6
#16 opened 7 months ago by iszhaoxin
A lot of <unk> generations in the CUDA int4 model.
1
#12 opened 6 months ago by Satandon1999
Slower generation with multi-batch size.
#26 opened 6 months ago by Satandon1999
Model not working with accelerate for inference.
1
#25 opened 6 months ago by Satandon1999
Fix for FlashAttention RuntimeError & Triton Multi GPU fix.
1
#17 opened 7 months ago by Satandon1999
Multi-GPU case device mismatch while finetuning.
3
#19 opened 7 months ago by Satandon1999
RuntimeError: FlashAttention only support fp16 and bf16 data type
1
#15 opened 7 months ago by Satandon1999