Tomer Ronen
tomer-nv
Recent Activity
Authored a paper 16 days ago:
FFN Fusion: Rethinking Sequential Computation in Large Language Models
Authored a paper 4 months ago:
Puzzle: Distillation-Based NAS for Inference-Optimized LLMs
tomer-nv's activity
Patching hf bug that creates wrong cache length if only inputs_embeds are passed to the model
#19 opened 6 months ago by tomer-nv
Patching hf bug that creates wrong cache length if only inputs_embeds are passed to the model
#18 opened 6 months ago by tomer-nv
fixed cache over-alloc bug
#17 opened 6 months ago by tomer-nv