dame rajee

damerajee


Posts 2

Post
On the 2nd of October 2024 a really cool paper was released called "Were RNNs All We Needed?" https://arxiv.org/abs/2410.01201

This paper introduces the minGRU model, a simplified version of the traditional Gated Recurrent Unit (GRU) designed to be more efficient by removing hidden state dependencies from its gates. Because the gate and the candidate state then depend only on the current input, they can be computed for every time step at once and the recurrence can be evaluated with a parallel scan, which makes training significantly faster than with a conventional GRU. minGRU also drops the tanh on the candidate hidden state, which streamlines the computation further.
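To make the idea concrete, here is a minimal sketch of the minGRU recurrence in plain Python. It is an illustration under a few assumptions, not the paper's implementation: biases are left out, the weights `W_z` and `W_h` are plain nested lists, the helper names (`min_gru`, `matvec`) are made up for this example, and the final loop is sequential for readability where the paper uses a parallel scan.

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def matvec(W, x):
    # Multiply a matrix (list of rows) by a vector (list of floats).
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def min_gru(xs, W_z, W_h, h0):
    """Run a bias-free minGRU over a sequence of input vectors.

    z_t  = sigmoid(W_z x_t)      # gate: depends on the input only
    c_t  = W_h x_t               # candidate state: no tanh, no h_{t-1}
    h_t  = (1 - z_t) * h_{t-1} + z_t * c_t
    """
    # Neither the gates nor the candidates need h_{t-1}, so both can be
    # computed for every time step up front (this is the parallel part).
    zs = [[sigmoid(v) for v in matvec(W_z, x)] for x in xs]
    cs = [matvec(W_h, x) for x in xs]

    # Only this cheap elementwise blend is recurrent. It is written as a
    # loop here; an associative scan could evaluate it in parallel.
    h = list(h0)
    hs = []
    for z, c in zip(zs, cs):
        h = [(1.0 - zi) * hi + zi * ci for zi, hi, ci in zip(z, h, c)]
        hs.append(h)
    return hs
```

For example, `min_gru([[2.0], [4.0]], [[0.0]], [[1.0]], [0.0])` uses a zero gate weight, so every gate value is 0.5 and each hidden state is the average of the previous state and the current input.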

So I read the paper and tried training this model, and it seems to be doing quite well. You can check out the pre-trained model on Hugging Face Spaces:

- damerajee/mingru-stories
Post
Just released ViLaH, a compact 3B-parameter vision-language model that generates responses in Hindi! Only Hindi for now 😔

BhashaAI/ViLaH