Kudos, impressive model

#5
by fblgit - opened

Yo, amazing stuff .. speechless..
This is much simpler than people think.. the community definitely needs to dig further into this and work harder. I'll be releasing some stuff using these LLaVA models.

I still need some time for experimentation, but I wonder whether the vision side can be reversed to instead project an embedding of a latent, or merely a noise/de-noise mask? Have you tried anything like this?

Definitely, I will try to port whatever I can from UNA into this.. I still need to figure out the code a bit more and get familiar with the LLaVA architecture 💖

fblgit changed discussion status to closed

How do you run inference? Do you have a sample script? I haven't used LLaVA before.
