✨ Unified 3D generation & text understanding.
✨ 3D meshes as plain text for seamless LLM integration.
✨ High-quality 3D outputs rivaling specialized models.
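To make "meshes as plain text" concrete, here is a minimal sketch that writes a triangle mesh as Wavefront OBJ-style lines and parses it back. OBJ is only one common plain-text mesh format, and using it here (with integer coordinates) is an assumption for illustration, not a description of the model's actual pipeline. The helper names `mesh_to_obj_text` and `obj_text_to_mesh` are hypothetical.

```python
def mesh_to_obj_text(vertices, faces):
    """Serialize a triangle mesh into OBJ-style plain text.

    vertices: list of (x, y, z) integer tuples (integers are an assumption
              made here to keep the text short).
    faces:    list of (a, b, c) tuples of 0-based vertex indices.
    """
    lines = [f"v {x} {y} {z}" for x, y, z in vertices]
    # OBJ face indices are 1-based, so shift each index by one.
    lines += ["f " + " ".join(str(i + 1) for i in face) for face in faces]
    return "\n".join(lines)


def obj_text_to_mesh(text):
    """Parse the text produced by mesh_to_obj_text back into lists."""
    vertices, faces = [], []
    for line in text.splitlines():
        parts = line.split()
        if not parts:
            continue
        if parts[0] == "v":
            vertices.append(tuple(int(p) for p in parts[1:4]))
        elif parts[0] == "f":
            # Convert back to 0-based indices.
            faces.append(tuple(int(p) - 1 for p in parts[1:4]))
    return vertices, faces


# A tetrahedron: 4 vertices, 4 triangular faces.
tetra_vertices = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (0, 0, 1)]
tetra_faces = [(0, 1, 2), (0, 1, 3), (0, 2, 3), (1, 2, 3)]
tetra_text = mesh_to_obj_text(tetra_vertices, tetra_faces)
print(tetra_text)
```

Because the result is an ordinary string, it can sit in a prompt or a completion like any other text, and the round trip through `obj_text_to_mesh` recovers the geometry.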
Very few people realize that most of the successful AI startups became successful because they focused on open science and open source for at least their first few years. To name but a few: OpenAI (GPT and GPT-2 were open-source), Runway & Stability (Stable Diffusion), Cohere, Mistral and of course Hugging Face!
The reasons are not just altruistic: sharing your science and your models pushes you to build AI faster (which is key in a fast-moving domain like AI), attracts the best scientists & engineers, and generates much more visibility, usage and community contributions than if you were 100% closed-source. The same applies to big tech companies, as we're seeing with Meta and Google!
More startups and companies should release research & open-source AI: it's not just good for the world, it also increases their probability of success!
We ran an experiment to revive LLaMA 1 33B: it had unique prose and a lack of "GPT-isms" and "slop" in its pretraining data, and it was one of the favorites at the time. Over multiple finetune runs, we extended the model from its pretrained context of 2048 tokens to ~12,000 tokens, adding approx. 500M tokens of training data in the process. The effective length is 16,384, but it's better to stay in the lower range. It writes well and in multiple formats. In the future, we have some ideas like implementing GQA. Please take a look; we would love to hear your feedback!
With the recent rise of interest in Vision Language Models (VLMs), we decided to make a push to include an ImageField within Argilla! This means any open source developer can now work on better models for vision ML tasks too, and we would like to show you how.
We would love to introduce this new feature to you, so we've prepared a set of notebooks to go over some common image scenarios:
- finetune a CLIP retrieval model with sentence transformers
- use ColPali + Qwen VL for RAG and log the results to Argilla
- image-generation preference: creating multi-modal preference datasets for free using Hugging Face inference endpoints
💾🧠 How much VRAM will you need for training your AI model? 💾🧠
Check out this app where you convert:
PyTorch/TensorFlow summary -> needed VRAM
or
Parameter count -> needed VRAM
And everything is open source! Ask for new functionality or contribute at: https://github.com/AlexBodner/How_Much_VRAM
If it's useful to you, leave a star and share it with someone who will find the tool useful!
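For a rough feel of the "parameter count -> needed VRAM" direction, here is a back-of-the-envelope sketch. It is not the app's formula (that is not described here); it assumes a common rule of thumb where training memory is weights + gradients + optimizer states, all stored at the same precision, and it ignores activations, batch size and framework overhead. The function name and its parameters are hypothetical.

```python
# Bytes needed to store one value at each precision.
BYTES_PER_VALUE = {"fp32": 4, "fp16": 2, "bf16": 2}


def estimate_training_vram_gib(num_params, dtype="fp32", optimizer_states_per_param=2):
    """Lower-bound estimate of training VRAM in GiB.

    Assumptions (not taken from the How_Much_VRAM app):
    - one copy of the weights and one copy of the gradients,
    - optimizer_states_per_param extra values per parameter
      (2 for Adam-style optimizers, 0 for plain SGD),
    - everything stored at the same precision,
    - activations and framework overhead are NOT included.
    """
    bytes_per_value = BYTES_PER_VALUE[dtype]
    weights = num_params * bytes_per_value
    gradients = num_params * bytes_per_value
    optimizer = num_params * optimizer_states_per_param * bytes_per_value
    return (weights + gradients + optimizer) / 1024**3


# A 1B-parameter model in fp32 with an Adam-style optimizer: about 14.9 GiB
# before activations are counted.
adam_gib = estimate_training_vram_gib(1_000_000_000)
sgd_gib = estimate_training_vram_gib(1_000_000_000, optimizer_states_per_param=0)
print(f"Adam-style: {adam_gib:.1f} GiB, plain SGD: {sgd_gib:.1f} GiB")
```

Real usage is usually higher, since activations grow with batch size and sequence length, so treat the number as a floor rather than a budget.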