victor posted an update Jan 12
Let's play a little game: how would you build the Rabbit R1 with open-source tech? Here is my stack:

- openai/whisper-small for awesome Speech-to-Text with low latency
- mistralai/Mixtral-8x7B-Instruct-v0.1 for an awesome super powerful LLM Brain
- coqui/XTTS-v2 for a nice and clean voice
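
A minimal sketch of wiring those three together, assuming the `transformers`, `huggingface_hub`, and Coqui `TTS` packages. The model ids are from the list above; the file names, prompt format, and the hosted Mixtral endpoint are assumptions:

```python
from transformers import pipeline
from huggingface_hub import InferenceClient
from TTS.api import TTS

# 1. Speech-to-text: whisper-small is small enough to run locally
asr = pipeline("automatic-speech-recognition", model="openai/whisper-small")

# 2. LLM brain: Mixtral-8x7B is too big for most local setups, so this
#    assumes a hosted inference endpoint for the model
llm = InferenceClient("mistralai/Mixtral-8x7B-Instruct-v0.1")

# 3. Text-to-speech: XTTS-v2 through the Coqui TTS package
tts = TTS("tts_models/multilingual/multi-dataset/xtts_v2")

def handle_utterance(audio_path: str, out_path: str = "reply.wav") -> str:
    text = asr(audio_path)["text"]                         # listen
    reply = llm.text_generation(f"[INST] {text} [/INST]",  # think
                                max_new_tokens=256)
    tts.tts_to_file(text=reply, language="en",
                    speaker_wav="reference_voice.wav",     # XTTS clones a reference voice
                    file_path=out_path)                    # speak
    return reply

print(handle_utterance("question.wav"))
```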

Which stack would you personally choose?

Not exactly replicable, given the absence of the LAM (Large Action Model)


Not wrong, but you could probably build some nice web browsing agents on top of Mixtral 👀

I would also use https://huggingface.co/microsoft/phi-2; we need a smaller model for quick inference on easy queries

I would run phi-2 on device for daily conversation and detecting user intent, then pass requests to a much larger cloud-hosted LLM for the more complicated stuff (such as web browsing and more).
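
A rough sketch of that routing idea, assuming phi-2 served locally through `transformers` and the big model reachable through `InferenceClient`; the intent prompt and labels are made up for illustration:

```python
from transformers import pipeline
from huggingface_hub import InferenceClient

# Small on-device model for chit-chat and intent detection
local_llm = pipeline("text-generation", model="microsoft/phi-2")

# Much larger cloud-hosted model for the complicated stuff
# (assumes a hosted endpoint for Mixtral)
cloud_llm = InferenceClient("mistralai/Mixtral-8x7B-Instruct-v0.1")

def classify_intent(user_text: str) -> str:
    """Ask the on-device model whether this is simple chat or a complex task."""
    prompt = (
        "Classify the request as SIMPLE (casual conversation) or COMPLEX "
        f"(web browsing, multi-step tasks).\nRequest: {user_text}\nAnswer:"
    )
    answer = local_llm(prompt, max_new_tokens=8,
                       return_full_text=False)[0]["generated_text"]
    return "complex" if "COMPLEX" in answer.upper() else "simple"

def respond(user_text: str) -> str:
    if classify_intent(user_text) == "simple":
        # Stay on device: fast, private, free
        return local_llm(user_text, max_new_tokens=128,
                         return_full_text=False)[0]["generated_text"]
    # Escalate to the cloud for the heavy lifting
    return cloud_llm.text_generation(f"[INST] {user_text} [/INST]",
                                     max_new_tokens=256)
```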

How would you get the lowest latency using existing tooling?

Are you thinking of running it on a device or in the cloud?


I would run the LLM in the cloud and the ASR + TTS on device

wdyt @victor?

Or https://huggingface.co/distil-whisper/distil-large-v2 for even faster speech-to-text
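
The swap is just the model id in the same `transformers` pipeline (the model card reports it as roughly 6× faster than whisper-large-v2 at a small accuracy cost):

```python
from transformers import pipeline

# Same API as whisper-small above, only the checkpoint changes
asr = pipeline("automatic-speech-recognition",
               model="distil-whisper/distil-large-v2")
print(asr("question.wav")["text"])
```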


Cannot access the demo 👀

Same models, but with SOLAR 10.7B Instruct, as it has similar performance with fewer parameters

We are starting on something like this with a Pi 5; we call it 'piBrain':

whisper-tiny - tinyllama - vits
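
A minimal sketch of that Pi-class stack, everything through `transformers` so it can run CPU-only; the VITS checkpoint chosen here (facebook/mms-tts-eng) is just one assumption, any VITS model would do:

```python
from transformers import pipeline

# All three models are small enough for a Raspberry Pi 5 class board (CPU only)
asr = pipeline("automatic-speech-recognition", model="openai/whisper-tiny")
llm = pipeline("text-generation", model="TinyLlama/TinyLlama-1.1B-Chat-v1.0")
tts = pipeline("text-to-speech", model="facebook/mms-tts-eng")  # a VITS checkpoint

def pi_brain(audio_path: str):
    text = asr(audio_path)["text"]
    reply = llm(text, max_new_tokens=128,
                return_full_text=False)[0]["generated_text"]
    speech = tts(reply)  # dict with an "audio" array and "sampling_rate"
    return reply, speech
```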

Everyone is focusing on the model side, but what about the hardware? I'd be interested in seeing some open-source versions of this from the community.


On the Rabbit, all the heavy compute is done remotely, I think 👀