[24/11] What are you working on this week! 💪

#2
by reach-vb HF staff - opened
open/ acc org

What are you working on this week? Put down your goals, projects, let's build in public!

A tool calling tutorial for Llama models, together with @sergiopaniego and @ariG23498 ! How about you?

Working on revamping the Open ASR Leaderboard this week and adding better support for Audio tasks on the Hub with @Steveeeeeeen! 🔥

Working on an open dataset of image preferences to build an open source Flux/SD-like model!

Will take more than a week, but I'm working on training a small LLM with speech-to-speech support without pipelining multiple models. The USP we are aiming for is support for regional languages in India. The idea is to reduce the cost of speech-to-speech use cases by creating a single model that can match the output quality of a pipeline of multiple models.

The current approach is to understand similar models like Moshi and Ichigo and replicate their architecture.

Still have a learning curve in this space, so open to suggestions on the implementation.

@EshitaYalawatti

Hey @cyb3rh4wk - have you looked at our cascaded speech-to-speech pipeline: https://github.com/huggingface/speech-to-speech - it's cascaded but it's quite optimised for both on-device + CUDA backend.

Working through the llm-book, currently understanding State Space Models and Mamba under the hood.

Also working on an AI-powered cross-platform System Architecture / DevOps design assistant with diagram generation and architecture migration (AWS -> GCP, Azure -> AWS, etc.) features.

Hoping to introduce Terraform script generation in the near future!

PS: Do check out BigBanyanTree, a collection of 7 datasets extracted and processed from CommonCrawl using Spark and used for some general analysis (not AI related).

Working on a training dataset for marketing creative segmentation to build a creative analysis model @hawky-ai-labs 🦅, if interested join me!

Working on AI Text 2 SQL for the SQL Console and a library for sending LLM traces to HF Datasets! 👀

Trying to implement DiVA in PyTorch and train it on my Cantonese ASR dataset. If that doesn't work out, gonna try Ultravox.

Working on a replication of the data collection flow of OpenAI's SimpleQA dataset using Hugging Face tools (Gradio, Argilla, Distilabel)

Working on a training dataset for marketing creative segmentation to build a creative analysis model @hawky-ai-labs 🦅, if interested join me!
@Sri-Vigneshwar-DJ would love to know more about how you're building this dataset.

I'm looking into building an agentic data labeling / cleaning pipeline. Adala already does this for NER / semantic analysis in text-only datasets, but I'm thinking multimodal, with audio to begin with, so I may work on implementing it inside this repo.
The idea is that users could load any audio dataset and ASR model from HF (hosted locally or in the cloud), have the ASR model simultaneously go through all of the audio files and transcribe them, and then see a final diff view of the original labels vs. the newly generated ones, where you can play the audio, compare the two transcriptions, and accept / reject either one to end up with a final cleaned dataset.
The next step would be to send these diffs to another QC agent that would compare them and do a final edit based on the style guidelines / common issues it has been given in its system prompt, to decrease the burden of manual checking. We could also have a confidence score for the QC agent and have it flag / send back all the transcripts that score below 0.6, for example, to be viewed by the user. But this can wait until we get the main functionality working.
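
A rough sketch of the diff-and-flag step described above, assuming the original labels and the freshly generated transcripts are already available as plain strings keyed by clip id. Everything here is hypothetical illustration: the `Review` record, the function names, and especially the use of a word-level similarity ratio as a stand-in for the QC agent's confidence score (a real QC agent would supply its own score). Only the 0.6 cutoff comes from the description.

```python
import difflib
from dataclasses import dataclass


@dataclass
class Review:
    clip_id: str
    original: str
    generated: str
    diff: list[str]        # changed words only, prefixed with "- " or "+ "
    confidence: float      # stand-in score in [0, 1], NOT a real QC agent score
    needs_human: bool      # True when the score falls below the threshold


def word_diff(original: str, generated: str) -> list[str]:
    """Return only the removed ("- ") and added ("+ ") words between two transcripts."""
    lines = difflib.ndiff(original.split(), generated.split())
    return [line for line in lines if line[0] in "+-"]


def similarity(original: str, generated: str) -> float:
    """Word-level similarity ratio, used here as a placeholder confidence score."""
    return difflib.SequenceMatcher(None, original.split(), generated.split()).ratio()


def review_transcripts(
    originals: dict[str, str],
    generated: dict[str, str],
    threshold: float = 0.6,  # example cutoff mentioned in the post
) -> list[Review]:
    """Pair each original label with its new transcript and flag low-scoring clips."""
    reviews = []
    for clip_id, old in originals.items():
        new = generated[clip_id]
        score = similarity(old, new)
        reviews.append(
            Review(clip_id, old, new, word_diff(old, new), score, score < threshold)
        )
    return reviews


if __name__ == "__main__":
    old_labels = {"a1": "the cat sat on the mat", "a2": "hello world"}
    new_labels = {"a1": "the cat sat on the mat", "a2": "completely different words here"}
    for review in review_transcripts(old_labels, new_labels):
        status = "send to user" if review.needs_human else "auto-accept"
        print(review.clip_id, f"{review.confidence:.2f}", status, review.diff)
```

In a real version, `generated` would come from running the chosen ASR model over the dataset, and the flagged items would feed the diff view where the user plays the audio and picks a transcription.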

Would love any kind of input / collab from the community on this! 🤗

An open source version of Notebook LM

Working on an astronomical object classification side project 🪐
