Uplimit

company

AI & ML interests

None defined yet.

Uplimit's activity

davidberenstein1957
posted an update 15 days ago
🚀 Find banger tools for your smolagents!

I created the Tools gallery, which makes tools specifically developed by/for smolagents searchable and visible. This will help with:
- inspiration
- best practices
- finding cool tools

Space: davidberenstein1957/smolagents-and-tools
davidberenstein1957
posted an update 28 days ago
TL;DR: Parquet is awesome, DuckDB too!

Datasets on the Hugging Face Hub are stored as Parquet files. We can query these files with DuckDB, a fast in-memory database system. One of DuckDB’s features is vector similarity search, which can be used with or without an index.

blog:
https://huggingface.co/learn/cookbook/vector_search_with_hub_as_backend
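
To make "vector similarity search without an index" concrete, here is a minimal sketch in plain Python rather than DuckDB itself. It scores every row against the query with cosine similarity and keeps the best matches, which is the same ranking an index-free search has to compute. The toy rows, the 3-dimensional embeddings, and the helper names `cosine_similarity` and `top_k` are all assumptions made for illustration; they are not taken from the cookbook.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query, rows, k=2):
    """Brute-force search: score every (text, embedding) row, return the k best as (score, text)."""
    scored = [(cosine_similarity(query, embedding), text) for text, embedding in rows]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


# Hypothetical stand-in for rows read from a Parquet file: (text, embedding).
ROWS = [
    ("parquet stores columns", [1.0, 0.0, 0.0]),
    ("duckdb runs sql", [0.0, 1.0, 0.0]),
    ("columnar sql engine", [0.7, 0.7, 0.0]),
]

if __name__ == "__main__":
    for score, text in top_k([1.0, 0.1, 0.0], ROWS):
        print(f"{score:.3f}  {text}")
```

A full scan like this costs time proportional to the number of rows, which is why an index becomes attractive for larger datasets.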
davidberenstein1957
posted an update 2 months ago
Introducing the Synthetic Data Generator, a user-friendly application that takes a no-code approach to creating custom datasets with Large Language Models (LLMs). The best part: a simple step-by-step process that makes dataset creation a non-technical breeze, so anyone can create datasets and models in minutes without writing any code.

Blog: https://huggingface.co/blog/synthetic-data-generator
Space: argilla/synthetic-data-generator
davidberenstein1957
posted an update 3 months ago
Open Preference Dataset for Text-to-Image Generation by the 🤗 Community

Open Image Preferences is an Apache 2.0 licensed dataset for text-to-image generation. This dataset contains 10K text-to-image preference pairs across common image generation categories, while using different model families and varying prompt complexities.

https://huggingface.co/blog/image-preferences
davidberenstein1957
posted an update 3 months ago
This is amazing for cheap model fine-tunes without the hassle of actual deployment! TIL: LoRA fine-tunes of models on the Hub can be used directly for inference!


davidberenstein1957
posted an update 3 months ago
The Data Is Better Together community is set to release the first Apache 2.0 licensed image preference dataset!

Great work and let's give this a final push :)

@aashish1904 congrats on your month of HF Pro. There is more to win during this sprint!

@aashish1904 @AnyaDesdein @davidberenstein1957 @Malalatiana @beta3 @fffiloni @munish0838 @Reza2kn @bbunzeck @Creazycreator @andrei-saceleanu @jafhaponiuk @rca-etl @kf120 @burtenshaw @mmhamdy @grib0ed0v @Doopus @AnyaDes @ttkap @Xceron @Lewox @davanstrien @Azazelle @adirik @Ashish08 @AntonVic @kenantang @sdiazlor @g-ronimo @dennis-rall @prithivMLmods @girtss3 @flozi00 @WaveCut @Taylor658 @Wildminder @Sara9999 @phaelishall @sararob @dvilasuero @pgabrys @plaguss @CDS899 @timajwilliams @rudzinskimaciej @pavel-ai @aggr8 @ignacioct @MouseAI @Leeps @MaksKul @NicolasDmln @Muinez @kusht55 @caiolang @Jakub-Brand24 @loamy @Demijan @eliab96 @Viewegger @JosephCatrambone @p1atdev @mrshu @o639 @Targezed @Aviv-anthonnyolime @thliang01 @Ahmed-Amine @glards @pranaykoppula @nataliaElv @MaPirlet @alvarobartt @gabrielmbmb @zlicastro @Jaydip @Chouettecheveche @lilcheaty @ruyrdiaz @robintema @fdaudens @ggcristian @a-r-r-o-w @pates @joheras @stopsatgreen @bezo97 @chachi902 @iamyann @liamcripwell @dmb23 @korbih @anonymous7743 @akbdx18 @OVAWARE @severo @akontra @lichorosario @lhoestq @SebastianBodza @Vishnou @ameerazam08 @appoose @Mukei @mearco @joaquincabezas @Fizzarolli @thomastraum @igortopolski @OxxoCodes @patrickfleith @asoria @bn22 @sitammeur @Krodolf @bergr7f @Sbxxn @wietsevenema @sugatoray @Iamladi @MikeTrizna @feveromo @mokady @Bolero @prath @Dowwie @kfahn @decodingchris @alili2050 @RahulRaman @yzimmermann @Ameeeee @ecyht2 @MattMC001 @hemanthkumarak @Thegorgibus @akos2 @LawRun @ramithuh @SuperMuel @sjans @peterizsak @mosama @Eyel @mtr3 @cfahlgren1 @legentil @clem @Citaman @Aurelien-Morgan @AntoineBourgois @TotoB12 @Stanmey @osanseviero @multimodalart @maxiw @ariG23498 @ngk89 @femboysLover @dvs @tacohiddink @blanchon @DavidJimenez
davidberenstein1957
posted an update 3 months ago
🔥 Dataset Drop - Open Image Preferences

Black Forest Labs Flux Dev vs. Stability AI Stable Diffusion 3.5 Large

Together with the data-is-better-together community, we've worked on an Apache 2.0 licensed open image preference dataset based on the fal ai imgsys prompts dataset. Thanks to the awesome community, we collected 5K preference pairs in less than 2 days. Agreement among annotators is great too.

Aashish Kumar won a month of Hugging Face Pro by making the most contributions! Congrats from the entire team 🥇

The best thing?! We are not done yet! Let's keep the annotations coming for 5K more in the second part of the sprint (with more prizes to go around)!

Dataset: https://huggingface.co/datasets/data-is-better-together/image-preferences-results
davidberenstein1957
posted an update 3 months ago
Let’s make a generation of amazing image-generation models

The best image generation models are trained on human preference datasets, where annotators have selected the better image from a choice of two. Unfortunately, many of these datasets are closed source, so the community cannot train open models on them. Let’s change that!

The community can contribute image preferences to an open-source dataset that can be used to build text-to-image models like the Flux or Stable Diffusion families. Because the dataset will be open source, everyone can use it to train models that we can all use.

Blog: https://huggingface.co/blog/burtenshaw/image-preferences
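
As a rough illustration of how "pick the better of two images" votes could become training data, the sketch below collapses individual votes into one chosen/rejected pair per comparison. Everything here is an assumption for illustration: the `Vote` fields, the majority-vote rule, and the decision to drop ties are hypothetical and are not the schema or method of the actual dataset.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Vote:
    """One annotator's choice between two images for a prompt (hypothetical schema)."""
    prompt: str
    image_a: str
    image_b: str
    choice: str  # "a" or "b"


def aggregate(votes):
    """Majority vote per comparison; returns chosen/rejected dicts and skips ties."""
    tallies = {}
    for vote in votes:
        key = (vote.prompt, vote.image_a, vote.image_b)
        tallies.setdefault(key, Counter())[vote.choice] += 1

    pairs = []
    for (prompt, image_a, image_b), counts in tallies.items():
        if counts["a"] == counts["b"]:
            continue  # assumption: a tie carries no preference signal, so drop it
        if counts["a"] > counts["b"]:
            chosen, rejected = image_a, image_b
        else:
            chosen, rejected = image_b, image_a
        pairs.append({"prompt": prompt, "chosen": chosen, "rejected": rejected})
    return pairs


if __name__ == "__main__":
    votes = [
        Vote("a red fox", "fox_1.png", "fox_2.png", "a"),
        Vote("a red fox", "fox_1.png", "fox_2.png", "a"),
        Vote("a red fox", "fox_1.png", "fox_2.png", "b"),
    ]
    print(aggregate(votes))
```

Pairs in this chosen/rejected shape are a common input format for preference-based training, though the real dataset may record its annotations differently.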