Article 🔥 Argilla 2.0: the data-centric tool for AI makers 🤗 By dvilasuero • 3 days ago • 25
Argilla v2.0 compatible datasets Collection Ready for rg.Dataset.from_hub(). Each dataset includes a my_dataset_name/tree/main/creation_script.py showing the full config and creation pipeline. • 6 items • Updated 2 days ago • 1
Article Experimenting with Automatic PII Detection on the Hub using Presidio 24 days ago • 22
Article Wikipedia's Treasure Trove: Advancing Machine Learning with Diverse Data By frimelle • Jun 3 • 12
Article ⚗️ 🔥 Building High-Quality Datasets with distilabel and Prometheus 2 By burtenshaw • Jun 3 • 23
KTO: Model Alignment as Prospect Theoretic Optimization Paper • 2402.01306 • Published Feb 2 • 11
Preference Datasets for KTO Collection This collection contains a list of curated preference datasets for KTO fine-tuning for intent alignment of LLMs through signals. • 5 items • Updated 3 days ago • 11
StepCoder: Improve Code Generation with Reinforcement Learning from Compiler Feedback Paper • 2402.01391 • Published Feb 2 • 41