DataComp

non-profit

https://www.datacomp.ai/dclm/index.html#home

Recent Activity

weizhiwang authored a paper 1 day ago

Adaptive Layer-skipping in Pre-trained LLMs

weizhiwang authored a paper 1 day ago

Open-Qwen2VL: Compute-Efficient Pre-Training of Fully-Open Multimodal LLMs on Academic Resources

richardaecn authored a paper 11 days ago

Unified Visual Relationship Detection with Vision and Language Models

dclm's activity

weizhiwang

authored 2 papers 1 day ago

Adaptive Layer-skipping in Pre-trained LLMs

Paper • 2503.23798 • Published 4 days ago

Open-Qwen2VL: Compute-Efficient Pre-Training of Fully-Open Multimodal LLMs on Academic Resources

Paper • 2504.00595 • Published 2 days ago • 25

thomwolf

posted an update 4 days ago

Post

2736

The new DeepSite space is really insane for vibe-coders
enzostvs/deepsite

With the wave of vibe-coding-optimized LLMs like the latest open-source DeepSeek model (version V3-0324), you can basically prompt out of the box and create any app or game in one shot.

It feels so powerful to me, no more complex framework or under-the-hood prompt engineering to have a working text-to-app tool.

AI is eating the world and *open-source* AI is eating AI itself!

PS: and even more meta is that the DeepSite app and DeepSeek model are both fully open-source => time to start recursively improving?

PPS: you still need some inference hosting unless you're running the 600B param model at home, so check the very nice list of HF Inference Providers for this model: deepseek-ai/DeepSeek-V3-0324

1 reply

richardaecn

authored 13 papers 11 days ago

Unified Visual Relationship Detection with Vision and Language Models

Paper • 2303.08998 • Published Mar 16, 2023

The iNaturalist Species Classification and Detection Dataset

Paper • 1707.06642 • Published Jul 20, 2017

Alternating Gradient Descent and Mixture-of-Experts for Integrated Multimodal Perception

Paper • 2305.06324 • Published May 10, 2023 • 1

Fashionpedia: Ontology, Segmentation, and an Attribute Localization Dataset

Paper • 2004.12276 • Published Apr 26, 2020 • 1

Spatiotemporal Contrastive Video Representation Learning

Paper • 2008.03800 • Published Aug 9, 2020

Simple Copy-Paste is a Strong Data Augmentation Method for Instance Segmentation

Paper • 2012.07177 • Published Dec 13, 2020

A Simple Zero-shot Prompt Weighting Technique to Improve Prompt Ensembling in Text-Image Models

Paper • 2302.06235 • Published Feb 13, 2023

Edify Image: High-Quality Image Generation with Pixel Space Laplacian Diffusion Models

Paper • 2411.07126 • Published Nov 11, 2024 • 30

Edify 3D: Scalable High-Quality 3D Asset Generation

Paper • 2411.07135 • Published Nov 11, 2024

Cosmos-Reason1: From Physical Common Sense To Embodied Reasoning

Paper • 2503.15558 • Published 16 days ago • 44

thomwolf

posted an update 22 days ago

Post

2721

We've kept pushing our Open-R1 project, an open initiative to replicate and extend the techniques behind DeepSeek-R1.

And even we were mind-blown by the results we got with this latest model we're releasing: ⚡️OlympicCoder (open-r1/OlympicCoder-7B and open-r1/OlympicCoder-32B)

It's beating Claude 3.7 on (competitive) programming, a domain Anthropic has historically been really strong in, and it's getting close to o1-mini/R1 on olympiad-level coding with just 7B parameters!

And the best part is that we're open-sourcing everything about its training dataset, the new IOI benchmark, and more in our Open-R1 progress report #3: https://huggingface.co/blog/open-r1/update-3

Datasets we are releasing:
- open-r1/codeforces
- open-r1/codeforces-cots
- open-r1/ioi
- open-r1/ioi-test-cases
- open-r1/ioi-sample-solutions
- open-r1/ioi-cots
- open-r1/ioi-2024-model-solutions

orionweller

authored a paper 23 days ago

Beyond RAG: Task-Aware KV Cache Compression for Comprehensive Knowledge Reasoning

Paper • 2503.04973 • Published 28 days ago • 21

Wanfq

authored a paper 28 days ago

FuseChat-3.0: Preference Optimization Meets Heterogeneous Model Fusion

Paper • 2503.04222 • Published 29 days ago • 14

AmeyaPrabhu

authored a paper about 1 month ago

A Practitioner's Guide to Continual Multimodal Pretraining

Paper • 2408.14471 • Published Aug 26, 2024

Team members 88
