A Unified Sequence Parallelism Approach for Long Context Generative AI • Paper • 2405.07719 • Published May 2024
Beyond Scaling Laws: Understanding Transformer Performance with Associative Memory • Paper • 2405.08707 • Published May 2024
vAttention: Dynamic Memory Management for Serving LLMs without PagedAttention • Paper • 2405.04437 • Published May 2024
Quantizable Transformers: Removing Outliers by Helping Attention Heads Do Nothing • Paper • 2306.12929 • Published Jun 22, 2023
OpenELM: An Efficient Language Model Family with Open-source Training and Inference Framework • Paper • 2404.14619 • Published Apr 2024
Meta Llama 3 • Collection • Hosts the transformers and original repos of the Meta Llama 3 and Llama Guard 2 releases • 5 items • Updated Apr 18, 2024
Megalodon: Efficient LLM Pretraining and Inference with Unlimited Context Length • Paper • 2404.08801 • Published Apr 12, 2024
Leave No Context Behind: Efficient Infinite Context Transformers with Infini-attention • Paper • 2404.07143 • Published Apr 10, 2024
MiniCPM: Unveiling the Potential of Small Language Models with Scalable Training Strategies • Paper • 2404.06395 • Published Apr 9, 2024
MiniGPT4-Video: Advancing Multimodal LLMs for Video Understanding with Interleaved Visual-Textual Tokens • Paper • 2404.03413 • Published Apr 4, 2024
Mixture-of-Depths: Dynamically allocating compute in transformer-based language models • Paper • 2404.02258 • Published Apr 2, 2024
Rethinking Memory and Communication Cost for Efficient Large Language Model Training • Paper • 2310.06003 • Published Oct 9, 2023
Optimized Network Architectures for Large Language Model Training with Billions of Parameters • Paper • 2307.12169 • Published Jul 22, 2023
Cobra: Extending Mamba to Multi-Modal Large Language Model for Efficient Inference • Paper • 2403.14520 • Published Mar 21, 2024
Unicron: Economizing Self-Healing LLM Training at Scale • Paper • 2401.00134 • Published Dec 30, 2023
LongRoPE: Extending LLM Context Window Beyond 2 Million Tokens • Paper • 2402.13753 • Published Feb 21, 2024
Recurrent Drafter for Fast Speculative Decoding in Large Language Models • Paper • 2403.09919 • Published Mar 14, 2024
LLM Maybe LongLM: Self-Extend LLM Context Window Without Tuning • Paper • 2401.01325 • Published Jan 2, 2024
Advancing Transformer Architecture in Long-Context Large Language Models: A Comprehensive Survey • Paper • 2311.12351 • Published Nov 21, 2023
Infinite-LLM: Efficient LLM Service for Long Context with DistAttention and Distributed KVCache • Paper • 2401.02669 • Published Jan 5, 2024
Beyond the Limits: A Survey of Techniques to Extend the Context Length in Large Language Models • Paper • 2402.02244 • Published Feb 3, 2024
BurstAttention: An Efficient Distributed Attention Framework for Extremely Long Sequences • Paper • 2403.09347 • Published Mar 14, 2024
LightSeq: Sequence Level Parallelism for Distributed Training of Long Context Transformers • Paper • 2310.03294 • Published Oct 5, 2023
DeepSpeed Ulysses: System Optimizations for Enabling Training of Extreme Long Sequence Transformer Models • Paper • 2309.14509 • Published Sep 25, 2023
Striped Attention: Faster Ring Attention for Causal Transformers • Paper • 2311.09431 • Published Nov 15, 2023
Ring Attention with Blockwise Transformers for Near-Infinite Context • Paper • 2310.01889 • Published Oct 3, 2023
Sequence Parallelism: Long Sequence Training from System Perspective • Paper • 2105.13120 • Published May 26, 2021
DeepSeek-VL: Towards Real-World Vision-Language Understanding • Paper • 2403.05525 • Published Mar 8, 2024
Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context • Paper • 2403.05530 • Published Mar 8, 2024
MegaScale: Scaling Large Language Model Training to More Than 10,000 GPUs • Paper • 2402.15627 • Published Feb 23, 2024
Dolma: an Open Corpus of Three Trillion Tokens for Language Model Pretraining Research • Paper • 2402.00159 • Published Jan 31, 2024
Medusa: Simple LLM Inference Acceleration Framework with Multiple Decoding Heads • Paper • 2401.10774 • Published Jan 19, 2024
MoE-LLaVA: Mixture of Experts for Large Vision-Language Models • Paper • 2401.15947 • Published Jan 29, 2024