oceansweep

AI & ML interests

None yet

Recent Activity

liked a model about 12 hours ago

OpenGVLab/InternVideo2_5_Chat_8B

Organizations

None yet

oceansweep's activity

upvoted 2 papers 1 day ago

The Stochastic Parrot on LLM's Shoulder: A Summative Assessment of Physical Concept Understanding

Paper • 2502.08946 • Published 3 days ago • 104

InfiniteHiP: Extending Language Model Context Up to 3 Million Tokens on a Single GPU

Paper • 2502.08910 • Published 3 days ago • 124

upvoted a paper 3 days ago

Expect the Unexpected: FailSafe Long Context QA for Finance

Paper • 2502.06329 • Published 6 days ago • 118

upvoted 2 papers 5 days ago

Step Back to Leap Forward: Self-Backtracking for Boosting Reasoning of Language Models

Paper • 2502.04404 • Published 10 days ago • 18

QuEST: Stable Training of LLMs with 1-Bit Weights and Activations

Paper • 2502.05003 • Published 9 days ago • 40

upvoted 2 papers 9 days ago

LIMO: Less is More for Reasoning

Paper • 2502.03387 • Published 11 days ago • 52

Jailbreaking with Universal Multi-Prompts

Paper • 2502.01154 • Published 13 days ago • 8

upvoted 4 papers 16 days ago

GuardReasoner: Towards Reasoning-based LLM Safeguards

Paper • 2501.18492 • Published 17 days ago • 81

Atla Selene Mini: A General Purpose Evaluation Model

Paper • 2501.17195 • Published 20 days ago • 33

People who frequently use ChatGPT for writing tasks are accurate and robust detectors of AI-generated text

Paper • 2501.15654 • Published 21 days ago • 11

Early External Safety Testing of OpenAI's o3-mini: Insights from the Pre-Deployment Evaluation

Paper • 2501.17749 • Published 18 days ago • 13

upvoted 4 papers 19 days ago

Qwen2.5-1M Technical Report

Paper • 2501.15383 • Published 21 days ago • 57

Mixture-of-Mamba: Enhancing Multi-Modal State-Space Models with Modality-Aware Sparsity

Paper • 2501.16295 • Published 20 days ago • 8

Baichuan-Omni-1.5 Technical Report

Paper • 2501.15368 • Published 21 days ago • 56

Chain-of-Retrieval Augmented Generation

Paper • 2501.14342 • Published 23 days ago • 50

upvoted a collection 20 days ago

Qwen2.5-VL

Collection

Vision-language model series based on Qwen2.5 • 3 items • Updated 20 days ago • 343

upvoted a paper 22 days ago

Video-MMMU: Evaluating Knowledge Acquisition from Multi-Discipline Professional Videos

Paper • 2501.13826 • Published 24 days ago • 24

upvoted a paper 23 days ago

VideoLLaMA 3: Frontier Multimodal Foundation Models for Image and Video Understanding

Paper • 2501.13106 • Published 25 days ago • 83

upvoted a collection 23 days ago

SmolVLM 256M & 500M

Collection

Collection for models & demos for even smoller SmolVLM release • 12 items • Updated 24 days ago • 68

upvoted a paper 24 days ago

GPS as a Control Signal for Image Generation

Paper • 2501.12390 • Published 26 days ago • 12