
Sherman Chann

152334H

https://152334H.github.io

AI & ML interests

None yet

Recent Activity

upvoted a paper about 2 months ago

Competitive Programming with Large Reasoning Models

liked a model 3 months ago

deepseek-ai/DeepSeek-R1

liked a model 3 months ago

deepseek-ai/DeepSeek-R1-Zero

152334H's activity

commented 2 papers 7 months ago

Training Language Models to Self-Correct via Reinforcement Learning

Paper • 2409.12917 • Published Sep 19, 2024 • 139 •

commented 2 papers 8 months ago

DeepSpeak Dataset v1.0

Paper • 2408.05366 • Published Aug 9, 2024 • 13 •

OpenResearcher: Unleashing AI for Accelerated Scientific Research

Paper • 2408.06941 • Published Aug 13, 2024 • 32 •

New activity in meta-llama/Llama-3.1-405B 8 months ago

8-kv-heads

#21 opened 8 months ago by

ArthurZ

New activity in 152334H/miqu-1-70b-sf 8 months ago

Adding Evaluation Results

#23 opened 8 months ago by

leaderboard-pr-bot

commented a paper 8 months ago

Integrating Large Language Models into a Tri-Modal Architecture for Automated Depression Classification

Paper • 2407.19340 • Published Jul 27, 2024 • 58 •

commented a paper 9 months ago

Unveiling Encoder-Free Vision-Language Models

Paper • 2406.11832 • Published Jun 17, 2024 • 54 •

commented 2 papers 10 months ago

A Simple and Effective $L_2$ Norm-Based Strategy for KV Cache Compression

Paper • 2406.11430 • Published Jun 17, 2024 • 24 •

WildVision: Evaluating Vision-Language Models in the Wild with Human Preferences

Paper • 2406.11069 • Published Jun 16, 2024 • 14 •

commented a paper about 1 year ago

The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits

Paper • 2402.17764 • Published Feb 27, 2024 • 612 • 142

New activity in 152334H/miqu-1-70b-sf about 1 year ago

vllm support?

#19 opened about 1 year ago by

ccYcc

Remove extra degrees of freedom by dequantizing the `q5_K_M`, `q4_K_M` and `q2_K` models together?

#18 opened about 1 year ago by

jukofyork

Sticking a restrictive license on a model that's not even yours to begin with?

#14 opened about 1 year ago by

candre23

2.4bpp exl2 waiting room

#3 opened about 1 year ago by

AUTOMATIC

Model load fail

#13 opened about 1 year ago by

wangrenzhong

Chat template

#11 opened about 1 year ago by

MaziyarPanahi

Commercial Use?

#12 opened about 1 year ago by

arhanovich

more quantized versions？

#10 opened about 1 year ago by

Liangmingxin

New activity in miqudev/miqu-1-70b about 1 year ago

An interesting yet useless consideration over the fp16 being out or not.

#21 opened about 1 year ago by

Nexesenex