Post: Pre-training on only a single RTX 4090 is really slow, even for small language models! (JingzeShi/doge-slm-677fd879f8c4fd0f43e05458)
Post: 🤩 warmup -> stable -> decay learning rate scheduler: 😎 use the stable-phase checkpoints to continue training the model on any new dataset without training spikes! JingzeShi/Doge-20M-checkpoint JingzeShi/Doge-60M-checkpoint
Paper: Wonderful Matrices: Combining for a More Efficient and Effective Foundation Model Architecture • arXiv:2412.11834 • Published Dec 16, 2024
Paper: Cheems: Wonderful Matrices More Efficient and More Effective Architecture • arXiv:2407.16958 • Published Jul 24, 2024