Siri

Shamane

https://www.linkedin.com/in/shamane-siriwardhana/

Shamane's activity

updated 2 datasets about 2 months ago

arcee-train/rl-instruction-filtered

Viewer • Updated Sep 26 • 4.14k • 33

arcee-train/rl-instruction

Viewer • Updated Sep 26 • 7.5k • 32

upvoted a paper 2 months ago

Training Language Models to Self-Correct via Reinforcement Learning

Paper • 2409.12917 • Published Sep 19 • 135

updated a model 2 months ago

arcee-train/shamane-9-12-untrained-merge

Text Generation • Updated Sep 12 • 11

updated 2 datasets 2 months ago

arcee-train/no-base-combined-dataset

Viewer • Updated Sep 12 • 1.73k • 30

arcee-train/9-2-combined-dataset

Viewer • Updated Sep 11 • 1.73k • 33

updated a model 3 months ago

arcee-train/untrained-merged-random-coeffs

Text Generation • Updated Sep 10 • 8

updated a dataset 3 months ago

arcee-train/my-combined-dataset

Viewer • Updated Sep 10 • 1.73k • 33

updated 2 models 3 months ago

arcee-train/pplist-merged-untrained-with-base-layernorm-embedding

Text Generation • Updated Sep 10 • 7

arcee-train/pplist-merged-untrained-with-base

Text Generation • Updated Sep 5 • 369

updated a dataset 3 months ago

arcee-train/logits-dataset-full-set-top-50

Viewer • Updated Aug 30 • 1.73k • 38 • 1

updated a model 5 months ago

arcee-ai/Llama-3-SEC-Chat

Text Generation • Updated Jun 20 • 57 • 34

liked a model 5 months ago

arcee-ai/Llama-3-SEC-Chat

Text Generation • Updated Jun 20 • 57 • 34

updated 2 models 6 months ago

arcee-ai/cpt-16B-auto-sft-ties-post-merge-auto-dpo

Text Generation • Updated Jun 10 • 10

arcee-ai/Mixtral-8x7B-Instruct-v0.1-Finance

Updated May 16

updated a model 7 months ago

arcee-ai/teeny-tiny-mixtral

Text Generation • Updated Apr 23 • 14 • 1

liked a model 7 months ago

chargoddard/llama3-42b-v0

Text Generation • Updated Apr 24 • 391 • 117

New activity in jetmoe/jetmoe-8b 7 months ago

When can we have the training code as illustrated in the paper?

#5 opened 7 months ago by Shamane

New activity in jetmoe/jetmoe-8b-chat 7 months ago

Seems like still we can't load this model with the Transformers library?

#2 opened 8 months ago by Shamane

New activity in arcee-ai/Mistral-7B-Instruct-v0.2-sliced-24-layer 8 months ago

Why is the size of the pruned model bigger than the original one after 24 layers have been sliced?

#1 opened 8 months ago by iheardyoulooking