hai

cloudyu

AI & ML interests

Personal contributor. M2 Ultra, 192 GB. QQ: 206 887 187

cloudyu's activity

New activity in mistralai/Mistral-Nemo-Instruct-2407 3 days ago

mistral-chat doesn't work

6
#12 opened 3 days ago by cloudyu
New activity in mistralai/mamba-codestral-7B-v0.1 4 days ago

mamba bug

5
#4 opened 4 days ago by cloudyu
New activity in Kwai-Kolors/Kolors 16 days ago

Keep it up, guys!

6
#2 opened 16 days ago by cloudyu
New activity in mlx-community/gemma-2-27b-it-8bit 23 days ago
New activity in cloudyu/Yi-34Bx2-MoE-60B-DPO 24 days ago
New activity in cloudyu/Mixtral_34Bx2_MoE_60B 24 days ago
New activity in cloudyu/Yi-34Bx2-MoE-60B 27 days ago
New activity in cloudyu/Meta-Llama-3-70B-Instruct-DPO 2 months ago

good model

1
#1 opened 2 months ago by gopi87
New activity in mlx-community/c4ai-command-r-plus-4bit 3 months ago

output is not correct.

1
#7 opened 3 months ago by flymonk
New activity in CohereForAI/c4ai-command-r-plus 4 months ago
New activity in wolfram/miquliz-120b-v2.0 4 months ago

VRAM Estimates

6
#3 opened 5 months ago by ernestr
New activity in cloudyu/Mixtral_11Bx2_MoE_19B 4 months ago

Hardware requirement

2
#5 opened 4 months ago by Dtree07

Adding Evaluation Results

1
#4 opened 4 months ago by ac-automata
New activity in cloudyu/Yi-34Bx2-MoE-60B 4 months ago

4x version

1
#15 opened 4 months ago by ehartford
New activity in cloudyu/mistral_pretrain_demo 5 months ago

Very interesting

1
#1 opened 5 months ago by ehartford
New activity in LayerDiffusion/layerdiffusion-v1 5 months ago

how to run this model?

3
#1 opened 5 months ago by cloudyu
New activity in yunconglong/MoE_13B_DPO 5 months ago
New activity in cloudyu/Mixtral_7Bx4_MOE_DPO 5 months ago

Train after merging?

2
#1 opened 5 months ago by adi-kmt

how to run this model

#2 opened 5 months ago by cloudyu

Upload tokenizer.model

1
#1 opened 6 months ago by Nexesenex

Upload tokenizer.model

#2 opened 6 months ago by Nexesenex

fp16

4
#1 opened 6 months ago by Nexesenex
New activity in 152334H/miqu-1-70b-sf 6 months ago
New activity in yunconglong/Mixtral_7Bx2_MoE_13B_DPO 6 months ago

Update README.md

#2 opened 6 months ago by cloudyu

Update README.md

#1 opened 6 months ago by cloudyu
New activity in mlabonne/phixtral-4x2_8 6 months ago
New activity in cloudyu/Mixtral_7Bx4_MOE_24B 6 months ago
New activity in jondurbin/truthy-dpo-v0.1 6 months ago

This is a really great dataset

1
#2 opened 6 months ago by cloudyu
New activity in cloudyu/Pluto_13B_DPO 6 months ago
New activity in cloudyu/Yi-34Bx2-MoE-60B 6 months ago

vllm

2
#10 opened 6 months ago by regzhang
New activity in moreh/MoMo-72B-lora-1.8.6-DPO 6 months ago

Congrats! New SOTA!

4
#1 opened 6 months ago by cloudyu
New activity in cloudyu/Yi-34Bx2-MoE-60B 6 months ago

Multi-langua?

1
#7 opened 6 months ago by oFDz
New activity in cloudyu/Mixtral_7Bx2_MoE 6 months ago
New activity in cloudyu/Yi-34Bx2-MoE-60B 6 months ago
New activity in cloudyu/Mixtral_34Bx2_MoE_60B 6 months ago

Vram

2
#7 opened 6 months ago by DKRacingFan

source code and paper?

8
#6 opened 7 months ago by josephykwang

How does the MoE work?

3
#5 opened 7 months ago by PacmanIncarnate