Samuel Azran
SamuelAzran
AI & ML interests
None yet
Recent Activity
liked a model 12 days ago: NousResearch/DeepHermes-3-Llama-3-8B-Preview
updated a model about 1 month ago: SamuelAzran/Llama3.1-s-base
published a model about 1 month ago: SamuelAzran/Llama3.1-s-base
Organizations
None yet
SamuelAzran's activity
New Gemma 2 27B?
2
#3 opened 8 months ago by SamuelAzran
Was it trained after the latest Hugging Face Transformers Gemma fix? If not, are there any update plans?
#4 opened 12 months ago by SamuelAzran
Should not be called Mixtral; the models made into the MoE are Yi-based
9
#2 opened about 1 year ago by teknium

How does the MoE work?
3
#5 opened about 1 year ago by PacmanIncarnate
One or two models during inference?
3
#3 opened about 1 year ago by Venkman42

You know Mixtral, Llama 2 70b, GPT3.5... Are All Much Better
1
#13 opened about 1 year ago by deleted
Awesome! Could you help with pointers on doing the same for other languages (Swedish)?
3
#2 opened about 1 year ago by Olofp
QLora or full fine-tuning?
1
#1 opened about 1 year ago by SamuelAzran
Was system message used during training?
1
#8 opened over 1 year ago by SamuelAzran
NEW! OpenLLMLeaderboard 2023 fall update
20
#356 opened over 1 year ago by clefourrier

Did you do full model fine-tuning (all layers) or only adapters?
1
#2 opened over 1 year ago by SamuelAzran
Can you release a chat version soon?
11
#8 opened over 1 year ago by dong0213
Great work, but why only 2048 context length?
1
#4 opened over 1 year ago by SamuelAzran
Would it work well with sequence length > 2048?
2
#1 opened almost 2 years ago by SamuelAzran
Thank you very much!
10
#2 opened almost 2 years ago by AiCreatornator
Error running the example code
21
#6 opened almost 2 years ago by will33am
