Maybe SFT for better chat ability, RLHF, or function calling soon?
Hi, this model has answered my tricky question correctly like no other 34B model can (the others assume 1010 A.D. is a future time).
BUT I do not like its output formatting; at the very least, Yi-34B does not follow my "step by step" reasoning instructions.
I do like its tone, but I still find it is not as human-preferred as other RLHF-ed models.
BTW, the testing env is:
Latest textgen-webui
Latest exllamav2
TheBloke/Yi-34B-GPTQ
Also looking forward to seeing future progress on function calling abilities.
Dudes, it is just the most essential thing for recently released models to catch up with the GPTs.
This is a base model, so I am not sure why you are expecting behavior from it that you would expect from chat/instruct models. The 01.ai team said that they are working on a chat fine-tune; it might give that assistant-like vibe. Having base pre-trained models which are not RLHF'd is essential to allow later customization like RLHF. The Yi model architecture makes it a GPT; OpenAI doesn't have a monopoly on that word.
Now there are actually some SFT chat models out there; this one is from https://huggingface.co/TheBloke/Nous-Capybara-34B-GGUF
with textgen-webui:
MODEL=Nous-Capybara-Yi-34B-200K-GPTQ
python server.py --model $MODEL \
    --loader exllamav2 \
    --max_seq_len 8192
The English ability is way, way ahead of any open source model I have seen (forgive my ignorance!); it is so prudent and highly intelligent. Though there is a weird ending token </s>, which is probably due to the prompt template not being fully supported.
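For what it's worth, here is a minimal prompting sketch of what I mean, assuming Nous Capybara expects a Vicuna-style "USER: ... ASSISTANT:" template (the exact template and the stop-string workaround are assumptions; check the model card):

# Minimal sketch, assuming a Vicuna-style single-turn template for Nous Capybara.
def build_prompt(user_message: str) -> str:
    # Assumed template; the real one may differ.
    return f"USER: {user_message} ASSISTANT:"

prompt = build_prompt("Is 1010 A.D. in the past or the future? Think step by step.")
print(prompt)

# If generations still end with a literal "</s>", one client-side workaround is
# to pass it as a custom stop string in the generation request, so the UI cuts
# the reply there instead of printing the token as plain text.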
I have the same issue with the </s> token being printed at the end of the reply when running my own QLoRA fine-tune. It's because the dataset is made for Llama, where </s> is the default EOS token, but the model is trained on Yi, where the EOS token is <|endoftext|>. I bet that's the same issue as with Nous Capybara. I haven't tried this fine-tune yet. This model is a good base for fine-tuning.
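A quick way to sanity-check this, as a rough sketch (the repo id and the trust_remote_code flag are assumptions; adjust to your local checkout):

from transformers import AutoTokenizer

# Load the Yi tokenizer and inspect its EOS token.
tok = AutoTokenizer.from_pretrained("01-ai/Yi-34B", trust_remote_code=True)
print(tok.eos_token, tok.eos_token_id)  # expect <|endoftext|>, not </s>

# A Llama-style dataset hard-codes "</s>", which Yi's tokenizer treats as plain
# text, so the model learns to emit it literally. One workaround when building
# training samples is to remap it to the real EOS token:
sample = "USER: hello ASSISTANT: hi there</s>"
print(sample.replace("</s>", tok.eos_token))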
"it's because the dataset is made for Llama, where </s> is the default EOS token, but it's trained on Yi, where the EOS token is <|endoftext|>. I bet that's the same issue as with Nous Capybara."
I also think this is the most likely reason.