best model params

#8
by eramax - opened

I noticed that the model is repeating itself and most of the time writes more than the answer needs. What are the ideal parameters you suggest for this incredible model?

Example (I had to stop it):

>>> """give me the output of this code
... s = "She enrolled in the course at the university."
... doc = nlp(s)
... [(t.text, t.dep_, t.head.text) for t in doc]

To execute this Python code snippet, you need to have the spacy library installed, which is a popular NLP (Natural Language Processing) library. Here's the output of the provided code using the English
model of Spacy:

import spacy

nlp = spacy.load("en_core_web_sm")  # Load the English language model from Spacy

s = "She enrolled in the course at the university."  # Input sentence
doc = nlp(s)  # Analyze the input sentence using the loaded model

output = [(t.text, t.dep_, t.head.text) for t in doc]  # Extract relevant information from the analyzed document
print(output)

The output of the provided code would be:

[('She', 'nsubj', 'enrolled'), ('enrolled', 'ROOT', 'enrolled'), ('in', 'prep', 'enrolled'), ('the', 'det', 'course'), ('course', 'dobj', 'enrolled'), ('at', 'prep', 'university'), ('university', 'pobj', 'enrolled')]

Here's a brief explanation of the output:

  1. ('She', 'nsubj', 'enrolled'): The word "She" is the subject (nsubj) of the verb "enrolled."
  2. ('enrolled', 'ROOT', 'enrolled'): The main action in this sentence, which is "enrolled," is marked as the root node (ROOT) of the dependency tree.
  3. ('in', 'prep', 'enrolled'): The preposition "in" provides preparation information for the verb "enrolled."
  4. ('the', 'det', 'course'): The definite article "the" is modifying the noun "course."
  5. ('course', 'dobj', 'enrolled'): The direct object (dobj) of the verb "enrolled" is the noun "course."
  6. ('at', 'prep', 'university'): The preposition "at" provides preparation information for the noun "university."
  7. ('university', 'pobj', 'enrolled'): The part of speech (pobj) of the verb "enrolled" is the noun "university," which is related to the action of enrollment in this context.

The dependency parsing helps understand the syntactic relationships between words in a sentence, such as subjects, objects, and other modifying elements. In this case, the code extracts relevant
information like word text (t.text), their grammatical role (t.dep_), and the head word they are dependent on (t.head.text).

Note that you can use more advanced Spacy models if needed, such as en_core_web_lg, which includes additional linguistic features at the cost of a larger model size. You can find more information about
these models in the Spacy documentation: https://explore.spacy.io/usage/models

Keep in mind that running this code requires having the spacy library installed, which you can install using pip (Python's package manager):

pip install spacy

Once installed, you can use Spacy for various NLP tasks like named entity recognition, part-of-speech tagging, dependency parsing, and more.

Here are some additional resources to learn more about Spacy:

  1. Spacy's official tutorials: https://explore.spacy.io/
  2. Spacy's documentation: https://spacy.io/usage
  3. A comprehensive guide to Spacy by Real Python: https://www.realpython.com/spacy-nlp-python/
  4. An introduction to Spacy and NLP with Python by Data Science Central: https://datascience.community/courses/an-introduction-to-spacy-and-nlp-with-python

These resources will help you dive deeper into the world of Spacy and NLP (Natural Language Processing).

I hope this helps! Don't hesitate to ask if you have any more questions or need clarification on any part of the answer.

Here are some additional resources to learn more about Spacy:

1. Spacy's official tutorials: <https://explore.spacy.io/>
2. Spacy's documentation: <https://spacy.io/usage>
3. A comprehensive guide to Spacy by Real Python: <https://www.realpython.com/spacy-nlp-python/>
4. An introduction to Spacy and NLP with Python by Data Science Central: <https://datascience.community/courses/an-introduction-to-spacy-and-nlp-with-python>

These resources will help you dive deeper into the world of Spacy and NLP (Natural Language Processing).

I hope this helps! Don't hesitate to ask if you have any more questions or need clarification on any part of the answer.

Here are some additional resources to learn more about Spacy:

  1. Spacy's official tutorials: https://explore.spacy.io/
  2. Spacy's documentation: https://spacy.io/usage
  3. A comprehensive guide to Spacy by Real Python: https://www.realpython.com/spacy-nlp-python/
  4. An introduction to Spacy and NLP with Python by Data Science Central: https://datascience.community/courses/an-introduction-to-spacy-and-nlp-with-python

These resources will help you dive deeper into the world of Spacy and NLP (Natural Language Processing).

I hope this helps! Don't hesitate to ask if you have any more questions or need clarification on any part of the answer.


Here are some additional resources to learn more about Spacy:
^C
Berkeley-Nest org

Emm, that's an interesting question. May I ask if you're using the default chat template? (GPT4 Correct User: Hello!<|end_of_turn|>GPT4 Correct Assistant:) The model will perform very poorly with a wrong chat template.
Actually, for coding, the chat template can also be switched to the coding template given in the model card. Not sure how much that would help, since we didn't do RLHF on that template.
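
Concretely, the raw prompt for a single user turn would look like this (a minimal sketch; the special strings come from the template above, and the build_prompt helper is just for illustration):

# Minimal sketch of the default chat template for one user turn.
# The token strings are taken from the template quoted above;
# build_prompt itself is a hypothetical helper, not a library function.
def build_prompt(user_message: str) -> str:
    return f"GPT4 Correct User: {user_message}<|end_of_turn|>GPT4 Correct Assistant:"

print(build_prompt("Hello!"))
# GPT4 Correct User: Hello!<|end_of_turn|>GPT4 Correct Assistant: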

Another possibility is to try a lower temperature; our testing suggests that temperature 0 might be slightly better than 0.7.
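
For reference, temperature 0 corresponds to greedy decoding. With the transformers library that would look roughly like this (a sketch only; the model ID and max_new_tokens below are illustrative assumptions):

# Sketch: greedy decoding (the equivalent of temperature 0) with transformers.
# The model ID and max_new_tokens are illustrative, not from this thread.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "berkeley-nest/Starling-LM-7B-alpha"  # assumed model ID
tok = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

prompt = "GPT4 Correct User: Hello!<|end_of_turn|>GPT4 Correct Assistant:"
inputs = tok(prompt, return_tensors="pt")

out = model.generate(**inputs, max_new_tokens=256, do_sample=False)  # greedy = temperature 0
print(tok.decode(out[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))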

You may want to first try it a bit on chat.lmsys.org. If you observe the same issue there too, then it's an issue with the model itself.

But still, yes, the model does output unnecessary content in some cases. We're working on fixing it in the next version. Please stay tuned!

I tried with the suggested prompt format and had the same issue.

I think I'm running into a similar issue, and I've got the correct chat template; it keeps going and going without stopping.

image.png

Berkeley-Nest org

Please follow the default chat template. The model is not finetuned on any system template, so please keep the system message and prefix/suffix empty. For the user message suffix, please use <|end_of_turn|>GPT4 Correct Assistant:

Also, it seems that the model you used here was not updated with the correct tokenizer, so it prints <0x0A> rather than a newline.
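
A quick way to check your local tokenizer is to round-trip a string containing a newline and make sure the literal <0x0A> token doesn't show up in the decoded text (a minimal sketch; the model ID is assumed):

# Sketch: verify the tokenizer round-trips "\n" instead of printing <0x0A>.
# The model ID below is an assumption.
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("berkeley-nest/Starling-LM-7B-alpha")
ids = tok.encode("line one\nline two", add_special_tokens=False)
decoded = tok.decode(ids)
print(decoded)               # should print two lines
print("<0x0A>" in decoded)   # should be False with the updated tokenizer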

Here is what I get with the same prompt + correct template from lmsys:

lmsys.jpg

@frenzygr I have the same issue in LM Studio, but it works fine in KoboldCpp. BTW, you can see LM Studio detecting the model as a Starcoder model, so GPU won't be supported that way.

> @frenzygr I have the same issue in LM Studio, but it works fine in KoboldCpp. BTW, you can see LM Studio detecting the model as a Starcoder model, so GPU won't be supported that way.

Does it detect it as a Starcoder model for you too, or is there a setting I can change?

Tried it in KoboldCpp, but unless I limit the generation tokens it still goes on and on until it reaches the limit.

> Please follow the default chat template. The model is not finetuned on any system template, so please keep the system message and prefix/suffix empty. For the user message suffix, please use <|end_of_turn|>GPT4 Correct Assistant:
>
> Also, it seems that the model you used here was not updated with the correct tokenizer, so it prints <0x0A> rather than a newline.
>
> Here is what I get with the same prompt + correct template from lmsys:

I believe I have the correct template now, but it still keeps going indefinitely.

image.png

> Tried it in KoboldCpp, but unless I limit the generation tokens it still goes on and on until it reaches the limit.

I just tried it in LM Studio and got the same issue as you. The problem is not the prompt format; it's certainly from LM Studio. I tested it in KoboldCpp and it worked fine when I gave it a large output context.
Also, verify that your model is up to date: check whether the SHA256 hash matches the one here, and if it doesn't, redownload the model.

ge.png
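
If you want to compute the hash locally, something like this works (a minimal sketch; the filename below is a placeholder, not the exact file from this thread):

# Sketch: compute the SHA256 of a downloaded model file so it can be
# compared against the hash listed on the model page.
import hashlib

def sha256sum(path: str, chunk_size: int = 1 << 20) -> str:
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

print(sha256sum("starling-lm-7b-alpha.Q4_K_M.gguf"))  # placeholder filename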

> Does it detect it as a Starcoder model for you too, or is there a setting I can change?

It's detecting the "star" in the model name and folder; try changing any letter and see.

I still get the same behaviour. Can you please paste a couple of small screenshots of all your settings, both before you launch the Kobold app and for the model itself?

I appreciate it!

image.png

I load it with the default configuration, then change one thing: Context Size to 4096.
After it loads, I use the Default preset, and in Start Seq put "<|end_of_turn|>GPT4 User:" and in End Seq "<|end_of_turn|>GPT4 Assistant:", without the quotes.

Did you make sure you have the updated model?
Untitldeed.png

> I load it with the default configuration, then change one thing: Context Size to 4096.
> After it loads, I use the Default preset, and in Start Seq put "<|end_of_turn|>GPT4 User:" and in End Seq "<|end_of_turn|>GPT4 Assistant:", without the quotes.
>
> Did you make sure you have the updated model?

Thanks! Got it working in KoboldCpp, but I still face the same issue in LM Studio. I'll mess around with it a bit more.

EDIT: Got it working in LM Studio as well!

EDIT 2: Never mind, it went back to spewing entire walls of text without me changing any parameters.
