Any chat/text generation model with Slovak language?

#1
by MrDevolver - opened

Hey, I see you're toying with the Slovak language. I understand it's a bit of a niche, but do you happen to have a text generation / chat model for it? I could definitely use one. I'm looking for an AI assistant with Slovak language support in ggml format, and there doesn't seem to be one available around here, or at least I haven't found one. There are some models that seem to include limited data from this language, but it's usually mixed with other languages, often producing incomprehensible results.

Good Q, I can make one, if you'd like. I haven't so far because of the lackluster response to my Slovak BERT model.

That would be awesome if you could, thank you! I'm just a user, not a creator in this particular field, so I depend on the creations of others. As for this particular BERT model, if I may ask, what exactly is it good for? You know, not all users here are also creators; some of us don't really know what's what and are still exploring and figuring things out. 🙂

So SlovenBERTcina is a pre-trained Slovak/English language model that will be best at Natural Language Understanding (NLU) tasks like classification or sentiment analysis. You can use it in a limited sense for things like translation or generation as well, but because it's only pre-trained, you'll have to gather some data showcasing the task you want to perform and fine-tune the model on that data before it will do the task reliably.

For text generation/chat, I'll likely use a decoder-only or a full (encoder-decoder) transformer pre-trained on the exact same dataset. They'll need to be fine-tuned as well on whatever task you want them to perform, but they'll be better suited to those tasks.

If you want to use SlovenBERTcina for text generation, just take the start of your text and add a
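A masked language model can extend a text one token at a time by appending its mask token and keeping the top prediction. Here is a minimal sketch of that idea. It is an assumption about the general technique, not a confirmed description of how SlovenBERTcina is meant to be used. The names `generate_with_mask` and `fake_fill_mask` are hypothetical, and the `"<mask>"` default is a placeholder (the real token depends on the tokenizer). The predictor is assumed to return a list of dicts with a `token_str` key, which is the shape the Hugging Face `transformers` fill-mask pipeline returns for a single masked input.

```python
from typing import Callable, Dict, List

Prediction = Dict[str, object]


def generate_with_mask(
    prompt: str,
    fill_mask: Callable[[str], List[Prediction]],
    mask_token: str = "<mask>",  # placeholder; use the tokenizer's own mask token
    max_new_tokens: int = 5,
) -> str:
    """Greedily extend `prompt` by repeatedly filling a mask appended to its end."""
    text = prompt.rstrip()
    for _ in range(max_new_tokens):
        # Ask the predictor what should replace the trailing mask token.
        predictions = fill_mask(f"{text} {mask_token}")
        if not predictions:
            break
        # Keep only the top-ranked candidate (greedy decoding).
        token = str(predictions[0]["token_str"]).strip()
        if not token:
            break
        # Simplification: join with a space; real subword tokens may need merging.
        text = f"{text} {token}"
    return text


def fake_fill_mask(text: str) -> List[Prediction]:
    """Stand-in predictor so the sketch runs without downloading a model."""
    words = text.split()
    return [{"token_str": f"w{len(words)}", "score": 1.0}]


if __name__ == "__main__":
    print(generate_with_mask("Dnes je", fake_fill_mask, max_new_tokens=2))
```

With a real model, the `fill_mask` argument could be a `transformers.pipeline("fill-mask", model=...)` object, with `mask_token` taken from its `tokenizer.mask_token`. Expect the output to be much rougher than what a decoder-style model produces, since a masked model is not trained for left-to-right generation.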

Thanks for the explanation and the tip. I would need a ggml model for my purpose, though: I need to use it with the GPT4All program, which requires ggml models. GPT4All offers an easy way to feed the AI data from documents such as text or PDF files, so it can search through them and provide the information I'm looking for interactively, just like a real assistant would. Since most of the documents I would need to work with are written in Slovak, an AI model trained on Slovak would be a great help. Another reason I need a Slovak model is that this is mostly for a person who doesn't speak English that well. 😀
