iashchak/ruGPT-3.5-13B-ggml · Can't load in LM Studio

netandreus

Sep 18, 2023

Trying to load it in LM Studio version 0.2.6, without success :(

WaveCut

Sep 18, 2023

•

edited Sep 18, 2023

Seconding this. @iashchak, please help!

iashchak

Owner Sep 18, 2023

•

edited Sep 18, 2023

Hmm, looks suspicious :(
I've been trying for several days to upload a few models in ggml format in addition to ggjt, but it isn't working out.
I think I'll get them uploaded within a few hours. And I'll try this LM Studio (although I'm not really sure what it is or what it's for).

iashchak

Owner Sep 18, 2023

•

edited Sep 18, 2023

iashchak changed discussion status to closed Sep 18, 2023

iashchak changed discussion status to open Sep 18, 2023

WaveCut

Sep 18, 2023

It's a nice and really convenient GUI for Mac and Windows that runs on llama.cpp. The latter recently changed its model format to .gguf; judging by the file extension, I suspect these are now memory-mappable models, so that not everything has to be loaded at once. It would be nice to have that format here too. The converter is in the former's repo.
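One way to see which container format a given .bin file actually uses is to read its first bytes. The sketch below is only an illustration: the magic values are the ones used in the ggml and llama.cpp sources as far as I know, and the helper names `detect_format` and `detect_file` are hypothetical.

```python
import struct

# Legacy magics are stored as a little-endian uint32 at the start of the file.
# These values are assumptions taken from the ggml/llama.cpp sources.
LEGACY_MAGICS = {
    0x67676D6C: "ggml (unversioned legacy)",
    0x67676D66: "ggmf (versioned legacy)",
    0x67676A74: "ggjt (mmap-friendly legacy)",
}


def detect_format(header: bytes) -> str:
    """Guess the container format from the first 8 bytes of a model file."""
    if len(header) < 4:
        return "unknown (file too short)"
    if header[:4] == b"GGUF":
        # GGUF stores a little-endian uint32 version right after the magic.
        if len(header) >= 8:
            (version,) = struct.unpack("<I", header[4:8])
            return f"gguf v{version}"
        return "gguf"
    (magic,) = struct.unpack("<I", header[:4])
    return LEGACY_MAGICS.get(magic, "unknown")


def detect_file(path: str) -> str:
    """Read the header of the file at `path` and classify it."""
    with open(path, "rb") as f:
        return detect_format(f.read(8))
```

Calling `detect_file("ruGPT-3.5-13B-q4_0-ggjt.bin")` on a downloaded file should then tell you whether a GGUF-only loader has any chance of opening it.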

iashchak

Owner Sep 18, 2023

@WaveCut I tried to convert to gguf a couple of days ago, but it became too tedious to manually redefine all the layers (since there's no converter for gpt2 in the official ggml repo yet), and gguf support isn't even enabled in rustformers yet.

iashchak

Owner Sep 18, 2023

Well, since LM Studio looks proprietary, I'd suggest switching to https://github.com/juliooa/secondbrain, for example, or any other OSS tool (especially one based on rustformers, for compatibility).

netandreus

Sep 18, 2023

•

edited Sep 18, 2023

I tried opening it in SecondBrain, and it fails with errors too :(

@iashchak Could you maybe look into fixing it so that it runs in LM Studio? Pretty please :-)

iashchak

Owner Sep 18, 2023

•

edited Sep 18, 2023

@netandreus I don't have a sandbox where I'd be willing to run arbitrary code; I don't really trust these guys.
But as soon as I can convert the model to gguf, I'll definitely upload it right away :-)
Actually, I have a lot of converted models lying around right now, but for some reason I'm having trouble uploading them to a single repo.

iashchak

Owner Sep 18, 2023

•

edited Sep 18, 2023

I've queued the following models for upload:

Size (GB)	File Name	Format	Quantization	Version
24.45	ruGPT-3.5-13B-f16.bin	GGML	NA	NA
6.91	ruGPT-3.5-13B-q4_0.bin	GGML	Q4	0
6.91	ruGPT-3.5-13B-q4_0-ggjt.bin	GGJT	Q4	0
7.67	ruGPT-3.5-13B-q4_1.bin	GGML	Q4	1
7.67	ruGPT-3.5-13B-q4_1-ggjt.bin	GGJT	Q4	1
8.44	ruGPT-3.5-13B-q5_0.bin	GGML	Q5	0
8.44	ruGPT-3.5-13B-q5_0-ggjt.bin	GGJT	Q5	0
9.20	ruGPT-3.5-13B-q5_1.bin	GGML	Q5	1
9.20	ruGPT-3.5-13B-q5_1-ggjt.bin	GGJT	Q5	1
13.01	ruGPT-3.5-13B-q8_0.bin	GGML	Q8	0
13.01	ruGPT-3.5-13B-q8_0-ggjt.bin	GGJT	Q8	0

In total, I think it will be pushed to this repository in a couple of hours.
I'll update the documentation a bit later + check how it works in different tools for the GGML format.

As soon as it becomes possible to make GGUF, I'll immediately add this format with different quantizations.

It would also be great to add an evaluation of how much the generation quality deteriorates depending on the quantization option, but I don't trust the classical tests yet.
In my use case, I'm running this model for dialogue generation; maybe there's a cool test specific to this task, I'll have to look into it...
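As a rough cross-check of the sizes in the table above, the effective bits per weight of each quantization can be estimated by comparing its file size with the f16 file. This is only a sketch: it assumes the file size is dominated by the weights and that every tensor is quantized the same way, and the function names are hypothetical.

```python
# Sizes in GB, copied from the table above.
F16_SIZE_GB = 24.45  # ruGPT-3.5-13B-f16.bin, 16 bits per weight

QUANTIZED_SIZES_GB = {
    "q4_0": 6.91,
    "q4_1": 7.67,
    "q5_0": 8.44,
    "q5_1": 9.20,
    "q8_0": 13.01,
}


def effective_bits_per_weight(size_gb: float, f16_size_gb: float = F16_SIZE_GB) -> float:
    """Estimate bits per weight by scaling the 16-bit baseline by the size ratio."""
    return 16.0 * size_gb / f16_size_gb


def summarize():
    """Return (name, bits per weight, size relative to f16) for each quantization."""
    rows = []
    for name, size_gb in QUANTIZED_SIZES_GB.items():
        bits = effective_bits_per_weight(size_gb)
        rows.append((name, round(bits, 2), round(size_gb / F16_SIZE_GB, 3)))
    return rows


if __name__ == "__main__":
    for name, bits, ratio in summarize():
        print(f"{name}: ~{bits} bits/weight, {ratio:.1%} of the f16 size")
```

The estimates come out near 4.5, 5.0, 5.5, 6.0 and 8.5 bits per weight. The values sit slightly above a plain 4, 5 or 8 bits, presumably because each block of weights also stores its own scale (and, for the `_1` variants, an offset).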

netandreus

Sep 19, 2023

•

edited Sep 19, 2023

Thanks a lot! We are looking forward to it!

dmiales

Oct 2, 2023

@iashchak Still no luck producing a GGUF? kobold and llama refuse to work with the ggml variants.

iashchak

Owner Oct 2, 2023

@dmiales
In short, the situation is as follows:
The repository contains files, including q4_1.bin and q4_0.bin, that definitely work with rustformers/llm and with some of the tools they recommend.
But these are, as far as I understand, the so-called ggml v2/v3 formats.
Next, I started writing a converter and inference for GPT-2-like models here: https://github.com/ggerganov/llama.cpp/pull/3407
But at the moment the code doesn't work yet.

In addition, I tried to get clarification from the core team about conversion here:
https://github.com/ggerganov/ggml/issues/220#issuecomment-1741606662

But I haven't received a response yet.

iashchak

Owner Oct 2, 2023

@dmiales By the way, you can also use ctransformers from Python :)
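A minimal sketch of what the ctransformers route might look like. It assumes the legacy ggml files load with `model_type="gpt2"`, which has not been verified in this thread; the helper names `load_kwargs` and `generate` are hypothetical, and the file name is one of those listed in the table earlier.

```python
def load_kwargs(model_file: str) -> dict:
    """Build keyword arguments for ctransformers' from_pretrained.

    model_type="gpt2" is an assumption: ruGPT-3.5 is GPT-2-like, but whether
    this particular file loads that way is untested here.
    """
    return {"model_file": model_file, "model_type": "gpt2"}


def generate(repo_or_dir: str, model_file: str, prompt: str, max_new_tokens: int = 64) -> str:
    """Load a ggml model through ctransformers and generate a continuation."""
    # Imported lazily so that this module can be imported without ctransformers installed.
    from ctransformers import AutoModelForCausalLM

    llm = AutoModelForCausalLM.from_pretrained(repo_or_dir, **load_kwargs(model_file))
    return llm(prompt, max_new_tokens=max_new_tokens)
```

For example, `generate("iashchak/ruGPT-3.5-13B-ggml", "ruGPT-3.5-13B-q4_0.bin", "Привет, ")` would download the q4_0 file and sample a short continuation, provided the assumption about the model type holds.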

dmiales

Oct 2, 2023

Thank you for your work!