Does not work

#2
by buddyroo30 - opened

I just tried this new 70b model on a machine with 8 A100s using the inference.py code (only changing the model name from the 34b one to sqlcoder-70b-alpha). The model loaded and executed, but it returned garbage output (along with some warnings about model weights); see below for an example. Am I doing something wrong? Thanks for your help; I really hope to get this working, as its performance looks very good.

python3 inference.py -q "list all sales persons"

Loading a model and generating a SQL query for answering your question...
Loading checkpoint shards: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████| 29/29 [00:30<00:00, 1.05s/it]
Some weights of LlamaForCausalLM were not initialized from the model checkpoint at defog/sqlcoder-70b-alpha and are newly initialized: ['lm_head.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
востомінemitुPackageVector Canada Lakãoprü□ didnEditText Venezuela_+ приняFre Venezuela приня приня приняемемем Miguelруosi Miguel vere отноrita Miguel Miguel Miguel dynamicrix duas Miguel duas duas duas Serialrix duas губер Müllerunsруugust duas shed codes duas topological topological topological topological topological [long run of repeated gibberish tokens truncated]

Are you hosting the model on Huggingface? I don't see an inference.py file in the files listed.

No, I just cloned the sqlcoder GitHub repo onto my GPU server, modified its inference.py script to use sqlcoder-70b-alpha, and ran it there; that produced the erroneous output in my first comment above. I'm not hosting or running anything here on Huggingface. I just want to get it working correctly on my own server.

And note that I have successfully run sqlcoder-34b-alpha on the same GPU server in the same way and got correct SQL output. So the issue seems to be with the model itself or with how I'm executing it.

Defog.ai org

Hi there, we’re looking into this — trying to replicate now on your setup. Thanks!

Defog.ai org

We were able to replicate this and can confirm it is an issue. One possible reason is that the weights somehow got corrupted during upload; there might also be CUDA compatibility issues. We will push a fix in the next few hours.

Thanks very much for reporting!

Defog.ai org

Hi there, we discovered a bizarre bug where the model's lm_head.weight was not included when the weights were uploaded to HF. This is causing many integrations to break, and the model currently hosted here produces gibberish results.

Fix coming soon – hopefully in the next hour

Defog.ai org

Fixed with a reupload of the model weights! Apologies for the issue.

You'll unfortunately have to re-download the model weights (run `rm -r ~/.cache/huggingface/hub/models--defog--sqlcoder-70b-alpha` first, since it's a directory). Please let me know if you run into other issues!

rishdotblog changed discussion status to closed

Awesome, thanks for the quick fix! I will test this out a little later today and post back here to confirm whether it works for me (or if there are still issues).

Okay, I just re-downloaded the weights and tried inference.py again, and it works fine now; I was able to generate correct SQL queries for a few English questions. Thanks again for the quick fix!

Defog.ai org

Fantastic, glad to hear that!
