---
library_name: peft
base_model: mistralai/Mixtral-8x7B-v0.1
license: apache-2.0
datasets:
- knowrohit07/know_sql
language:
- en
---

## SQL-Converter Mixtral 8x7B v0.1

**Convert Natural Language to SQL**

### Overview

Mixtral-8x7B-sql-ft-v1 is a PEFT adapter fine-tuned from Mixtral 8x7B v0.1 to convert natural language questions into SQL queries.

### Base Model

mistralai/Mixtral-8x7B-v0.1

### Fine-Tuning

- **Dataset**: 5,000 natural language-SQL pairs from knowrohit07/know_sql.

### Usage

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import PeftModel

base_model_id = "mistralai/Mixtral-8x7B-v0.1"
adapter_id = "sharadsin/Mixtral-8x7B-sql-ft-v1"

# Load the base model in 4-bit (NF4) with double quantization to reduce memory usage.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_quant_type="nf4",
)

base_model = AutoModelForCausalLM.from_pretrained(
    base_model_id,
    quantization_config=bnb_config,
    device_map="auto",
    trust_remote_code=True,
)

tokenizer = AutoTokenizer.from_pretrained(
    base_model_id,
    add_bos_token=True,
    trust_remote_code=True,
)

# Attach the fine-tuned SQL adapter to the quantized base model.
ft_model = PeftModel.from_pretrained(base_model, adapter_id)

eval_prompt = """SYSTEM: Use the following contextual information to concisely answer the question.
USER: CREATE TABLE EmployeeInfo (EmpID INTEGER, EmpFname VARCHAR, EmpLname VARCHAR, Department VARCHAR, Project VARCHAR, Address VARCHAR, DOB DATE, Gender CHAR)
=== Write a query to fetch details of employees whose EmpLname ends with an alphabet 'A' and contains five alphabets?
ASSISTANT:"""

model_input = tokenizer(eval_prompt, return_tensors="pt").to("cuda")

ft_model.eval()
with torch.inference_mode():
    output_ids = ft_model.generate(
        **model_input,
        max_new_tokens=70,
        top_k=4,
        penalty_alpha=0.6,       # contrastive search
        repetition_penalty=1.15,
    )
    print(tokenizer.decode(output_ids[0], skip_special_tokens=False))
```

### Limitations

- Less accurate on very complex queries.
- May generate extra, irrelevant text after the SQL answer; see the post-processing sketch at the end of this card.

### Framework versions

- PEFT 0.7.1
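
### Post-processing the output

Because the model can continue generating past the SQL statement, it can help to trim the decoded text down to just the query. The snippet below is a minimal sketch of one way to do this; the `extract_sql` helper, and the assumption that the answer follows the final `ASSISTANT:` marker and ends at the first semicolon, are illustrative and not part of the released adapter.

```python
def extract_sql(decoded_text: str) -> str:
    """Illustrative helper (assumption, not part of the adapter):
    keep only the text after the last 'ASSISTANT:' marker and,
    if a semicolon is present, cut the answer off there."""
    answer = decoded_text.split("ASSISTANT:")[-1].strip()
    if ";" in answer:
        # Drop anything the model appended after the terminating semicolon.
        answer = answer.split(";")[0] + ";"
    return answer


# Example usage with the decoded output from the Usage section above:
# sql = extract_sql(tokenizer.decode(output_ids[0], skip_special_tokens=True))
# print(sql)
```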