ruslanmv/Meta-Llama-3.1-8B-Text-to-SQL-GPTQ

+---
+language:
+- en
+- it
+license: apache-2.0
+tags:
+- text-generation-inference
+- transformers
+- ruslanmv
+- llama
+- trl
+- sft
+---
+# Meta-Llama 3.1 8B Text-to-SQL GPTQ Model
+This repository provides a GPTQ-quantized (4-bit), 8-billion-parameter Meta-Llama 3.1 model fine-tuned for text-to-SQL tasks. Quantization reduces the memory footprint and makes inference more efficient. Below you'll find instructions to install the dependencies, load the model, and generate SQL queries.
+## Model Details
+- **Model Size**: 8B
+- **Quantization**: GPTQ (4-bit)
+- **Languages Supported**: English, Italian
+- **Task**: Text-to-SQL generation
+- **License**: Apache 2.0
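As a rough illustration of what 4-bit quantization means for an 8B model, the sketch below estimates weight storage from the parameter count and bit width listed above. It is a back-of-the-envelope calculation, not a measurement of this repository's files: the helper name `estimate_weight_gib` is hypothetical, "8B" is treated as exactly 8e9 parameters, and quantization metadata (scales, zero points) and any layers kept in higher precision are ignored, so real checkpoints are typically somewhat larger.

```python
def estimate_weight_gib(num_params: float, bits_per_weight: int) -> float:
    """Approximate weight storage in GiB, ignoring quantization metadata."""
    total_bytes = num_params * bits_per_weight / 8
    return total_bytes / (1024 ** 3)


# Assumption: a nominal 8e9 parameters for the "8B" model.
fp16_gib = estimate_weight_gib(8e9, 16)
gptq_gib = estimate_weight_gib(8e9, 4)

print(f"FP16 weights:  ~{fp16_gib:.1f} GiB")   # ~14.9 GiB
print(f"4-bit weights: ~{gptq_gib:.1f} GiB")   # ~3.7 GiB
```

Under these assumptions, the 4-bit weights need about a quarter of the FP16 storage.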
+## Installation Requirements
+Before using the model, ensure that you have the following dependencies installed. We recommend using the same versions to avoid any compatibility issues.
+```bash
+# In a Jupyter or Colab notebook, prefix each command with "!"
+# Install the required PyTorch version with CUDA support (ensure CUDA 12.1 is installed)
+pip install torch==2.2.1 torchvision==0.17.1 torchaudio==2.2.1 --index-url https://download.pytorch.org/whl/cu121
+# Install AutoGPTQ for quantized model handling
+pip install auto-gptq --no-build-isolation
+# Install Optimum for model optimization
+pip install optimum
+```
+After installing the dependencies, restart your instance (for example, the notebook runtime or kernel) so that the newly installed packages are picked up.
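To confirm which versions ended up in the environment, a small check along the following lines may help. It is a minimal sketch using only the standard library; the helper name `installed_versions` is hypothetical, and the package list simply mirrors the install commands above plus `transformers`, which the loading code imports.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages):
    """Return a {package: version or None} mapping for the given distribution names."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            # The distribution is not installed in this environment.
            found[name] = None
    return found


report = installed_versions(["torch", "auto-gptq", "optimum", "transformers"])
for name, ver in report.items():
    print(f"{name}: {ver or 'not installed'}")
```

If `torch` does not report `2.2.1` (possibly with a `+cu121` suffix), the environment differs from the one recommended above.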
+## Loading the Model
+To load the quantized Meta-Llama 3.1 model and use it for text-to-SQL tasks, use the following Python code:
+```python
+from transformers import AutoTokenizer, pipeline
+from auto_gptq import AutoGPTQForCausalLM
+import torch
+# Define the Alpaca-style prompt template
+alpaca_prompt = """Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.
+### Instruction:
+{}
+### Input:
+{}
+### Response:
+"""
+# Model directory and tokenizer
+quantized_model_dir = "meta-llama-8b-quantized-4bit"  # Path where quantized model is saved
+tokenizer = AutoTokenizer.from_pretrained(quantized_model_dir)
+# Load the quantized model
+model = AutoGPTQForCausalLM.from_quantized(
+    quantized_model_dir,
+    device_map="auto",  # Automatically map the model to the available device (GPU or CPU)
+    torch_dtype=torch.float16,  # Ensure FP16 for efficiency
+    use_safetensors=True  # If you saved the model using safetensors format, set this to True
+)
+# Set up the text generation pipeline; device placement is already handled by device_map.
+# The result is stored under a new name so it does not shadow the imported pipeline() function.
+text_generator = pipeline(
+    "text-generation",
+    model=model,
+    tokenizer=tokenizer
+)
+# Function to generate SQL query from input text using the Alpaca prompt
+def generate_sql(input_text):
+    # Format the prompt
+    prompt = alpaca_prompt.format(
+        "Provide the SQL query",
+        input_text
+    )
+    # Generate the response; max_new_tokens limits only the generated tokens, not the prompt
+    generated_text = text_generator(
+        prompt,
+        max_new_tokens=200,
+        eos_token_id=tokenizer.eos_token_id
+    )[0]["generated_text"]
+    # Clean the output by removing the prompt and any extra newlines
+    cleaned_output = generated_text.replace(prompt, '').strip()
+    return cleaned_output
+# Example usage
+italian_input = "Seleziona tutte le colonne della tabella table1 dove la colonna anni è uguale a 2020"
+sql_query = generate_sql(italian_input)
+print(sql_query)
+```
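The `generate_sql` function above removes the prompt with a plain string replacement. If the decoded text does not reproduce the prompt character for character, that replacement leaves the prompt in place. The sketch below shows one alternative: split on the `### Response:` marker from the Alpaca template and keep only the first statement. The helper name `extract_response` is hypothetical, and cutting at the first semicolon is a heuristic assumption that would mis-handle semicolons inside string literals.

```python
RESPONSE_MARKER = "### Response:"


def extract_response(generated_text: str) -> str:
    """Return the text after the response marker, cut at the first semicolon."""
    _, found, tail = generated_text.partition(RESPONSE_MARKER)
    # Fall back to the full text if the marker is missing.
    answer = tail if found else generated_text
    # Heuristic: keep only the first SQL statement, if one is terminated.
    head, semicolon, _ = answer.partition(";")
    return (head + semicolon).strip()


sample_output = (
    "### Instruction:\nProvide the SQL query\n"
    "### Input:\nSelect all columns from table1 where anni equals 2020\n"
    "### Response:\nSELECT * FROM table1 WHERE anni = 2020;\n\nextra text"
)
print(extract_response(sample_output))
```

Here `sample_output` is a hand-written stand-in for model output, not a recorded generation.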
+## Example Usage
+The example script shows how to generate SQL queries from natural language text. Simply provide a request in Italian or English, and the model will generate an appropriate SQL query.
+Example input:
+```python
+italian_input = "Seleziona tutte le colonne della tabella table1 dove la colonna anni è uguale a 2020"
+sql_query = generate_sql(italian_input)
+print(sql_query)
+```
+Example output:
+```sql
+SELECT * FROM table1 WHERE anni = 2020;
+```
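Generated queries are not guaranteed to be valid, so it can be worth checking them before running them against real data. The following sketch compiles a query against an in-memory SQLite database using `EXPLAIN`, which prepares the statement without executing it. The helper name `is_valid_sql` and the schema for `table1` are assumptions made for illustration, and SQLite's dialect may differ from that of the target database.

```python
import sqlite3

# Hypothetical schema matching the example query above.
SCHEMA = "CREATE TABLE table1 (id INTEGER, anni INTEGER);"


def is_valid_sql(query: str, schema_ddl: str) -> bool:
    """Return True if the query compiles against the given schema in SQLite."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.executescript(schema_ddl)
        # EXPLAIN prepares the statement, surfacing syntax errors and
        # unknown tables or columns, without running the query itself.
        conn.execute("EXPLAIN " + query)
        return True
    except (sqlite3.Error, sqlite3.Warning):
        return False
    finally:
        conn.close()


print(is_valid_sql("SELECT * FROM table1 WHERE anni = 2020;", SCHEMA))
print(is_valid_sql("SELECT * FROM missing_table;", SCHEMA))
```

A check like this catches syntax errors and references to unknown tables or columns, but it does not confirm that the query answers the original question.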
+## Model Tags
+- **text-generation-inference**
+- **transformers**
+- **llama**
+- **trl**
+- **sft**
+## License
+This model is released under the [Apache License 2.0](LICENSE).