---
license: apache-2.0
datasets:
- Mayank082000/Multilingual_Sentences_with_Sentences
language:
- en
- hi
- pa
library_name: adapter-transformers
pipeline_tag: text-generation
tags:
- job-search
- skill-development
- foreign-counseling
---

# Fine-Tuned Llama 2 Model for Multilingual Text Generation

This repository contains adapters for the `adapter-transformers` library, aimed at enabling multilingual text generation. It leverages datasets such as `siddeo99/sidtestfiverrmulti` and supports multiple languages, including English, Hindi, and Punjabi.
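
As a quick sanity check, the dataset declared in this card's metadata can be inspected with the `datasets` library. A minimal sketch, assuming the dataset exposes a `train` split:

```python
from datasets import load_dataset

# Load the multilingual dataset named in this card's metadata
# (the `train` split name is an assumption; adjust if the dataset differs)
dataset = load_dataset("Mayank082000/Multilingual_Sentences_with_Sentences", split="train")
print(dataset)      # column names and example count
print(dataset[0])   # peek at the first example
```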

## Installation

To install the necessary libraries, use pip:
```bash
pip install -q accelerate==0.21.0 peft==0.4.0 bitsandbytes==0.40.2 transformers==4.31.0 trl==0.4.7
pip install pyarrow
```
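
To confirm the pinned versions installed correctly, and that a GPU is visible for the GPU path below, a quick check:

```python
import torch
import transformers
import peft

# Print installed versions and CUDA visibility
print(transformers.__version__, peft.__version__)
print("CUDA available:", torch.cuda.is_available())
```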

# This section is for GPU-enabled devices

If you only have a CPU, skip this section and use the CPU code further below.

## Import libraries

```python
import os
import torch
from datasets import load_dataset
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    BitsAndBytesConfig,
    HfArgumentParser,
    TrainingArguments,
    pipeline,
)
from peft import LoraConfig, PeftModel
```

## Configuration Parameters

```python
# The base model to load from the Hugging Face hub
model_name = "siddeo99/job_search_category"

################################################################################
# QLoRA parameters
################################################################################

# LoRA attention dimension
lora_r = 64

# Alpha parameter for LoRA scaling
lora_alpha = 16

# Dropout probability for LoRA layers
lora_dropout = 0.1

################################################################################
# bitsandbytes parameters
################################################################################

# Activate 4-bit precision base model loading
use_4bit = True

# Compute dtype for 4-bit base models
bnb_4bit_compute_dtype = "float16"

# Quantization type (fp4 or nf4)
bnb_4bit_quant_type = "nf4"

# Activate nested quantization for 4-bit base models (double quantization)
use_nested_quant = False

# Load the entire model on GPU 0
device_map = {"": 0}
```
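
For reference, PEFT's LoRA implementation scales the adapter update by `lora_alpha / r`, so the values above give a scaling factor of 16 / 64 = 0.25:

```python
# Effective LoRA scaling factor applied to the adapter update (peft convention)
scaling = lora_alpha / lora_r  # 16 / 64 = 0.25
```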

### Loading Configuration

```python
# Load tokenizer and model with QLoRA configuration
compute_dtype = getattr(torch, bnb_4bit_compute_dtype)

bnb_config = BitsAndBytesConfig(
    load_in_4bit=use_4bit,
    bnb_4bit_quant_type=bnb_4bit_quant_type,
    bnb_4bit_compute_dtype=compute_dtype,
    bnb_4bit_use_double_quant=use_nested_quant,
)

# Check GPU compatibility with bfloat16
if compute_dtype == torch.float16 and use_4bit:
    major, _ = torch.cuda.get_device_capability()
    if major >= 8:
        print("=" * 80)
        print("Your GPU supports bfloat16: accelerate training with bf16=True")
        print("=" * 80)

# Load base model
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    quantization_config=bnb_config,
    device_map=device_map,
)
model.config.use_cache = False
model.config.pretraining_tp = 1

# Load LLaMA tokenizer
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
tokenizer.pad_token = tokenizer.eos_token
tokenizer.padding_side = "right"  # Fix weird overflow issue with fp16 training

# Load LoRA configuration (used for fine-tuning; not required for plain inference)
peft_config = LoraConfig(
    lora_alpha=lora_alpha,
    lora_dropout=lora_dropout,
    r=lora_r,
    bias="none",
    task_type="CAUSAL_LM",
)
```
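
Note that the snippet above loads only the base model; `peft_config` is defined but never applied. If the adapter weights from this repository should be attached for inference, a minimal sketch (the repo id below is a placeholder, not confirmed by the card):

```python
from peft import PeftModel

# Hypothetical adapter path: replace with the repo id that actually hosts the
# LoRA adapter weights (this model card's own repository, if stored here)
adapter_repo = "Mayank082000/your-adapter-repo"

# Wrap the quantized base model with the LoRA adapter weights
model = PeftModel.from_pretrained(model, adapter_repo)
```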

## Text Generation

```python
# Run the text-generation pipeline with the Llama 2 [INST] prompt template
prompt = "What is a large language model?"
pipe = pipeline(task="text-generation", model=model, tokenizer=tokenizer, max_length=200)
result = pipe(f"<s>[INST] {prompt} [/INST]")
print(result[0]['generated_text'])
```
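
The pipeline output includes the prompt echoed inside the `[INST]` template. To print only the model's reply, you can split on the closing tag:

```python
# Strip the echoed prompt: keep only the text after the closing [/INST] tag
full_text = result[0]['generated_text']
answer = full_text.split("[/INST]", 1)[-1].strip()
print(answer)
```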

# This section is for CPU only (slower than GPU)

## Import libraries

```python
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    pipeline,
)
from peft import LoraConfig
```
## Run the model on CPU

```python
# These values match the Configuration Parameters section above;
# redefined here so the CPU path runs standalone.
model_name = "siddeo99/job_search_category"
lora_r = 64
lora_alpha = 16
lora_dropout = 0.1

model = AutoModelForCausalLM.from_pretrained(model_name)

model.config.use_cache = False
model.config.pretraining_tp = 1
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
tokenizer.pad_token = tokenizer.eos_token
tokenizer.padding_side = "right"  # Fix weird overflow issue with fp16 training

# Load LoRA configuration (used for fine-tuning; not required for plain inference)
peft_config = LoraConfig(
    lora_alpha=lora_alpha,
    lora_dropout=lora_dropout,
    r=lora_r,
    bias="none",
    task_type="CAUSAL_LM",
)

# Run the text-generation pipeline
# Hindi prompt: "What are the requirements for applying for a work visa
# to Australia from India?"
prompt = "भारत से ऑस्ट्रेलिया में एक कार्य वीजा के लिए आवेदन करने के लिए क्या आवश्यकताएं हैं?"
pipe = pipeline(task="text-generation", model=model, tokenizer=tokenizer, max_length=200)
result = pipe(f"<s>[INST] {prompt} [/INST]")
print(result[0]['generated_text'])
```
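
CPU generation can be slow. One knob worth experimenting with is PyTorch's intra-op thread count; the value below is illustrative, not a recommendation from this repository:

```python
import torch

# Illustrative thread count; tune to your machine's physical core count
torch.set_num_threads(8)
```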