Inference:

!pip install -q "unsloth[colab-new] @ git+https://github.com/unslothai/unsloth.git"
!pip install -q --no-deps "xformers<0.0.26" trl peft accelerate bitsandbytes

from unsloth import FastLanguageModel
import torch

max_seq_length = 512  # Maximum sequence length the model is loaded with
dtype = None  # None auto-detects; use float16 on Tesla T4/V100, bfloat16 on Ampere or newer
load_in_4bit = True  # Load with 4-bit quantization to reduce memory use; can be False

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "Hinglish-Project/llama-3-8b-English-to-Hinglish",
    max_seq_length = max_seq_length,
    dtype = dtype,
    load_in_4bit = load_in_4bit,
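)
FastLanguageModel.for_inference(model)  # Enable Unsloth's optimized inference mode


def pipe(prompt):
  """Translate English text to Hinglish using the prompt template below."""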
  alpaca_prompt = """### Instrucion: Translate given text to Hinglish Text:

### Input:
{}

### Response:
"""

  inputs = tokenizer(
      [
          alpaca_prompt.format(prompt),
      ], return_tensors = "pt").to("cuda")

  outputs = model.generate(**inputs, max_new_tokens = 2048, use_cache = True)
  raw_text = tokenizer.batch_decode(outputs)[0]
  return raw_text.split("### Response:\n")[1].split("<|end_of_text|>")[0]

text = "This is a fine-tuned Hinglish translation model using Llama 3."
pipe(text)
# Output: yeh ek fine-tuned Hinglish translation model hai jisme Llama 3 ka use kiya gaya hai.
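
The prompt construction and response parsing inside `pipe` can be tried without a GPU or the model weights. The following is a minimal sketch, not part of the original model card: the helper names `build_prompt` and `extract_response` are hypothetical, and the template is copied verbatim from the snippet above, spelling included, on the assumption that the model was fine-tuned with exactly this text.

```python
# Prompt template copied verbatim from the inference snippet (assumed to match training).
ALPACA_PROMPT = """### Instrucion: Translate given text to Hinglish Text:

### Input:
{}

### Response:
"""

RESPONSE_MARKER = "### Response:\n"
END_TOKEN = "<|end_of_text|>"


def build_prompt(text: str) -> str:
    """Insert the English source text into the prompt template."""
    return ALPACA_PROMPT.format(text)


def extract_response(raw_text: str) -> str:
    """Return the text between the response marker and the end-of-text token."""
    # partition never raises; `found` is empty when the marker is missing.
    _, found, tail = raw_text.partition(RESPONSE_MARKER)
    if not found:
        raise ValueError("response marker not found in decoded output")
    # Keep everything before the first end token, if one is present.
    return tail.split(END_TOKEN, 1)[0].strip()
```

Unlike the one-line `split(...)[1]` in `pipe`, this version raises a descriptive error when the marker is missing and still returns text when generation stops before the end-of-text token is emitted.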

Uploaded model

  • Developed by: Hinglish-Project
  • License: apache-2.0
  • Finetuned from model: unsloth/llama-3-8b-bnb-4bit

This Llama 3 model was trained 2x faster with Unsloth and Hugging Face's TRL library.

  • Format: Safetensors
  • Model size: 8.03B params
  • Tensor type: FP16
