suyash2739
/

English_to_Hinglish_cmu_hinglish_dog

text-generation-inference

Inference Endpoints

Model card Files Files and versions Community

suyash2739 commited on May 24

Commit

33b8133

•

1 Parent(s): f66203b

Update README.md

Files changed (1) hide show

README.md +61 -0

README.md CHANGED Viewed

@@ -1,6 +1,7 @@
 ---
 language:
 - en
 license: apache-2.0
 tags:
 - text-generation-inference
@@ -20,6 +21,66 @@ datasets:
 ![image/png](https://cdn-uploads.huggingface.co/production/uploads/65187b234965add2b08b2990/6VsNF_rgDjXlubd4x8dMk.png)
 # Uploaded  model
 - **Developed by:** suyash2739

 ---
 language:
 - en
+- hi
 license: apache-2.0
 tags:
 - text-generation-inference
 ![image/png](https://cdn-uploads.huggingface.co/production/uploads/65187b234965add2b08b2990/6VsNF_rgDjXlubd4x8dMk.png)
+# Colab Files:
+- Model_Use.ipynb file to use the model
+- Hinglish_train_lamma_3_8b_instruct_2_epoch.ipynb to see how the model is trained
+# Inference:
+```
+!pip install "unsloth[colab-new] @ git+https://github.com/unslothai/unsloth.git"
+!pip install --no-deps xformers trl peft accelerate bitsandbytes
+```
+```python
+from unsloth import FastLanguageModel
+import torch
+max_seq_length = 2048
+dtype = None # None for auto detection. Float16 for Tesla T4, V100, Bfloat16 for Ampere+
+load_in_4bit = True # Use 4bit quantization to reduce memory usage. Can be False.
+model, tokenizer = FastLanguageModel.from_pretrained(
+    model_name = "suyash2739/English_to_Hinglish_lamma_3_8b_instruct",
+    max_seq_length = max_seq_length,
+    dtype = dtype,
+    load_in_4bit = load_in_4bit,
+)
+```
+```python
+prompt = """Translate the input from English to Hinglish to give the response.
+### Input:
+{}
+### Response:
+{}"""
+```
+```python
+inputs = tokenizer(
+[
+  prompt.format(
+        """This is a fine-tuned Hinglish translation model using Llama 3.""", # input
+        "", # output - leave this blank for generation!
+    )
+], return_tensors = "pt").to("cuda")
+from transformers import TextStreamer
+text_streamer = TextStreamer(tokenizer)
+```
+```python
+_ = model.generate(**inputs, streamer = text_streamer, max_new_tokens = 2048)
+## ye ek fine-tuned Hinglish translation model hai jisaka use Llama 3 kiya hai.
+```
 # Uploaded  model
 - **Developed by:** suyash2739