---
language:
- en
- hi
license: apache-2.0
tags:
- hinglish
- translation
- english to hinglish
- language translation
- text-generation-inference
- transformers
- unsloth
- llama
- trl
- en to hi
- multilingual
- hindi codemix
- opensource
base_model: unsloth/llama-3-8b-Instruct-bnb-4bit
datasets:
- suyash2739/News_Hinglish_English
---

[!["Buy Me A Coffee"](https://www.buymeacoffee.com/assets/img/custom_images/orange_img.png)](https://buymeacoffee.com/suyash008)

# Dataset

I curated and built this dataset myself.

You can buy it [here](https://buymeacoffee.com/suyash008/e/268592).

![image/png](https://cdn-uploads.huggingface.co/production/uploads/65187b234965add2b08b2990/Qdr5bXsvsjPNF0DClmgus.png)

![image/png](https://cdn-uploads.huggingface.co/production/uploads/65187b234965add2b08b2990/MCp_zRz310ln004mnXKQh.png)




# My LinkedIn
- LinkedIn: https://www.linkedin.com/in/suyash-ag/
- GitHub: https://github.com/Suyash018

# Project - An English to Hinglish Language Translator
This project aims to develop a high-performance translation model that converts standard English into Hinglish (a blend of Hindi and English commonly used in informal communication in India).

# Loss Curve


![image/png](https://cdn-uploads.huggingface.co/production/uploads/65187b234965add2b08b2990/31vSqxldRSGEDNGwrJbFy.png)


# Inference / How to use the model:

```
!pip install "unsloth[colab-new] @ git+https://github.com/unslothai/unsloth.git"
!pip install --no-deps xformers trl peft accelerate bitsandbytes
```

```python
from unsloth import FastLanguageModel
import torch
max_seq_length = 2048
dtype = None # None for auto detection. Float16 for Tesla T4, V100, Bfloat16 for Ampere+
load_in_4bit = True # Use 4bit quantization to reduce memory usage. Can be False.

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "suyash2739/English_to_Hinglish_fintuned_lamma_3_8b_instruct",
    max_seq_length = max_seq_length,
    dtype = dtype,
    load_in_4bit = load_in_4bit,
)
```
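The `dtype` comment above maps GPU generation to numeric precision. As a rough sketch of that rule, assuming the cutoff is CUDA compute capability 8.0 (Ampere), the helper below returns the dtype name for a given capability. `pick_dtype_name` is a hypothetical helper for illustration and not part of Unsloth; leaving `dtype = None` lets the library choose on its own.

```python
def pick_dtype_name(major: int, minor: int = 0) -> str:
    """Return the dtype name suggested by the loading comment.

    Hypothetical helper for illustration; not part of Unsloth.
    Assumes Ampere and newer GPUs report compute capability >= 8.0.
    """
    # Tesla V100 is 7.0 and Tesla T4 is 7.5, so both fall back to float16.
    return "bfloat16" if (major, minor) >= (8, 0) else "float16"


if __name__ == "__main__":
    print(pick_dtype_name(7, 5))  # Tesla T4
    print(pick_dtype_name(8, 0))  # A100 (Ampere)
```

With PyTorch installed, `torch.cuda.get_device_capability()` returns the `(major, minor)` pair to pass in.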


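```python
def pipe(text):
    """Translate an English string to Hinglish with the loaded model."""
    # Prompt format used here; the model's answer follows "### Response:".
    prompt = """Translate the input from English to Hinglish to give the response.

### Input:
{}

### Response:
"""
    # Tokenize a single-item batch and move it to the GPU.
    inputs = tokenizer(
        [
            prompt.format(text),
        ], return_tensors = "pt").to("cuda")

    outputs = model.generate(**inputs, max_new_tokens = 2048, use_cache = True)
    raw_text = tokenizer.batch_decode(outputs)[0]
    # Keep only the text between the response marker and the end-of-turn token.
    return raw_text.split("### Response:\n")[1].split("<|eot_id|>")[0]
```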
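The prompt construction and the response extraction in `pipe` can be checked without a GPU or the model. The sketch below separates those two steps into plain string functions; `build_prompt` and `extract_response` are hypothetical names introduced here, and the decoded string in the example is hand-written, not real model output. Unlike `pipe`, this version returns an empty string when the response marker is missing instead of raising an `IndexError`.

```python
PROMPT_TEMPLATE = """Translate the input from English to Hinglish to give the response.

### Input:
{}

### Response:
"""

RESPONSE_MARKER = "### Response:\n"
EOT_TOKEN = "<|eot_id|>"


def build_prompt(text: str) -> str:
    """Fill the English input into the prompt template."""
    return PROMPT_TEMPLATE.format(text)


def extract_response(raw_text: str) -> str:
    """Return the text after the response marker, up to the end-of-turn token."""
    _, found, tail = raw_text.partition(RESPONSE_MARKER)
    if not found:
        # Marker missing: nothing to extract.
        return ""
    return tail.split(EOT_TOKEN)[0].strip()


if __name__ == "__main__":
    # Hand-written stand-in for a decoded generation (assumption, not model output).
    decoded = build_prompt("Hello") + "Namaste" + EOT_TOKEN
    print(extract_response(decoded))
```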

```python
text = "This is a fine-tuned Hinglish translation model using Llama 3." # INPUT
print(pipe(text))
## Yeh ek fine-tuned Hinglish translation model hai jo Llama 3 ka istemal karta hai.
```


# Comparison

- English
```python
English = """Finance Minister Nirmala Sitharaman said, "There used to be a poverty index...a human development index and all of them continue, but today what is keenly watched is VIX, the volatility index of the markets." Stability of the government is important for markets to be efficient, she stated. PM Narendra Modi's third term will make markets function with stability, she added."""
```
- GPT-4o
```python
gpt_4o = """Finance Minister Nirmala Sitharaman ne kaha, "Pehle ek poverty index hota tha...ek human development index hota tha aur yeh sab ab bhi hain, lekin aaj jo sabse zyada dekha ja raha hai, woh hai VIX, jo markets ka volatility index hai." Unhone kaha ki sarkar ki stability markets ke efficient hone ke liye zaroori hai. PM Narendra Modi ka teesra term markets ko stability ke saath function karne mein madad karega, unhone joda."""
```

- My model (fine-tuned Llama model)
```python
llama_model = """Finance Minister Nirmala Sitharaman ne kaha, "Pehle ek poverty index hota tha... ek human development index hota tha aur sab kuch ab bhi chal raha hai, lekin aaj jo kaafi zyada dekha ja raha hai, woh VIX hai, jo markets ki volatility ka index hai." Unhone kaha ki markets ke liye sarkar ki stability zaroori hai. PM Narendra Modi ke teesre term se markets stability ke saath function karenge, unhone joda."""
```


![image/png](https://cdn-uploads.huggingface.co/production/uploads/65187b234965add2b08b2990/Rc3nlfnSVwu1dnzfxYb-Y.png)
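For a quick numeric view of how close two translations are, one option is a word-level overlap ratio from Python's standard `difflib`. This is only a crude lexical measure under the assumption that whitespace-separated words are a fair unit; it is not a translation-quality metric and not the evaluation used for this model. `word_similarity` is a hypothetical helper.

```python
from difflib import SequenceMatcher


def word_similarity(a: str, b: str) -> float:
    """Return a ratio in [0, 1] of matching words between two strings.

    Hypothetical helper: compares lowercase, whitespace-split words in order.
    """
    return SequenceMatcher(None, a.lower().split(), b.lower().split()).ratio()


if __name__ == "__main__":
    # Short fragments taken from the two outputs above.
    gpt_fragment = "woh hai VIX, jo markets ka volatility index hai."
    llama_fragment = "woh VIX hai, jo markets ki volatility ka index hai."
    print(round(word_similarity(gpt_fragment, llama_fragment), 2))
```

Passing the full `gpt_4o` and `llama_model` strings gives a single score for the whole passage.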

# Uploaded model

- **Developed by:** suyash2739
- **License:** apache-2.0
- **Finetuned from model:** unsloth/llama-3-8b-Instruct-bnb-4bit

This Llama model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Hugging Face's TRL library.

[<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth)