Lohith9923 committed on
Commit
dc886fd
1 Parent(s): e8d5172

Upload project.md

  1. project.md +87 -0
project.md ADDED
@@ -0,0 +1,87 @@
+ # English to Hindi Text Translation using Transformers
+
+ This project showcases a simple text translation model that translates English text to Hindi using the Hugging Face Transformers library. The model builds on a pre-trained sequence-to-sequence architecture rather than training a translator from scratch.
+
+ ## Table of Contents
+
+ - [Project Overview](#project-overview)
+ - [Installation](#installation)
+ - [Usage](#usage)
+ - [Model Training and Dataset](#model-training-and-dataset)
+ - [Model Testing and Deployment](#model-testing-and-deployment)
+ - [User Interface](#user-interface)
+ - [Challenges Faced](#challenges-faced)
+ - [Contributions](#contributions)
+
+ ## Project Overview
+
+ Text translation is an essential task in natural language processing, and this project aims to provide a practical example of building and deploying a translation model. The project covers the following aspects:
+
+ - Data preprocessing: Tokenization and dataset preparation.
+ - Model training: Training a sequence-to-sequence model for English-to-Hindi translation.
+ - Model testing: Translating text using the trained model.
+ - User interface: Creating a user-friendly interface for text translation.
+
+ ## Installation
+
+ To run this project, you'll need the following dependencies:
+
+ - Python 3.x
+ - TensorFlow
+ - Hugging Face Transformers
+ - Datasets library
+ - Gradio
+
+ You can install the required libraries with the following shell command (the extras specifier is quoted so that shells such as zsh do not expand the brackets):
+
+ ```shell
+ pip install -q datasets "transformers[sentencepiece]" tensorflow gradio
+ ```
+
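After installing, a quick check along these lines can confirm that each package is visible to the interpreter. This is only a sketch: the `installed_versions` helper is hypothetical and not part of the project, and it relies on the standard-library `importlib.metadata` module (Python 3.8+).

```python
from importlib import metadata

# Distribution names taken from the pip command above.
REQUIRED = ["datasets", "transformers", "sentencepiece", "tensorflow", "gradio"]


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if missing."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions(REQUIRED).items():
        print(f"{name}: {version or 'NOT INSTALLED'}")
```

Any package reported as `NOT INSTALLED` needs to be installed before running the rest of the project.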
+ ## Usage
+ Try the app [here](https://huggingface.co/spaces/Lohith9923/En-Hi-Translation): enter English sentences or text in the input textbox, and the output shows the translated text in Hindi.
+
+ ## Model Training and Dataset
+ To train the text translation model, a pre-trained checkpoint was fine-tuned on an English-Hindi dataset.
+ You can check out the pre-trained model [here](https://colab.research.google.com/corgiredirector?site=https%3A%2F%2Fhuggingface.co%2FHelsinki-NLP%2Fopus-mt-en-hi) and the dataset [here](https://huggingface.co/datasets/cfilt/iitb-english-hindi/viewer/cfilt--iitb-english-hindi).
+ - Downloaded the pre-trained model using the **transformers** library in Python.
+ - Loaded the dataset **cfilt/iitb-english-hindi** using the **Datasets** library in Python.
+ - Initialized the model, tokenizer, and preprocessing function.
+ - Tokenized the dataset and prepared the training and validation data.
+ - Compiled the model with the **Adam** optimizer and the required parameters.
+ - Trained the model for the desired number of epochs.
+
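The training steps listed above could be sketched roughly as follows. This is not the author's actual notebook: the hyperparameters, the `en-hi-model` output directory, and the helper names are illustrative assumptions, as is the `{"translation": {"en": ..., "hi": ...}}` row layout of the dataset. The sketch also assumes a Transformers release that still ships the TensorFlow model classes; if the checkpoint has no TensorFlow weights, passing `from_pt=True` to `from_pretrained` may be needed.

```python
MODEL_CHECKPOINT = "Helsinki-NLP/opus-mt-en-hi"
DATASET_NAME = "cfilt/iitb-english-hindi"
MAX_LENGTH = 128      # assumed maximum token length
BATCH_SIZE = 16       # assumed
LEARNING_RATE = 2e-5  # assumed
EPOCHS = 1            # assumed


def make_preprocess_fn(tokenizer, source_lang="en", target_lang="hi"):
    """Build a batched function that tokenizes source text and target labels."""
    def preprocess(examples):
        # Each row is assumed to look like {"translation": {"en": ..., "hi": ...}}.
        sources = [pair[source_lang] for pair in examples["translation"]]
        targets = [pair[target_lang] for pair in examples["translation"]]
        # text_target tokenizes the Hindi side and stores it under "labels".
        return tokenizer(sources, text_target=targets,
                         max_length=MAX_LENGTH, truncation=True)
    return preprocess


def train(output_dir="en-hi-model"):
    """Fine-tune the checkpoint and save model and tokenizer to output_dir."""
    # Heavy imports stay local so the helper above works without them.
    import tensorflow as tf
    from datasets import load_dataset
    from transformers import (AutoTokenizer, DataCollatorForSeq2Seq,
                              TFAutoModelForSeq2SeqLM)

    raw = load_dataset(DATASET_NAME)
    tokenizer = AutoTokenizer.from_pretrained(MODEL_CHECKPOINT)
    model = TFAutoModelForSeq2SeqLM.from_pretrained(MODEL_CHECKPOINT)

    tokenized = raw.map(make_preprocess_fn(tokenizer), batched=True,
                        remove_columns=raw["train"].column_names)

    # The collator pads each batch dynamically and returns TensorFlow tensors.
    collator = DataCollatorForSeq2Seq(tokenizer, model=model, return_tensors="tf")
    train_set = model.prepare_tf_dataset(tokenized["train"], batch_size=BATCH_SIZE,
                                         shuffle=True, collate_fn=collator)
    val_set = model.prepare_tf_dataset(tokenized["validation"], batch_size=BATCH_SIZE,
                                       shuffle=False, collate_fn=collator)

    # No loss is passed: Transformers TF models fall back to their built-in loss.
    model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=LEARNING_RATE))
    model.fit(train_set, validation_data=val_set, epochs=EPOCHS)

    model.save_pretrained(output_dir)
    tokenizer.save_pretrained(output_dir)
```

Calling `train()` downloads the dataset and checkpoint, so it needs network access and is best run on a GPU runtime.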
+ ## Model Testing and Deployment
+ The trained model was tested and deployed behind a user interface as follows:
+
+ - Saved the trained model at a preferred location.
+ - Loaded the model and tokenizer from that location for testing.
+ - Translated sample input text using the model.
+ - Deployed a Gradio interface for user-friendly translation.
+
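Loading a saved model and translating a sample sentence might look like the sketch below. The `translate` and `load_and_translate` helpers and the `en-hi-model` directory are hypothetical names chosen for illustration, and the generation settings are assumptions rather than the project's actual values.

```python
def translate(text, model, tokenizer, max_new_tokens=128):
    """Translate one English string to Hindi with a seq2seq model."""
    # Tokenize a single-item batch as TensorFlow tensors.
    inputs = tokenizer([text], return_tensors="tf", truncation=True)
    # Generate target token ids, then decode the first (only) sequence.
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


def load_and_translate(text, model_dir="en-hi-model"):
    """Load model and tokenizer from model_dir (assumed path) and translate."""
    # Imported lazily so translate() can be exercised without Transformers.
    from transformers import AutoTokenizer, TFAutoModelForSeq2SeqLM

    tokenizer = AutoTokenizer.from_pretrained(model_dir)
    model = TFAutoModelForSeq2SeqLM.from_pretrained(model_dir)
    return translate(text, model, tokenizer)
```

Keeping `translate` separate from the loading code means the model is loaded once and can be reused for every request, for example from a user interface.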
+ ## User Interface
+
+ The Gradio interface provides an interactive way to translate English text to Hindi. To use the interface:
+
+ - Run the project and navigate to the specified URL.
+ - Enter English text in the input box.
+ - Read the translated Hindi text in the output box.
+
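A Gradio interface of this kind can be wired up in a few lines. The sketch below is an assumption about how such an app might be structured, not the code behind the deployed Space: `make_handler` and `build_demo` are hypothetical helpers, and `translate_fn` stands for any function that maps an English string to a Hindi string.

```python
def make_handler(translate_fn):
    """Wrap a translation function so empty input returns an empty string."""
    def handler(text):
        text = (text or "").strip()
        if not text:
            return ""
        return translate_fn(text)
    return handler


def build_demo(translate_fn):
    """Create a Gradio interface with one input and one output textbox."""
    # Imported lazily so the handler can be tested without Gradio installed.
    import gradio as gr

    return gr.Interface(
        fn=make_handler(translate_fn),
        inputs=gr.Textbox(lines=3, label="English text"),
        outputs=gr.Textbox(label="Hindi translation"),
        title="English to Hindi Translation",
    )
```

Calling `build_demo(my_translate).launch()` starts a local server and prints the URL to open in a browser, where `my_translate` is whatever translation function the app uses.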
+ ## Challenges Faced
+
+ - Searched many resources on Google and other platforms to find the best dataset for the project.
+ - Spent a lot of time gathering the right resources to understand transformers, LLMs, and Gradio.
+
+ ## Contributions
+ Contributions to this project are welcome! Here are some ways you can contribute:
+
+ - Improve the model's translation quality and performance.
+ - Enhance the user interface for a better user experience.
+ - Add support for more languages and translation directions.
+
+ To contribute, follow these steps:
+
+ - Fork this repository.
+ - Create a new branch for your feature or bug fix.
+ - Commit your changes and push them to your fork.
+ - Open a pull request with a detailed description of your changes.