DheivaCodes committed on
Commit
4d9c678
·
verified ·
1 Parent(s): 84eaec5

Update README.md

Files changed (1)
  1. README.md +89 -10
README.md CHANGED
@@ -1,13 +1,92 @@
  ---
- title: Multilingual Translator
- emoji: ⚑
- colorFrom: blue
- colorTo: pink
- sdk: gradio
- sdk_version: 5.38.0
- app_file: app.py
- pinned: false
- short_description: Multilingual Translator with Semantic Search and BLEU Evalua
  ---

- Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
+ # 🌍 Multilingual Translator + Semantic Search (Enhanced)
+
+ This project is a smart multilingual translator web app that offers:
+
+ - ✅ **Automatic language detection**
+ - 🌐 **High-quality translation** between Indian and foreign languages
+ - 🧠 **Semantic search** to find similar Sanskrit-based concepts
+ - 📊 **Optional BLEU score evaluation** (with human reference)
+ - 📄 **Downloadable report** summarizing the output
+ - 🚫 **Input length handling** to avoid translation errors
+
+ > Developed using Hugging Face Transformers, Sentence Transformers, FAISS, and Gradio; deployable to Hugging Face Spaces.
+
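The README names Sentence Transformers and FAISS for semantic search but does not show the code. The idea of nearest-neighbour lookup over embeddings can be sketched in plain Python, with brute-force cosine similarity standing in for FAISS and hand-written vectors standing in for encoder output. The names `cosine`, `top_k`, and `CORPUS`, the concept labels, and the 3-dimensional vectors are all hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vec, corpus, k=3):
    """Return the k (label, score) pairs most similar to query_vec."""
    scored = [(label, cosine(query_vec, vec)) for label, vec in corpus.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]


# Hypothetical toy "embeddings"; real ones would come from an encoder model.
CORPUS = {
    "dharma": [0.9, 0.1, 0.0],
    "karma": [0.7, 0.3, 0.1],
    "yoga": [0.1, 0.9, 0.2],
    "moksha": [0.0, 0.2, 0.9],
}

matches = top_k([1.0, 0.0, 0.0], CORPUS)
```

With real embeddings, the brute-force loop would typically be replaced by an index lookup; the top 3 scores are what the app's bar chart would plot.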
+ ---
+
+ ## ⚠️ Input Limit Notice
+
+ Please enter **up to 3 lines** or **2000 characters** maximum.
+
+ - If input is too long, the app will show an error and skip translation.
+
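A minimal sketch of how such a length check might look, assuming both limits apply at once (the README's wording leaves that open). `validate_input`, the constant names, and the error messages are hypothetical and not taken from the app.

```python
MAX_LINES = 3      # limit stated in the README
MAX_CHARS = 2000   # limit stated in the README


def validate_input(text):
    """Return an error message if the input is empty or too long, else None."""
    stripped = text.strip()
    if not stripped:
        return "Please enter some text to translate."
    if len(stripped) > MAX_CHARS:
        return f"Input exceeds {MAX_CHARS} characters."
    if len(stripped.splitlines()) > MAX_LINES:
        return f"Input exceeds {MAX_LINES} lines."
    return None
```

A caller would skip translation whenever the function returns a message and show that message to the user instead.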
  ---
+
+ ## 🚀 Live Demo
+
+ 🔗 [Click here to try the app on Hugging Face Spaces](https://huggingface.co/spaces/jeevitha-app/Multilingual-translator)
+
  ---

+ ## 🔧 Features
+
+ | Feature | Description |
+ |--------|-------------|
+ | **Language Detection** | Auto-identifies input language using `xlm-roberta-base-language-detection` |
+ | **Translation** | Uses Facebook’s `NLLB-200-distilled-600M` model |
+ | **Semantic Search** | Finds similar Sanskrit concepts using Sentence Transformers + FAISS |
+ | **BLEU Score** | Optional evaluation metric (if human reference is provided) |
+ | **Semantic Plot** | Horizontal bar chart for top 3 semantic similarity scores |
+ | **Download Report** | Creates a `.txt` file (includes all outputs + BLEU score) |
+ | **Error Handling** | Graceful messages for empty or long input |
+
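The README does not say how the BLEU score is computed. As a rough illustration of the metric itself, the sketch below implements unsmoothed sentence-level BLEU against a single reference in plain Python. `sentence_bleu` and `ngrams` are hypothetical names, whitespace tokenization is an assumption, and the app may well rely on a library implementation instead.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return a Counter of all n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def sentence_bleu(reference, hypothesis, max_n=4):
    """Unsmoothed sentence-level BLEU against one reference, in [0, 1]."""
    ref, hyp = reference.split(), hypothesis.split()
    if not hyp:
        return 0.0
    log_precision_sum = 0.0
    for n in range(1, max_n + 1):
        hyp_counts, ref_counts = ngrams(hyp, n), ngrams(ref, n)
        total = sum(hyp_counts.values())
        # Clip each hypothesis n-gram count by its count in the reference.
        matched = sum(min(c, ref_counts[g]) for g, c in hyp_counts.items())
        if total == 0 or matched == 0:
            return 0.0  # no smoothing: one empty precision zeroes the score
        log_precision_sum += math.log(matched / total)
    # Brevity penalty discourages hypotheses shorter than the reference.
    bp = 1.0 if len(hyp) >= len(ref) else math.exp(1 - len(ref) / len(hyp))
    return bp * math.exp(log_precision_sum / max_n)
```

Because there is no smoothing, short sentences with no matching 4-gram score 0.0; that is a known limitation of plain sentence-level BLEU rather than a sign of a bad translation.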
+ ---
+
+ ## 🌐 Supported Languages
+
+ | Code | Language |
+ |------------|-----------|
+ | eng_Latn | English |
+ | hin_Deva | Hindi |
+ | tam_Taml | Tamil |
+ | tel_Telu | Telugu |
+ | san_Deva | Sanskrit |
+ | fra_Latn | French |
+ | spa_Latn | Spanish |
+ | deu_Latn | German |
+ | jpn_Jpan | Japanese |
+ | zho_Hans | Chinese |
+ | arb_Arab | Arabic |
+
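The codes in the table are NLLB-style language tags. A language detector typically reports shorter labels, so some lookup is needed between the two. The sketch below assumes ISO 639-1 style labels (`en`, `hi`, and so on); `NLLB_CODES` and `to_nllb_code` are hypothetical names, and the README does not state whether the app's detector emits a label for every listed language.

```python
# Assumed mapping from ISO 639-1 labels to the NLLB codes in the table above.
NLLB_CODES = {
    "en": "eng_Latn",
    "hi": "hin_Deva",
    "ta": "tam_Taml",
    "te": "tel_Telu",
    "sa": "san_Deva",
    "fr": "fra_Latn",
    "es": "spa_Latn",
    "de": "deu_Latn",
    "ja": "jpn_Jpan",
    "zh": "zho_Hans",
    "ar": "arb_Arab",
}


def to_nllb_code(detected_label):
    """Map a detector label to an NLLB code; raise ValueError if unsupported."""
    try:
        return NLLB_CODES[detected_label.lower()]
    except KeyError:
        raise ValueError(f"Unsupported language label: {detected_label!r}") from None
```

Raising on unknown labels lets the caller turn an unsupported language into a user-facing error instead of passing a bad code to the translation model.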
+ ---
+
+ ## 📄 Downloadable Report
+
+ The app generates a `.txt` file containing:
+
+ - Detected source language
+ - Translated output
+ - Semantic matches (with similarity scores)
+ - BLEU score (if a human reference translation is given)
+
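The four items above could be assembled into report text along these lines. This is only a sketch: `build_report`, its parameters, and the exact line layout are hypothetical, since the README does not show the report format.

```python
def build_report(source_lang, translation, matches, bleu=None):
    """Assemble the report text; the BLEU line is included only when given."""
    lines = [
        f"Detected source language: {source_lang}",
        f"Translated output: {translation}",
        "Semantic matches:",
    ]
    # Each match is a (label, similarity score) pair.
    lines += [f"  {label}: {score:.3f}" for label, score in matches]
    if bleu is not None:
        lines.append(f"BLEU score: {bleu:.2f}")
    return "\n".join(lines) + "\n"


report = build_report("hin_Deva", "Hello, world", [("dharma", 0.994)], bleu=0.5)
```

The returned string can then be written to a `.txt` file with UTF-8 encoding so that non-Latin scripts survive the download.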
+ ---
+
+ ## 🚧 Future Enhancements
+
+ - 🎙️ Speech-to-text input support
+ - 🔊 Text-to-speech audio output
+ - 📸 OCR: Translate text from uploaded images
+ - 🆕 Add more Indian languages and transliteration features
+
+ ---
+
+ ## 👩‍💻 Author
+
+ **Jeevitha Meenakshisundaram**
+ M.Sc. Data Science, SASTRA University
+
+ ---
+
+ ## 📜 License
+
+ This project is licensed under the **MIT License**.
+