---
base_model:
- facebook/bart-large
pipeline_tag: translation
library_name: transformers
tags:
- code
---
Hindi to Bengali Translation using BART

Overview

This project fine-tunes the facebook/bart-large model for Hindi-to-Bengali translation on the Hind-Beng-5k dataset.
Training uses the Hugging Face transformers library with PyTorch.

Dataset

We use the Hind-Beng-5k dataset from Hugging Face, which contains parallel Hindi and Bengali text samples.
Dataset: sudeshna84/Hind-Beng-5k
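
As a minimal sketch, the dataset can be pulled from the Hub with the `datasets` library. The dataset ID comes from this card; the `train` split name and the helper name `load_parallel_data` are assumptions for illustration, so check the dataset card for the actual splits.

```python
# Dataset ID as given in this card.
DATASET_ID = "sudeshna84/Hind-Beng-5k"


def load_parallel_data(split="train"):
    """Download the Hindi-Bengali parallel corpus from the Hugging Face Hub.

    The default split name "train" is an assumption, not confirmed by the card.
    """
    # Imported lazily so the module can be imported without `datasets` installed.
    from datasets import load_dataset

    return load_dataset(DATASET_ID, split=split)
```

Calling `load_parallel_data()` returns a `datasets.Dataset`; printing its `column_names` shows which fields hold the Hindi and Bengali text.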

Model

The base model is facebook/bart-large, a sequence-to-sequence BART checkpoint.
It is fine-tuned here to translate Hindi into Bengali.
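
One way to load the base checkpoint is through the `Auto*` classes in transformers, sketched below. The helper name `load_model_and_tokenizer` is hypothetical, and the card does not say which loading classes were actually used.

```python
# Base checkpoint named in this card.
MODEL_ID = "facebook/bart-large"


def load_model_and_tokenizer(model_id=MODEL_ID):
    """Return (tokenizer, model) for a seq2seq checkpoint.

    Using the Auto* classes is an assumption; the BART-specific classes
    would work as well.
    """
    # Imported lazily so the module can be imported without transformers installed.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_id)
    return tokenizer, model
```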

Installation

To run the project, install the required dependencies:
pip install transformers datasets torch

Preprocessing

The dataset is preprocessed by tokenizing the Hindi input text and Bengali target text using the BART tokenizer.
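
The tokenization step might look roughly like the sketch below. The column names `hindi` and `bengali` and the 128-token cap are assumptions, since the card does not state them; adjust them to the dataset's real schema.

```python
# Assumed column names and length cap; neither is stated in this card.
SOURCE_COLUMN = "hindi"
TARGET_COLUMN = "bengali"
MAX_LENGTH = 128


def preprocess(batch, tokenizer):
    """Tokenize Hindi inputs and attach tokenized Bengali targets as labels."""
    # Encode the source sentences as model inputs.
    model_inputs = tokenizer(
        batch[SOURCE_COLUMN], max_length=MAX_LENGTH, truncation=True
    )
    # Encode the target sentences; their token IDs become the training labels.
    targets = tokenizer(
        text_target=batch[TARGET_COLUMN], max_length=MAX_LENGTH, truncation=True
    )
    model_inputs["labels"] = targets["input_ids"]
    return model_inputs
```

With a loaded dataset and tokenizer, this would typically be applied as `dataset.map(lambda b: preprocess(b, tokenizer), batched=True)`.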

Training

The model is trained using the Trainer API from Hugging Face with the following parameters:
Batch size: 8
Learning rate: 2e-5
Epochs: 3
Weight decay: 0.01
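
The parameters above could be wired into the Trainer API along these lines. This is a sketch, not the author's training script: mapping "Batch size" to `per_device_train_batch_size`, the `output_dir` value, the use of `DataCollatorForSeq2Seq` for padding, and the helper name `build_trainer` are all assumptions.

```python
# Hyperparameters listed in this card, keyed by TrainingArguments names.
# Treating "Batch size" as the per-device training batch size is an assumption.
HYPERPARAMS = {
    "per_device_train_batch_size": 8,
    "learning_rate": 2e-5,
    "num_train_epochs": 3,
    "weight_decay": 0.01,
}


def build_trainer(model, tokenizer, train_dataset, output_dir="bart-hi-bn"):
    """Assemble a Trainer with the card's hyperparameters.

    output_dir is a hypothetical path chosen for illustration.
    """
    # Imported lazily so the module can be imported without transformers installed.
    from transformers import DataCollatorForSeq2Seq, Trainer, TrainingArguments

    args = TrainingArguments(output_dir=output_dir, **HYPERPARAMS)
    # Pads inputs and labels per batch; an assumed choice, not stated in the card.
    collator = DataCollatorForSeq2Seq(tokenizer, model=model)
    return Trainer(
        model=model,
        args=args,
        train_dataset=train_dataset,
        data_collator=collator,
    )
```

Calling `build_trainer(model, tokenizer, tokenized_dataset).train()` would then start fine-tuning.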


Credits

Sudeshna Sani: https://huggingface.co/sudeshna84