---
language:
- en
- hi
- te
- ta
- mr
- kn
- ml
pipeline_tag: translation
tags:
- indic_trans_v2
---
# IndicTrans2 HF Compatible Models

This section describes how to use our [IndicTrans2](https://github.com/AI4Bharat/IndicTrans2) models, which were originally trained with [fairseq](https://github.com/facebookresearch/fairseq) and have been made compatible with [HuggingFace transformers](https://huggingface.co/docs/transformers/index) for inference. Our scripts for the HuggingFace-compatible models are adapted from the [M2M100 implementation](https://github.com/huggingface/transformers/tree/main/src/transformers/models/m2m_100) in the transformers repository.


### Setup

To get started, follow these steps to set up the environment:

```bash
# Clone the github repository and navigate to the project directory.
git clone https://github.com/AI4Bharat/IndicTrans2
cd IndicTrans2

# Install all the dependencies and requirements associated with the project for running HF compatible models.
source install.sh
```

> Note: The `install.sh` script in this directory is specifically for running HF compatible models for inference.


### Models

| Model    | 🤗 HuggingFace Checkpoints        |
|----------|-----------------------------------|
| Preprint En-Indic | [ai4bharat/indictrans2-en-indic-1B](https://huggingface.co/ai4bharat/indictrans2-en-indic-1B) |
| Preprint Indic-En | [ai4bharat/indictrans2-indic-en-1B](https://huggingface.co/ai4bharat/indictrans2-indic-en-1B) |
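
Because each direction has its own checkpoint, a script that handles both directions has to choose one before loading a model. The following is a minimal sketch of that choice, not part of the repository: the helper `pick_checkpoint` is hypothetical, and it assumes language codes written as in this card's metadata (`en`, `hi`, `ta`, ...). Only the two checkpoint names come from the table above.

```python
# Checkpoint names copied from the table above; everything else is illustrative.
EN_INDIC = "ai4bharat/indictrans2-en-indic-1B"
INDIC_EN = "ai4bharat/indictrans2-indic-en-1B"


def pick_checkpoint(src_lang: str, tgt_lang: str) -> str:
    """Return the listed checkpoint for a translation direction (hypothetical helper)."""
    src = src_lang.strip().lower()
    tgt = tgt_lang.strip().lower()
    if src == tgt:
        raise ValueError("source and target languages must differ")
    if src == "en":
        return EN_INDIC   # English -> Indic
    if tgt == "en":
        return INDIC_EN   # Indic -> English
    # The table above lists no checkpoint for Indic -> Indic translation.
    raise ValueError(f"no checkpoint listed for {src_lang} -> {tgt_lang}")


if __name__ == "__main__":
    print(pick_checkpoint("en", "hi"))
    print(pick_checkpoint("ta", "en"))
```

The returned string is the model identifier to pass on to whatever loading code you use, for example the one in `example.py`.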


### Inference

The checkpoints listed above are already HuggingFace compatible, so you can run inference directly with HuggingFace Transformers.

You can start with the provided `example.py` script and customize it for your specific translation use case:

```bash
python3 example.py
```

Feel free to modify the `example.py` script to suit your translation needs.

### Citation

```
@article{ai4bharat2023indictrans2,
  title   = {IndicTrans2: Towards High-Quality and Accessible Machine Translation Models for all 22 Scheduled Indian Languages},
  author  = {AI4Bharat and Jay Gala and Pranjal A. Chitale and Raghavan AK and Sumanth Doddapaneni and Varun Gumma and Aswanth Kumar and Janki Nawale and Anupama Sujatha and Ratish Puduppully and Vivek Raghavan and Pratyush Kumar and Mitesh M. Khapra and Raj Dabre and Anoop Kunchukuttan},
  year    = {2023},
  journal = {arXiv preprint arXiv:2305.16307}
}
```