Arab Bart
A from-scratch PyTorch implementation of the paper "BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension", applied to an abstractive summarization task in Arabic.
Model inference is not ready yet: the model cannot be loaded directly from the Transformers library. I will create an inference API and integrate the model with the Transformers library as soon as possible.
Goal
Reproduce the BART model from scratch to understand its architecture in depth, using the minimum available resources.
Size
The model size: 174M parameters.
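A parameter count like the one above is usually obtained by summing the element counts of a module's tensors. The following is a minimal sketch of that check, not the code used for this model: `count_parameters` and `format_millions` are hypothetical helper names, and the small `nn.Transformer` is only a stand-in, not the Arab Bart architecture.

```python
import torch.nn as nn


def count_parameters(model: nn.Module, trainable_only: bool = True) -> int:
    """Return the number of parameters in a PyTorch module.

    model.parameters() yields each shared tensor once, so tied weights
    (for example tied input/output embeddings) are not double counted.
    """
    return sum(
        p.numel()
        for p in model.parameters()
        if p.requires_grad or not trainable_only
    )


def format_millions(n: int) -> str:
    """Format a raw parameter count as a rounded 'NM' string."""
    return f"{n / 1e6:.0f}M"


if __name__ == "__main__":
    # Stand-in module for illustration only (assumed hyperparameters).
    toy = nn.Transformer(
        d_model=64,
        nhead=4,
        num_encoder_layers=1,
        num_decoder_layers=1,
        dim_feedforward=128,
    )
    n = count_parameters(toy)
    print(f"{n:,} parameters ({format_millions(n)})")
```

Calling `count_parameters` on the real model object would be expected to give a value that rounds to the figure reported here.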
Task
Abstractive Summarization in Arabic.
Data
The dataset used is the Arabic subset of XL-Sum. I chose it because it is well suited to the task and is written in pure Arabic. The original source is BBC Arabic.
Features (columns):
- text: the full text (source sequences).
- summary: the summary of the text (target sequences).
Size:
- train: 32,473 rows.
- validation: 4,689 rows.
- test: 4,689 rows.
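As a quick sanity check on the split sizes listed above, the proportions can be worked out directly. This is only an illustrative sketch: the row counts are copied from this card, and `SPLIT_ROWS` and `split_fractions` are hypothetical names.

```python
# Row counts per split, as reported in this card.
SPLIT_ROWS = {"train": 32_473, "validation": 4_689, "test": 4_689}


def split_fractions(rows: dict[str, int]) -> dict[str, float]:
    """Return each split's share of the total number of rows."""
    total = sum(rows.values())
    return {name: count / total for name, count in rows.items()}


if __name__ == "__main__":
    print(f"total rows: {sum(SPLIT_ROWS.values()):,}")
    for name, fraction in split_fractions(SPLIT_ROWS).items():
        print(f"{name}: {fraction:.1%}")
```

The training split makes up a little over three quarters of the data, with the remainder divided equally between validation and test.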
Results
Epoch | Loss (train) | Loss (validation) | Epoch Time (hours) | Total Training Time (hours) | Device |
---|---|---|---|---|---|
1 | 10.03 | 9.72 | 0.23 | 1.1 | 1 x L40S |
2 | 9.61 | 9.44 | 0.22 | 1.1 | 1 x L40S |
3 | 9.36 | 9.22 | 0.22 | 1.1 | 1 x L40S |
4 | 9.16 | 9.05 | 0.22 | 1.1 | 1 x L40S |
5 | 9.01 | 8.92 | 0.22 | 1.1 | 1 x L40S |
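The numbers in the table can be cross-checked with a few lines of arithmetic: the per-epoch times should add up to the 1.1 hour figure, and both losses should fall at every epoch. The sketch below only re-uses the values from the table; `EPOCHS` and the helper functions are hypothetical names, not part of the training code.

```python
# (epoch, train_loss, validation_loss, epoch_hours), copied from the table.
EPOCHS = [
    (1, 10.03, 9.72, 0.23),
    (2, 9.61, 9.44, 0.22),
    (3, 9.36, 9.22, 0.22),
    (4, 9.16, 9.05, 0.22),
    (5, 9.01, 8.92, 0.22),
]

TRAIN, VALIDATION, HOURS = 1, 2, 3  # column indices into each row


def total_hours(rows) -> float:
    """Sum the per-epoch wall-clock times."""
    return sum(row[HOURS] for row in rows)


def loss_drop(rows, column: int) -> float:
    """Absolute decrease in loss from the first to the last epoch."""
    return rows[0][column] - rows[-1][column]


def improves_every_epoch(rows, column: int) -> bool:
    """True if the loss is strictly lower at each successive epoch."""
    values = [row[column] for row in rows]
    return all(later < earlier for earlier, later in zip(values, values[1:]))


if __name__ == "__main__":
    print(f"total training time: {total_hours(EPOCHS):.2f} h")
    print(f"train loss drop: {loss_drop(EPOCHS, TRAIN):.2f}")
    print(f"validation loss drop: {loss_drop(EPOCHS, VALIDATION):.2f}")
    print("validation improves every epoch:",
          improves_every_epoch(EPOCHS, VALIDATION))
```

Since the validation loss is still decreasing at epoch 5, these figures alone do not indicate that training had converged.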
License
This model is licensed under the MIT License.