AdamCodd
/

donut-receipts-extract

vision-encoder-decoder

Model card Files Files and versions Community

AdamCodd commited on Apr 11

Commit

3ed5c68

•

1 Parent(s): b20b842

Update README.md

Files changed (1) hide show

README.md +2 -2

README.md CHANGED Viewed

@@ -40,7 +40,7 @@ pipeline_tag: image-to-text
 Donut model was introduced in the paper [OCR-free Document Understanding Transformer](https://arxiv.org/abs/2111.15664) by Geewok et al. and first released in [this repository](https://github.com/clovaai/donut).
-## V2
 This model has been retrained on an improved version of the [AdamCodd/donut-receipts](https://huggingface.co/datasets/AdamCodd/donut-receipts) dataset (deduplicated, manually corrected). The new license for the V2 model is **cc-by-nc-4.0**. For commercial use rights, please [contact me](https://discord.com/users/859202914400075798). Meanwhile, the V1 model remains available under the MIT license (under v1 branch).
@@ -56,7 +56,7 @@ The task_prompt has been changed to ``<s_receipt>`` for the V2 (previously ``<s_
 The V2 performs way better than the V1 as it has been trained on twice the resolution for the receipts, using a better dataset. Despite that, it's not perfect due to a lack of diverse receipts (the training dataset is still ~1100 receipts); for a future version, that will be the main focus.
-## V1
 This model is a finetune of the [donut base model](https://huggingface.co/naver-clova-ix/donut-base/) on the [AdamCodd/donut-receipts](https://huggingface.co/datasets/AdamCodd/donut-receipts) dataset. Its purpose is to efficiently extract text from receipts.

 Donut model was introduced in the paper [OCR-free Document Understanding Transformer](https://arxiv.org/abs/2111.15664) by Geewok et al. and first released in [this repository](https://github.com/clovaai/donut).
+## === V2 ===
 This model has been retrained on an improved version of the [AdamCodd/donut-receipts](https://huggingface.co/datasets/AdamCodd/donut-receipts) dataset (deduplicated, manually corrected). The new license for the V2 model is **cc-by-nc-4.0**. For commercial use rights, please [contact me](https://discord.com/users/859202914400075798). Meanwhile, the V1 model remains available under the MIT license (under v1 branch).
 The V2 performs way better than the V1 as it has been trained on twice the resolution for the receipts, using a better dataset. Despite that, it's not perfect due to a lack of diverse receipts (the training dataset is still ~1100 receipts); for a future version, that will be the main focus.
+## === V1 ====
 This model is a finetune of the [donut base model](https://huggingface.co/naver-clova-ix/donut-base/) on the [AdamCodd/donut-receipts](https://huggingface.co/datasets/AdamCodd/donut-receipts) dataset. Its purpose is to efficiently extract text from receipts.