---
license: cc-by-nc-sa-4.0
tags:
- generated_from_trainer
datasets:
- funsd-layoutlmv3
metrics:
- precision
- recall
- f1
- accuracy
model-index:
- name: OCR-LayoutLMv3
  results:
  - task:
      name: Token Classification
      type: token-classification
    dataset:
      name: funsd-layoutlmv3
      type: funsd-layoutlmv3
      config: funsd
      split: train
      args: funsd
    metrics:
    - name: Precision
      type: precision
      value: 0.8988653182042428
    - name: Recall
      type: recall
      value: 0.905116741182315
    - name: F1
      type: f1
      value: 0.9019801980198019
    - name: Accuracy
      type: accuracy
      value: 0.8403661000832046
---

# OCR-LayoutLMv3

This model is a fine-tuned version of [microsoft/layoutlmv3-base](https://huggingface.co/microsoft/layoutlmv3-base) on the funsd-layoutlmv3 dataset.
It achieves the following results on the evaluation set:
- Loss: 0.9788
- Precision: 0.8989
- Recall: 0.9051
- F1: 0.9020
- Accuracy: 0.8404

## Model description

LayoutLMv3 is a pre-trained multimodal Transformer for Document AI with unified text and image masking. The simple unified architecture and training objectives make LayoutLMv3 a general-purpose pre-trained model. For example, LayoutLMv3 can be fine-tuned for both text-centric tasks, including form understanding, receipt understanding, and document visual question answering, and image-centric tasks such as document image classification and document layout analysis.

[LayoutLMv3: Pre-training for Document AI with Unified Text and Image Masking](https://arxiv.org/abs/2204.08387)
Yupan Huang, Tengchao Lv, Lei Cui, Yutong Lu, Furu Wei, Preprint 2022.

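
Because the model consumes layout as well as text, each word passed to it needs a bounding box. The Transformers documentation for the LayoutLM family asks for boxes on a 0-1000 scale relative to the page size. The sketch below shows one way to do that conversion when words and boxes come from your own OCR step. The helper name `normalize_box`, the truncating rounding, and the clamping are illustrative assumptions, not part of this model's training code.

```python
def normalize_box(box, width, height):
    """Scale a pixel-space (x0, y0, x1, y1) box to the 0-1000 integer range.

    Hypothetical helper: the name and the clamping behaviour are assumptions
    made for illustration, not taken from this model's training script.
    """
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")

    def scale(value, size):
        # Truncate to an integer and clamp, so boxes that spill past the
        # page edge still land inside [0, 1000].
        return max(0, min(1000, int(1000 * value / size)))

    x0, y0, x1, y1 = box
    return [scale(x0, width), scale(y0, height),
            scale(x1, width), scale(y1, height)]


if __name__ == "__main__":
    # A word box on an 800 x 1000 pixel page image.
    print(normalize_box((80, 100, 400, 500), width=800, height=1000))
```

The resulting lists can be supplied as the `boxes` argument alongside the words when encoding a page, provided the processor is not already running its own OCR.
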
### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 1e-05
- train_batch_size: 2
- eval_batch_size: 2
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- training_steps: 2000

### Training results

| Training Loss | Epoch | Step | Validation Loss | Precision | Recall | F1     | Accuracy |
|:-------------:|:-----:|:----:|:---------------:|:---------:|:------:|:------:|:--------:|
| No log        | 1.33  | 100  | 0.6966          | 0.7418    | 0.8063 | 0.7727 | 0.7801   |
| No log        | 2.67  | 200  | 0.5767          | 0.8104    | 0.8644 | 0.8365 | 0.8117   |
| No log        | 4.0   | 300  | 0.5355          | 0.8246    | 0.8852 | 0.8539 | 0.8295   |
| No log        | 5.33  | 400  | 0.5240          | 0.8706    | 0.8922 | 0.8813 | 0.8427   |
| 0.5326        | 6.67  | 500  | 0.6337          | 0.8528    | 0.8778 | 0.8651 | 0.8260   |
| 0.5326        | 8.0   | 600  | 0.6870          | 0.8698    | 0.8828 | 0.8762 | 0.8240   |
| 0.5326        | 9.33  | 700  | 0.6584          | 0.8723    | 0.9061 | 0.8889 | 0.8342   |
| 0.5326        | 10.67 | 800  | 0.7186          | 0.8868    | 0.9031 | 0.8949 | 0.8335   |
| 0.5326        | 12.0  | 900  | 0.6822          | 0.9040    | 0.9076 | 0.9058 | 0.8526   |
| 0.1248        | 13.33 | 1000 | 0.7042          | 0.8872    | 0.9021 | 0.8946 | 0.8511   |
| 0.1248        | 14.67 | 1100 | 0.7920          | 0.9027    | 0.9036 | 0.9032 | 0.8480   |
| 0.1248        | 16.0  | 1200 | 0.8052          | 0.8964    | 0.9151 | 0.9056 | 0.8389   |
| 0.1248        | 17.33 | 1300 | 0.8932          | 0.8995    | 0.9066 | 0.9030 | 0.8329   |
| 0.1248        | 18.67 | 1400 | 0.8728          | 0.8950    | 0.9061 | 0.9005 | 0.8398   |
| 0.0442        | 20.0  | 1500 | 0.9051          | 0.8960    | 0.9116 | 0.9037 | 0.8347   |
| 0.0442        | 21.33 | 1600 | 0.9587          | 0.8947    | 0.9031 | 0.8989 | 0.8401   |
| 0.0442        | 22.67 | 1700 | 0.9822          | 0.9042    | 0.9046 | 0.9044 | 0.8389   |
| 0.0442        | 24.0  | 1800 | 0.9734          | 0.9043    | 0.9061 | 0.9052 | 0.8391   |
| 0.0442        | 25.33 | 1900 | 0.9842          | 0.9042    | 0.9091 | 0.9066 | 0.8410   |
| 0.0225        | 26.67 | 2000 | 0.9788          | 0.8989    | 0.9051 | 0.9020 | 0.8404   |

### Framework versions

- Transformers 4.25.0.dev0
- Pytorch 1.12.1
- Datasets 2.6.1
- Tokenizers 0.13.1
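
### Reading the schedule

The hyperparameters and the Epoch/Step columns above are enough to work out roughly how the run progressed. The sketch below is a plain-Python approximation, not the trainer's own code. It assumes no warmup (the card lists none) and a linear decay to zero at the final step. The 75 steps per epoch is inferred from the table, where step 300 is logged as epoch 4.0. The function names `linear_lr` and `epoch_at` are hypothetical.

```python
BASE_LR = 1e-5        # learning_rate from the card
TOTAL_STEPS = 2000    # training_steps from the card
STEPS_PER_EPOCH = 75  # inferred from the table: step 300 is epoch 4.0


def linear_lr(step, base_lr=BASE_LR, total_steps=TOTAL_STEPS, warmup_steps=0):
    """Learning rate at `step` under linear warmup, then linear decay to zero.

    warmup_steps=0 is an assumption; the card does not list a warmup setting.
    """
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


def epoch_at(step, steps_per_epoch=STEPS_PER_EPOCH):
    """Fractional epoch reached after `step` optimizer steps."""
    return step / steps_per_epoch


if __name__ == "__main__":
    # Print step, epoch, and approximate learning rate at a few checkpoints.
    for step in (0, 500, 1000, 1500, 2000):
        print(step, round(epoch_at(step), 2), linear_lr(step))
```

Under these assumptions the learning rate at step 1000 is half its initial value and reaches zero at step 2000, which corresponds to epoch 26.67 in the table. With a batch size of 2, 75 steps per epoch implies 149 or 150 training examples, assuming a single device and no gradient accumulation.
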