Upload folder using huggingface_hub
README.md
CHANGED
@@ -2,8 +2,6 @@
 license: mit
 base_model:
 - google/efficientnet-b0
-datasets:
-- docling-project/HF-CC-v0-00001-00010-images-filtered-new-class
 tags:
 - image-classification
 - document-analysis
@@ -13,7 +11,7 @@ tags:
 
 # EfficientNet-B0 Document Figure Classifier v2.5
 
-This is an image classification model based on **Google EfficientNet-B0**, fine-tuned on a subset of the
+This is an image classification model based on **Google EfficientNet-B0**, fine-tuned on a subset of the subset of HuggingFace/finepdfs to classify document figures into one of the following 26 categories:
 
 1. **logo**
 2. **photograph**
@@ -59,34 +57,34 @@ The model was evaluated on a held-out test set from the finepdfs dataset with th
 
 ### Per-Label Performance
 
-| Label | Precision | Recall |
-|-------|-----------|--------|
-| **logo** | 0.92807 | 0.91816 |
-| **photograph** | 0.90966 | 0.96029 |
-| **icon** | 0.83605 | 0.82678 |
-| **engineering_drawing** | 0.71689 | 0.81172 |
-| **line_chart** | 0.73055 | 0.92117 |
-| **bar_chart** | 0.88599 | 0.92720 |
-| **other** | 0.41893 | 0.38213 |
-| **table** | 0.98636 | 0.96765 |
-| **flow_chart** | 0.75926 | 0.82425 |
-| **screenshot_from_computer** | 0.85952 | 0.71980 |
-| **signature** | 0.89020 | 0.85971 |
-| **screenshot_from_manual** | 0.48559 | 0.34543 |
-| **geographical_map** | 0.86780 | 0.85219 |
-| **pie_chart** | 0.96880 | 0.94220 |
-| **page_thumbnail** | 0.52008 | 0.35188 |
-| **stamp** | 0.71269 | 0.41794 |
-| **music** | 0.48037 | 0.57778 |
-| **calendar** | 0.52880 | 0.28775 |
-| **qr_code** | 0.95694 | 0.93240 |
-| **bar_code** | 0.34244 | 0.84305 |
-| **full_page_image** | 0.40323 | 0.65789 |
-| **scatter_plot** | 0.66848 | 0.67213 |
-| **chemistry_structure** | 0.72781 | 0.65426 |
-| **topographical_map** | 0.83333 | 0.38462 |
-| **crossword_puzzle** | 0.57143 | 0.21622 |
-| **box_plot** | 0.85714 | 0.64286 |
+| Label | Precision (v2.5) | Recall (v2.5) | Precision (v2.0) | Recall (v2.0) |
+|-------|------------------|---------------|------------------|---------------|
+| **logo** | 0.92807 | 0.91816 | 0.88317 | 0.88728 |
+| **photograph** | 0.90966 | 0.96029 | 0.88169 | 0.93359 |
+| **icon** | 0.83605 | 0.82678 | 0.79281 | 0.72133 |
+| **engineering_drawing** | 0.71689 | 0.81172 | 0.58795 | 0.71555 |
+| **line_chart** | 0.73055 | 0.92117 | 0.75865 | 0.84576 |
+| **bar_chart** | 0.88599 | 0.92720 | 0.72624 | 0.93883 |
+| **other** | 0.41893 | 0.38213 | 0.28239 | 0.37312 |
+| **table** | 0.98636 | 0.96765 | 0.97950 | 0.95250 |
+| **flow_chart** | 0.75926 | 0.82425 | 0.61527 | 0.81518 |
+| **screenshot_from_computer** | 0.85952 | 0.71980 | 0.80510 | 0.65844 |
+| **signature** | 0.89020 | 0.85971 | 0.91852 | 0.80914 |
+| **screenshot_from_manual** | 0.48559 | 0.34543 | 0.34748 | 0.20662 |
+| **geographical_map** | 0.86780 | 0.85219 | 0.82959 | 0.80720 |
+| **pie_chart** | 0.96880 | 0.94220 | 0.89903 | 0.93931 |
+| **page_thumbnail** | 0.52008 | 0.35188 | 0.40194 | 0.21475 |
+| **stamp** | 0.71269 | 0.41794 | 0.63492 | 0.26258 |
+| **music** | 0.48037 | 0.57778 | 0.76955 | 0.51944 |
+| **calendar** | 0.52880 | 0.28775 | 0.51176 | 0.24786 |
+| **qr_code** | 0.95694 | 0.93240 | 0.97500 | 0.90909 |
+| **bar_code** | 0.34244 | 0.84305 | 0.12087 | 0.82063 |
+| **full_page_image** | 0.40323 | 0.65789 | 0.43750 | 0.28116 |
+| **scatter_plot** | 0.66848 | 0.67213 | 0.60386 | 0.68306 |
+| **chemistry_structure** | 0.72781 | 0.65426 | 0.77444 | 0.54787 |
+| **topographical_map** | 0.83333 | 0.38462 | 0.68750 | 0.28205 |
+| **crossword_puzzle** | 0.57143 | 0.21622 | 0.80000 | 0.21622 |
+| **box_plot** | 0.85714 | 0.64286 | 1.00000 | 0.07143 |
 
 
 ## How to use - Transformers
@@ -238,7 +236,7 @@ for item in ort_session.run(None, {'input': onnx_inputs}):
 
 ## Training Data
 
-This model was trained on a subset of the
+This model was trained on a subset of the subset of HuggingFace/finepdfs, a large-scale dataset for document understanding tasks.
 
 
 ## Citation
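The Per-Label Performance table in this diff lists precision and recall separately for v2.5 and v2.0, which makes it hard to tell at a glance whether a label improved when the two metrics move in opposite directions. The sketch below combines them into F1 (the harmonic mean) for a few rows copied from the table. The helper names `f1` and `f1_delta` and the `rows` dictionary are illustrative assumptions and are not part of the model card or the commit.

```python
# Illustrative only: F1 from precision/recall values copied from the table in this diff.
# The names f1, f1_delta, and rows are hypothetical helpers, not part of the README.

def f1(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# label -> ((precision v2.5, recall v2.5), (precision v2.0, recall v2.0))
rows = {
    "logo": ((0.92807, 0.91816), (0.88317, 0.88728)),
    "bar_code": ((0.34244, 0.84305), (0.12087, 0.82063)),
    "box_plot": ((0.85714, 0.64286), (1.00000, 0.07143)),
    "music": ((0.48037, 0.57778), (0.76955, 0.51944)),
}

def f1_delta(label: str) -> float:
    """F1(v2.5) minus F1(v2.0) for one label in rows."""
    new, old = rows[label]
    return f1(*new) - f1(*old)

for label, (new, old) in rows.items():
    print(f"{label}: F1 v2.5={f1(*new):.3f}, v2.0={f1(*old):.3f}, "
          f"delta={f1_delta(label):+.3f}")
```

For `box_plot`, precision falls from 1.00000 to 0.85714 while recall rises from 0.07143 to 0.64286, so the combined score goes up; for `music` the precision drop outweighs the recall gain and the combined score goes down.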