DILHTWD committed on
Commit
d88fc0e
1 Parent(s): b95a6e6

Update README.md

Files changed (1)
  1. README.md +58 -1
README.md CHANGED
@@ -6,4 +6,61 @@ metrics:
  - precision
  - recall
  - f1
- ---
+ ---
+
+ ## Model Description
+
+ This model performs Document Layout Segmentation and Document Layout Analysis: it segments a document page into its core layout components, such as titles, captions, footnotes, formulas, list items, page footers, page headers, and pictures. Precise segmentation of these elements supports downstream tasks that depend on the structure of a document, such as automated content extraction, document summarization, and accessibility features.
+
+ ## Training Data
+ - **Source:** DocLayNet, IBM (https://github.com/DS4SD/DocLayNet)
+ - **Classes:** 11 classes (Caption, Footnote, Formula, List-item, Page-footer, Page-header, Picture, Section-header, Table, Text, and Title)
+ - **Pages:** 80,863 document pages
+
+ ## Performance
+ Metrics:
+ - **Precision:** 0.98
+ - **Recall:** 0.97
+ - **F1:** 0.97
+ - **mAP50:** 0.99
+ - **mAP50-95:** 0.95
+
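As a quick sanity check on the figures above: F1 is the harmonic mean of precision and recall. The sketch below recomputes it from the rounded values reported in the card. The helper name `f1_score` is illustrative and not part of the model's tooling, and because the inputs are already rounded, the result only approximates the reported 0.97.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Rounded values as reported in the card (assumed to describe the same evaluation run)
f1 = f1_score(0.98, 0.97)
print(f"{f1:.2f}")  # prints 0.97
```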
+ ## Usage
+
+ ### Example Code
+
+ The example below loads the model, runs it on a list of sample images, and stores the bounding-box coordinates detected for each image:
+
+ ```python
+ from ultralytics import YOLO
+ from PIL import Image, ImageDraw
+ import pathlib
+
+ # List of sample images to process
+ img_list = ['sample1.png', 'sample2.png', 'sample3.png']
+
+ # Load the document segmentation model
+ docseg_model = YOLO('yolov8x-doclaynetcore-imgsz640-lr1.0000e-02.pt')
+
+ # Process the images with the model; save=True writes annotated images to disk
+ results = docseg_model(source=img_list, save=True, show_labels=True, show_conf=True, show_boxes=True)
+
+ # Initialize a dictionary to store results
+ mydict = {}
+
+ # Extract and store the paths and coordinates of detected components
+ for entry in results:
+     thepath = pathlib.Path(entry.path)
+     # Move the boxes to the CPU first; .numpy() raises an error on GPU tensors
+     thecoords = entry.boxes.xyxy.cpu().numpy()
+     mydict[thepath] = thecoords
+ ```
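The example above keeps only box coordinates, so the class of each detected component is lost. A minimal sketch of grouping boxes by class label follows. The function `group_boxes_by_class` and the sample values are hypothetical, and the index-to-label mapping shown is an assumption for illustration, not the model's confirmed class order.

```python
from collections import defaultdict


def group_boxes_by_class(boxes, class_ids, names):
    """Group [x1, y1, x2, y2] boxes under their class label.

    boxes:     list of [x1, y1, x2, y2] coordinates
    class_ids: list of class indices, one per box (ints or floats)
    names:     mapping from class index to label
    """
    grouped = defaultdict(list)
    for box, class_id in zip(boxes, class_ids):
        grouped[names[int(class_id)]].append(box)
    return dict(grouped)


# Hypothetical detections for one page; the index-to-label mapping is assumed
sample_names = {9: "Text", 10: "Title"}
sample_boxes = [[40, 30, 560, 70], [40, 90, 560, 400], [40, 420, 560, 700]]
sample_class_ids = [10.0, 9.0, 9.0]

grouped = group_boxes_by_class(sample_boxes, sample_class_ids, sample_names)
print(grouped)
```

With the `results` from the example above, the inputs could likely be taken per page as `entry.boxes.xyxy.tolist()`, `entry.boxes.cls.tolist()`, and `entry.names`, assuming the Ultralytics `Results` interface of the installed version.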
+
+ ## Model Details
+ - **Model Name:** DILHTWD/documentlayoutsegmentation_YOLOv8_ondoclaynet
+ - **Publisher:** Data Intelligence Lab, Hochschule für Technik und Wirtschaft Dresden
+ - **Model Version:** 1.0.0
+ - **Model Date:** 2024-03-17
+ - **License:** [AGPL-3.0](https://www.gnu.org/licenses/agpl-3.0.de.html)
+ - **Architecture:** YOLOv8 (https://github.com/ultralytics/ultralytics)
+ - **Task:** Document Layout Segmentation, Document Layout Analysis
+