HugoSchtr committed on
Commit
344bee7
1 Parent(s): ec352c4

readme updated

Files changed (1)
  1. README.md +46 -0
README.md CHANGED
@@ -1,3 +1,49 @@
  ---
  license: cc-by-4.0
+ tags:
+ - yolov5
+ - yolo
+ - digital-humanities
+ - object-detection
+ - computer-vision
+ - document-layout-analysis
  ---
+
+ # What is YOLOv5?
+
+ YOLOv5 is an open-source object detection model released by [Ultralytics](https://ultralytics.com/) and available on [GitHub](https://github.com/ultralytics/yolov5).
+
+ # DataCatalogue (or DataCat)
+
+ [DataCatalogue](https://github.com/DataCatalogue) is a research project jointly led by Inria, the Bibliothèque nationale de France (National Library of France) and the Institut national d'histoire de l'art (National Institute of Art History).
+
+ It aims to restructure OCR-ed auction sale catalogs held in French national collections into TEI-XML, using machine learning.
+
+ # DataCat YOLOv5
+
+ We trained a YOLOv5 model on custom data to perform document layout analysis on auction sale catalogs.
+
+ The training set consists of **581 images**, annotated with **two classes**:
+ * *title* (585 instances)
+ * *entry*, i.e. a catalog entry (5017 instances)
+
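YOLOv5 reads annotations from one text file per image, with one line per box in the form `class x_center y_center width height`, where the coordinates are normalized by the image size. The sketch below shows how a pixel-coordinate box could be turned into such a line for the two classes above. The class indices (0 for *title*, 1 for *entry*) and the helper name `to_yolo_label` are assumptions made for illustration; the mapping actually used for this model is not documented here.

```python
# Assumed class order; the index mapping used for the real model is not stated.
CLASS_IDS = {"title": 0, "entry": 1}


def to_yolo_label(cls_name, box, img_w, img_h):
    """Convert a pixel box (x_min, y_min, x_max, y_max) to a YOLO label line."""
    x_min, y_min, x_max, y_max = box
    # YOLO stores the box center and size, each divided by the image dimension.
    x_center = (x_min + x_max) / 2 / img_w
    y_center = (y_min + y_max) / 2 / img_h
    width = (x_max - x_min) / img_w
    height = (y_max - y_min) / img_h
    return f"{CLASS_IDS[cls_name]} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"


if __name__ == "__main__":
    # A hypothetical entry box on a 1000x2000 pixel page scan.
    print(to_yolo_label("entry", (100, 400, 900, 600), 1000, 2000))
```

Normalized coordinates keep the labels valid if the page scans are resized before training.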
+ 59 images were used for validation.
+
+ The model reached the following scores:
+ | precision | recall | mAP_0.5 | mAP_0.5:0.95 |
+ |---|---|---|---|
+ | 0.99 | 0.99 | 0.98 | 0.75 |
+
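The two mAP columns differ in how strictly a predicted box must overlap the annotated one. `mAP_0.5` counts a detection as correct when its intersection over union (IoU) with the ground truth is at least 0.5, while `mAP_0.5:0.95` averages over ten IoU thresholds from 0.5 to 0.95 in steps of 0.05, so it rewards tighter boxes. The sketch below only illustrates the IoU test behind those thresholds; it is not a full mAP computation, and the function names are illustrative.

```python
def iou(box_a, box_b):
    """Intersection over union of two (x_min, y_min, x_max, y_max) boxes."""
    ix_min = max(box_a[0], box_b[0])
    iy_min = max(box_a[1], box_b[1])
    ix_max = min(box_a[2], box_b[2])
    iy_max = min(box_a[3], box_b[3])
    # Clamp at zero so disjoint boxes give an empty intersection.
    inter = max(0.0, ix_max - ix_min) * max(0.0, iy_max - iy_min)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# IoU thresholds averaged over by mAP_0.5:0.95: 0.50, 0.55, ..., 0.95.
IOU_THRESHOLDS = [round(0.5 + 0.05 * i, 2) for i in range(10)]


def thresholds_passed(pred_box, true_box):
    """Return the IoU thresholds at which the prediction counts as a match."""
    score = iou(pred_box, true_box)
    return [t for t in IOU_THRESHOLDS if score >= t]


if __name__ == "__main__":
    # A prediction that covers 78% of the annotated box and nothing else.
    print(thresholds_passed((0, 0, 100, 78), (0, 0, 100, 100)))
```

A box that is roughly right but loose passes the lower thresholds only, which is why `mAP_0.5:0.95` is typically lower than `mAP_0.5`.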
+ # Dataset
+
+ The dataset has not been released yet.
+
+ ## Demo
+
+ An interactive demo is available on the following Hugging Face Space: https://huggingface.co/spaces/HugoSchtr/DataCat_Yolov5
+
+ ## What's next
+
+ The model performs well on our data and now needs to be incorporated into a dedicated pipeline for the research project.
+
+ We also plan to train a new model on a larger training set in the near future.