arXiv:2309.17033

Unveiling Document Structures with YOLOv5 Layout Detection

Published on Sep 29, 2023
Authors:

Abstract

The current digital environment is characterized by the pervasive presence of data, particularly unstructured data, which poses significant challenges in sectors including finance, healthcare, and education. Conventional data-extraction techniques struggle with the inherent variety and complexity of unstructured data, motivating the adoption of more efficient methodologies. This research investigates the use of YOLOv5, a state-of-the-art computer vision model, to rapidly identify document layouts and extract unstructured data. The study establishes a conceptual framework that defines the notion of "objects" as it pertains to documents, encompassing elements such as paragraphs, tables, photos, and other constituent parts. The main objective is an autonomous system that can reliably recognize document layouts and extract unstructured data, thereby improving the efficiency of data extraction. In the conducted evaluation, the YOLOv5 model proves notably effective at document layout identification, attaining high accuracy with a precision of 0.91, a recall of 0.971, an F1-score of 0.939, and an area under the receiver operating characteristic curve (AUC-ROC) of 0.975. This performance streamlines the extraction of textual and tabular data from document images. Prospective applications are not limited to document analysis; they extend to unstructured data from other sources, such as audio. This study lays the groundwork for future investigations into the wider applicability of YOLOv5 to diverse types of unstructured data, opening the door to novel applications across multiple domains.
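As a rough illustration of the pipeline the abstract describes, the sketch below loads a YOLOv5 checkpoint through PyTorch Hub, detects layout regions on a page image, and crops each region for downstream extraction. The paper does not publish weights or code, so the checkpoint name `layout_best.pt`, the file `page_001.png`, and the class labels are hypothetical stand-ins; this is a minimal sketch of the general approach, not the authors' implementation.

```python
import torch
from PIL import Image

# Hypothetical checkpoint: the paper does not release weights, so
# "layout_best.pt" stands in for a YOLOv5 model fine-tuned on
# document-layout classes such as paragraph, table, and photo.
model = torch.hub.load("ultralytics/yolov5", "custom", path="layout_best.pt")
model.conf = 0.25  # drop low-confidence detections

page_path = "page_001.png"  # a scanned document page (hypothetical file)
results = model(page_path)  # run layout detection on the page image

# One row per detected region: xmin, ymin, xmax, ymax, confidence, class, name
detections = results.pandas().xyxy[0]

page = Image.open(page_path)
for _, det in detections.iterrows():
    # Crop the detected region so it can be routed to a downstream extractor.
    box = (int(det.xmin), int(det.ymin), int(det.xmax), int(det.ymax))
    region = page.crop(box)
    print(f"{det['name']} at {box}, confidence {det.confidence:.2f}")
    # e.g. send "table" crops to a table-structure parser and text crops to OCR
```

In such a setup, each crop would be dispatched by predicted class, with paragraph regions going to an OCR engine and table regions to a table-structure parser, mirroring the text-and-table extraction the abstract describes.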
