U-DIADS-Bib: a full and few-shot pixel-precise dataset for document layout analysis of ancient manuscripts

Document Layout Analysis, the task of identifying the different semantic regions of a document page, is of great interest to both computer scientists and humanities scholars: for the former it is a fundamental step towards further analysis tasks, and for the latter it is a powerful tool to improve and facilitate the study of documents. However, many works currently present in the literature, especially the available datasets, fail to meet the needs of both worlds. In particular, they tend to lean towards the needs and common practices of the computer science side, leading to resources that are not representative of the real needs of the humanities. For this reason, the present paper introduces U-DIADS-Bib, a novel, pixel-precise, non-overlapping and noiseless document layout analysis dataset developed in close collaboration between specialists in the fields of computer vision and humanities. Furthermore, we propose a novel computer-aided segmentation pipeline to alleviate the burden of the time-consuming manual annotation process needed to generate the ground truth segmentation maps. Finally, we present a standardized few-shot version of the dataset (U-DIADS-BibFS), with the aim of encouraging the development of models and solutions able to address this task with as few samples as possible, which would allow for more effective use in real-world scenarios, where collecting a large number of segmentations is not always feasible.