Towards Making Flowchart Images Machine Interpretable

Computer programming textbooks and software documentation often contain flowcharts to illustrate the flow of an algorithm or procedure. Modern OCR engines often tag these flowcharts as graphics and ignore them in further processing. In this paper, we work towards making flowchart images machine-interpretable by converting them to executable Python code. To this end, inspired by the recent success in the natural-language-to-code generation literature, we present a novel transformer-based framework, namely FloCo-T5. Our model is well-suited for this task, as it can effectively learn the semantics, structure, and patterns of programming languages, which it leverages to generate syntactically correct code. We also use a task-specific pre-training objective to pre-train FloCo-T5 on a large number of logic-preserving augmented code samples. Further, to perform a rigorous study of this problem, we introduce the FloCo dataset, which contains 11,884 flowchart images and their corresponding Python code. Our experiments show promising results, and FloCo-T5 clearly outperforms related competitive baselines on code generation metrics. We make our dataset and implementation publicly available.
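
To make the notion of a logic-preserving augmented code sample concrete, the following is a minimal sketch of one possible augmentation: consistently renaming identifiers so that the program text changes while its behavior does not. The abstract does not state which transformations are used, so identifier renaming, the class `RenameIdentifiers`, the helper `augment`, and the example function `total` are all assumptions introduced here for illustration only.

```python
import ast


class RenameIdentifiers(ast.NodeTransformer):
    """Rename variables and function arguments using a fixed mapping.

    Hypothetical helper: illustrates one kind of logic-preserving
    augmentation, not necessarily the one used for FloCo-T5.
    """

    def __init__(self, mapping):
        self.mapping = mapping

    def visit_Name(self, node):
        # Variable reads and writes are ast.Name nodes.
        node.id = self.mapping.get(node.id, node.id)
        return node

    def visit_arg(self, node):
        # Function parameters are ast.arg nodes.
        node.arg = self.mapping.get(node.arg, node.arg)
        return node


def augment(source, mapping):
    """Return source code with identifiers renamed according to mapping."""
    tree = ast.parse(source)
    tree = RenameIdentifiers(mapping).visit(tree)
    return ast.unparse(tree)  # requires Python 3.9 or later


original = (
    "def total(values):\n"
    "    acc = 0\n"
    "    for v in values:\n"
    "        acc = acc + v\n"
    "    return acc\n"
)

# The augmented sample differs textually but computes the same result.
augmented = augment(original, {"values": "items", "acc": "result", "v": "x"})
print(augmented)
```

Under this assumption, each original program can yield many textual variants that share the same control flow, which is the property a pre-training corpus of logic-preserving samples would need.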