CholecTriplet2022: Show me a tool and tell me the triplet -- an endoscopic vision challenge for surgical action triplet detection

Formalizing surgical activities as triplets of the instruments used, actions performed, and target anatomies is becoming a gold standard approach for surgical activity modeling. The benefit is that this formalization helps to obtain a more detailed understanding of tool-tissue interaction, which can be used to develop better Artificial Intelligence assistance for image-guided surgery. Earlier efforts and the CholecTriplet challenge introduced in 2021 have put together techniques aimed at recognizing these triplets from surgical footage. Also estimating the spatial locations of the triplets would offer more precise intraoperative context-aware decision support for computer-assisted intervention. This paper presents the CholecTriplet2022 challenge, which extends surgical action triplet modeling from recognition to detection. It includes weakly-supervised bounding box localization of every visible surgical instrument (or tool), as the key actors, and the modeling of each tool-activity in the form of a triplet. The paper describes a baseline method and 10 new deep learning algorithms presented at the challenge to solve the task. It also provides thorough methodological comparisons of the methods, an in-depth analysis of the obtained results across multiple metrics and visual and procedural challenges, their significance, and useful insights for future research directions and applications in surgery.