Dissecting self-supervised learning methods for surgical computer vision

Sanat Ramesh, Vinkle Srivastav, Deepak Alapatt, Tong Yu, Aditya Murali, Luca Sestini, Chinedu Innocent Nwoye, Idris Hamoud, Saurav Sharma, Antoine Fleurentin, Georgios Exarchakis, Alexandros Karargyris, Nicolas Padoy

Research output: Contribution to journal › Article › peer-review

15 Citations (SciVal)

Abstract

The field of surgical computer vision has seen considerable breakthroughs in recent years with the rising popularity of deep neural network-based methods. However, standard fully supervised approaches for training such models require vast amounts of annotated data, imposing a prohibitively high cost, especially in the clinical domain. Self-Supervised Learning (SSL) methods, which have begun to gain traction in the general computer vision community, represent a potential solution to these annotation costs, as they allow useful representations to be learned from unlabeled data alone. Still, the effectiveness of SSL methods in more complex and impactful domains, such as medicine and surgery, remains limited and largely unexplored. In this work, we address this critical need by investigating four state-of-the-art SSL methods (MoCo v2, SimCLR, DINO, SwAV) in the context of surgical computer vision. We present an extensive analysis of the performance of these methods on the Cholec80 dataset for two fundamental and popular tasks in surgical context understanding: phase recognition and tool presence detection. We examine their parameterization, then their behavior with respect to training data quantities in semi-supervised settings. Correct transfer of these methods to surgery, as described and conducted in this work, leads to substantial performance gains over generic uses of SSL – up to 7.4% on phase recognition and 20% on tool presence detection – and outperforms state-of-the-art semi-supervised phase recognition approaches by up to 14%. Further results obtained on a highly diverse selection of surgical datasets exhibit strong generalization properties. The code is available at https://github.com/CAMMA-public/SelfSupSurg.
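The abstract does not detail the methods' objectives, but two of the four approaches studied (SimCLR and MoCo v2) are built around a normalized temperature-scaled contrastive loss (NT-Xent). The following is a minimal NumPy sketch of that general objective, not the paper's implementation; the function name, batch size, and temperature value are illustrative assumptions:

```python
import numpy as np

def nt_xent_loss(z1, z2, temperature=0.1):
    """NT-Xent contrastive loss (illustrative sketch, not the paper's code).
    z1, z2: (N, d) embeddings of two augmented views of the same N images.
    Each row of z1 is a positive pair with the corresponding row of z2;
    all other rows act as negatives."""
    z = np.concatenate([z1, z2], axis=0)                 # (2N, d)
    z = z / np.linalg.norm(z, axis=1, keepdims=True)     # L2-normalize rows
    sim = (z @ z.T) / temperature                        # scaled cosine similarities
    np.fill_diagonal(sim, -np.inf)                       # exclude self-similarity
    n = z1.shape[0]
    # the positive for sample i is its other view: i+n (and vice versa)
    pos = np.concatenate([np.arange(n, 2 * n), np.arange(n)])
    log_prob = sim - np.log(np.exp(sim).sum(axis=1, keepdims=True))
    return -log_prob[np.arange(2 * n), pos].mean()
```

When the two views of each image map to nearby embeddings, the positive-pair similarity dominates the softmax denominator and the loss is small; pretraining on unlabeled frames drives the encoder toward such augmentation-invariant representations, which can then be fine-tuned for phase recognition or tool presence detection.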

Original language: English
Article number: 102844
Journal: Medical Image Analysis
Volume: 88
Early online date: 24 May 2023
DOIs
Publication status: Published - 1 Aug 2023

Bibliographical note

Funding Information:
This work was partially supported by French state funds managed by the ANR under references ANR-20-CHIA-0029-01 (National AI Chair AI4ORSafety), ANR-10-IAHU-02 (IHU Strasbourg) and ANR-16-CE33-0009 (DeepSurg). This work has received funding from the European Union's Horizon 2020 research and innovation programme under the Marie Sklodowska-Curie grant agreement No. 813782 - project ATLAS. This work was supported by a Ph.D. fellowship from Intuitive Surgical. It was granted access to the HPC resources of IDRIS under the allocations 2021-AD011011638R1, 2021-AD011011638R2, 2021-AD011012715, 2021-AD011012832, 2021-AD011011507R1, and 2021-AD011011640R1. For evaluation on the HeiChole dataset, we thank Dr. Sebastian Bodenstedt for the timely support.

Publisher Copyright:
© 2023 Elsevier B.V.

Keywords

  • Deep learning
  • Endoscopic videos
  • Laparoscopic cholecystectomy
  • Self-supervised learning
  • Semi-supervised learning
  • Surgical computer vision

ASJC Scopus subject areas

  • Radiological and Ultrasound Technology
  • Radiology Nuclear Medicine and imaging
  • Computer Vision and Pattern Recognition
  • Health Informatics
  • Computer Graphics and Computer-Aided Design
