TY - JOUR
T1 - ESIP: Explicit Surgical Instrument Prompting for Surgical Workflow Recognition
AU - Qiu, Yixuan
AU - Liu, Mengxing
AU - He, Siyuan
AU - Zhou, Guangquan
AU - Lyu, Fei
AU - Chen, Yang
AU - Zhou, Ping
N1 - This work was supported by the National Natural Science Foundation of China (No. 62371121).
Publisher Copyright:
© 2013 IEEE.
PY - 2025/10/27
Y1 - 2025/10/27
AB - Surgical workflow recognition (SWR) is a pivotal component of computer-assisted surgery and is dedicated to identifying phases from surgical videos. Many deep learning-based methods have been proposed for this task and have achieved acceptable SWR results. However, these methods usually extract and aggregate spatio-temporal features implicitly, which makes it difficult for them to adequately use spatial information that is strongly relevant to the surgical phase, such as information from the surgical instruments. To address this issue, an Explicit Surgical Instrument Prompting (ESIP) approach is proposed for the SWR task. ESIP leverages surgical instrument segmentation to generate instrument-specific visual prompts, which explicitly guide the extraction of crucial intra-frame spatial features through a frozen pre-trained backbone and then enable effective inter-frame spatio-temporal feature extraction and aggregation. Unlike multi-task approaches that jointly perform SWR with auxiliary tasks within a shared network framework, ESIP is a single-task SWR approach dedicated to optimizing the framework itself for more adequate feature extraction. Furthermore, to accomplish the segmentation prompting efficiently, this paper presents a SAM-based segmentation with prompt tuning strategy to explicitly integrate segmentation features into spatial features. Experimental results on the Cholec80, M2CAI and AutoLaparo datasets demonstrate that the proposed ESIP method achieves the best performance in comparison with 16 SOTA methods, with a Precision of 91.8%, 89.5% and 89.6%, a Recall of 92.2%, 89.5% and 76.9%, and a Jaccard of 83.3%, 77.0% and 67.3%, respectively.
KW - Prompt Engineering
KW - Surgical Instrument Segmentation
KW - Surgical Workflow Analysis
KW - Surgical Workflow Recognition
UR - https://www.scopus.com/pages/publications/105020278693
UR - https://ieeexplore.ieee.org/document/11218015
U2 - 10.1109/JBHI.2025.3625420
DO - 10.1109/JBHI.2025.3625420
M3 - Journal article
C2 - 41144421
AN - SCOPUS:105020278693
SN - 2168-2194
JO - IEEE Journal of Biomedical and Health Informatics
JF - IEEE Journal of Biomedical and Health Informatics
ER -