Abstract
We explore the feasibility and potential of building a ground-truth-free evaluation model to assess the quality of segmentations generated by the Segment Anything Model (SAM) and its variants in medical imaging. This evaluation model estimates segmentation quality scores by analyzing the coherence and consistency between the input images and their corresponding segmentation predictions. Based on prior research, we frame the task of training this model as a regression problem within a supervised learning framework, using Dice scores (and optionally other metrics) along with mean squared error to compute the training loss. The model is trained utilizing a large collection of public datasets of medical images with segmentation predictions from SAM and its variants. We name this model EvanySeg (Evaluation of Any Segmentation in Medical Images). EvanySeg can be employed for various tasks, including: (1) identifying poorly segmented samples by detecting low-percentile segmentation quality scores; (2) benchmarking segmentation models without ground truth by averaging quality scores across test samples; (3) alerting human experts to poor-quality segmentation predictions during human-AI collaboration by applying a threshold within the score space; and (4) selecting the best segmentation prediction for each test sample at test time when multiple segmentation models are available, by choosing the prediction with the highest quality score. Models and code are available at https://github.com/ahjolsenbics/Evanyseg.
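The training objective and the four uses listed in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the evaluator network is not shown, its predicted quality scores are assumed to be plain arrays, and all function names, the 10th-percentile cutoff, and the 0.5 alert threshold are hypothetical choices made for illustration only.

```python
import numpy as np

def dice_score(pred_mask, gt_mask):
    """Dice overlap between two binary masks; serves as the regression target."""
    pred = np.asarray(pred_mask, dtype=bool)
    gt = np.asarray(gt_mask, dtype=bool)
    denom = pred.sum() + gt.sum()
    if denom == 0:
        return 1.0  # assumption: two empty masks count as perfect agreement
    return float(2.0 * np.logical_and(pred, gt).sum() / denom)

def mse_loss(predicted_scores, target_dice):
    """Mean squared error between predicted quality scores and Dice targets."""
    p = np.asarray(predicted_scores, dtype=float)
    t = np.asarray(target_dice, dtype=float)
    return float(np.mean((p - t) ** 2))

def flag_low_percentile(scores, percentile=10.0):
    """(1) Indices of samples whose score is at or below the given percentile."""
    scores = np.asarray(scores, dtype=float)
    cutoff = np.percentile(scores, percentile)  # illustrative cutoff
    return np.flatnonzero(scores <= cutoff)

def benchmark(scores):
    """(2) Ground-truth-free benchmark: mean quality score over test samples."""
    return float(np.mean(scores))

def needs_review(score, threshold=0.5):
    """(3) Alert a human expert when the score is below a threshold (assumed 0.5)."""
    return score < threshold

def select_best(scores_per_model):
    """(4) For each sample, name the model whose prediction scores highest."""
    names = list(scores_per_model)
    stacked = np.stack([np.asarray(scores_per_model[n], dtype=float) for n in names])
    return [names[i] for i in np.argmax(stacked, axis=0)]
```

During training, `dice_score` would be computed against the available annotations to produce targets; at test time only the predicted scores are needed, which is what makes the four downstream uses ground-truth-free.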
| Original language | English |
|---|---|
| Title of host publication | Pattern Recognition and Computer Vision |
| Subtitle of host publication | 8th Chinese Conference, PRCV 2025, Shanghai, China, October 15–18, 2025, Proceedings, Part XIII |
| Editors | Josef Kittler, Hongkai Xiong, Jian Yang, Xilin Chen, Jiwen Lu, Weiyao Lin, Jingyi Yu, Weishi Zheng |
| Place of Publication | Singapore |
| Publisher | Springer |
| Pages | 238-253 |
| Number of pages | 16 |
| Edition | 1st |
| ISBN (Electronic) | 9789819556342 |
| ISBN (Print) | 9789819556335 |
| Publication status | Published - 20 Jan 2026 |
| Event | 8th Chinese Conference on Pattern Recognition and Computer Vision, PRCV 2025, Shanghai, China; Duration: 15 Oct 2025 → 18 Oct 2025; Conference website: http://www.prcv.cn/en/; Conference proceedings: https://link.springer.com/book/10.1007/978-981-95-5761-5 |
Publication series
| Name | Lecture Notes in Computer Science |
|---|---|
| Volume | 16284 |
| ISSN (Print) | 0302-9743 |
| ISSN (Electronic) | 1611-3349 |
| Name | PRCV: Chinese Conference on Pattern Recognition and Computer Vision |
Conference
| Conference | 8th Chinese Conference on Pattern Recognition and Computer Vision, PRCV 2025 |
|---|---|
| Country/Territory | China |
| City | Shanghai |
| Period | 15/10/25 → 18/10/25 |
UN SDGs
This output contributes to the following UN Sustainable Development Goals (SDGs)
- SDG 3: Good Health and Well-being
User-Defined Keywords
- Ground-truth-free Segmentation Evaluation
- Quality Assessment
- Medical Image Segmentation
- Foundation Model for Trustworthy Medical AI
Fingerprint
Dive into the research topics of 'Coherence-Based Segmentation Quality Evaluator Trained on a Large Collection of Annotated Medical Images'. Together they form a unique fingerprint.